I have a need for email archiving. Your EF product is 95% of what I need. Using Apple’s Mail, there is no reasonable method to archive or export email (and attachments) as plain text files (with the attachments exported in their own file formats). Heck, Apple hasn’t yet seen fit to even offer archiving or exporting at all.
Importing from Mail to your EF product solves the archiving issue – beautifully, I might add.
Is it within the realm of possibility to add in the ability to export from an archive individual emails (or whole folders of them) and store the export as txt files (with the appropriate attachment files saved individually as their own format)?
EagleFiler stores the messages in their native format, grouped together into one mbox file per mailbox. This allows compact storage and efficient browsing without any loss of data, and you can import the mail into any other program that supports mbox. If you need to access the text of a particular message, you can drag and drop it to create a clipping file or copy and paste into TextEdit to create a text file. The attachments are available by double-clicking the message. It’s within the realm of possibility to add some kind of a bulk text-file export, but I have no idea why one would want that. Could you tell me what you’re trying to accomplish?
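For readers who want to try a bulk text export outside EagleFiler, here is a minimal sketch using Python's standard `mailbox` and `email` modules. It is not how EagleFiler works internally; the function name `export_mbox` and the `message_0001.txt` naming scheme are assumptions made up for illustration. It writes a few headers plus the plain-text body of each message to its own file and saves attachments in their own formats in a folder alongside.

```python
import mailbox
from email import message_from_binary_file, policy
from pathlib import Path


def export_mbox(mbox_path, out_dir):
    """Write each message as a .txt file, with attachments saved alongside.

    Returns the number of messages exported. (Hypothetical helper.)
    """
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    # Parse with the modern email API so get_body() and iter_attachments() exist.
    box = mailbox.mbox(
        mbox_path,
        factory=lambda f: message_from_binary_file(f, policy=policy.default),
        create=False,
    )
    count = 0
    try:
        for count, msg in enumerate(box, start=1):
            stem = f"message_{count:04d}"  # assumed naming scheme
            headers = "".join(
                f"{name}: {msg[name]}\n"
                for name in ("From", "To", "Date", "Subject")
                if msg[name] is not None
            )
            # Prefer the text/plain part; HTML-only messages yield an empty body here.
            body = msg.get_body(preferencelist=("plain",))
            text = body.get_content() if body is not None else ""
            (out / f"{stem}.txt").write_text(headers + "\n" + text, encoding="utf-8")

            for n, part in enumerate(msg.iter_attachments(), start=1):
                data = part.get_payload(decode=True)
                if data is None:  # e.g. an attached message; skipped in this sketch
                    continue
                # Keep only the base name of the attachment's filename.
                name = Path(part.get_filename() or f"attachment_{n}").name
                att_dir = out / f"{stem}_attachments"
                att_dir.mkdir(exist_ok=True)
                (att_dir / name).write_bytes(data)
    finally:
        box.close()
    return count
```

Calling `export_mbox("Inbox.mbox", "exported")` (both paths hypothetical) would leave one text file per message in `exported`. Messages with unusual character sets or malformed headers may need extra error handling that this sketch omits.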
First off – thanks for the (extremely) timely reply.
Ultimately, my customer’s use is to provide text files to a website archive — quickly providing a Google-able collection of information. I can see where text files serve as the best lowest-common-denominator file format for this use, as well as any other one can think of where the text content of several thousand emails can be accessed by ANY editor. Wouldn’t have to worry that something can parse or interpret mbox or any other formatted output.
I can see where this could be extrapolated to generate text file libraries for company needs – certainly not something that would be put on the web, but with legal requirements to store ALL email that crosses a mail server at an enterprise or school, indefinitely – this technique would offer an option for semi-permanent record storage and retrieval that would be unparalleled in ease-of-access regardless of what mail clients (and server technology) may be employed in the future. We’ve now got some crazy laws for information preservation in the States. Much of this law is impossible to reasonably comply with. (Is there a tax break for all the storage requirements? A federal grant to all school districts for the optical or electro-mechanical storage purchases necessary? Are there common criteria defined for file structures or formats?) Well, I digress. A tool that could store messages and export them as individual text files, scaling from several to several hundred thousand messages, would be pretty slick.
Any tool that can address multiple needs by storing information in a non-proprietary format accessible by all, with flexibility in export options by the user, is the smartest technique to employ. Modern IT ‘experts’ fail to recognize this need – pushing for adoption of currently-popular or flavor-du-jour formats that CANNOT be read as simple ASCII is simply not the best option for widest possible access and use down the road.
Exporting to text files
As a follow-up, now I wonder about arcane issues: file naming, for instance. Would it be better to be able to specify that the file name follows some kind of sequential scheme (exportedtext0001.txt, exportedtext0002.txt), or would it make more sense to use, in this example, the subject of an email (title is what a subject is called after import into EF, from what I see)? And if so, would we want to be able to have limits (or set them) for length of filename? How would an export tool handle more than one record that may have identical subject lines/titles? Would it automatically append a serial number in such cases? Same for extracted attachments?
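One possible answer to these naming questions, sketched in Python purely as an illustration (the function `make_filename`, its parameters, and the 60-character default are assumptions, not anything EagleFiler does): derive the name from the subject, cap its length, and append a serial number when the same subject turns up again.

```python
import re


def make_filename(subject, used, max_length=60, extension=".txt"):
    """Return a unique, filesystem-safe filename derived from a subject line.

    `used` is a set of lowercased names already handed out; it is updated
    in place. (Hypothetical helper for illustration.)
    """
    # Replace anything outside a conservative ASCII set with underscores.
    stem = re.sub(r"[^A-Za-z0-9._-]+", "_", subject).strip("._") or "untitled"
    stem = stem[:max_length]
    candidate = f"{stem}{extension}"
    serial = 1
    # Identical subjects get a numeric suffix: name.txt, name-0002.txt, ...
    # Compare case-insensitively, since some file systems ignore case.
    while candidate.lower() in used:
        serial += 1
        candidate = f"{stem}-{serial:04d}{extension}"
    used.add(candidate.lower())
    return candidate
```

The same function could name extracted attachments by passing the attachment's own name and extension. Note that this conservative character set collapses non-ASCII subjects into underscores, which a real tool would probably want to handle more gracefully.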
You are far more the coder than I’ll ever be – but I know my esoteric questions have to be considered at some point.
OK – I’m not helping. I’ll let you ponder this and see if it’s worth exploring.
I am planning an “Export Web Site” feature that would export a hierarchy of files/folders/e-mails as a series of linked Web pages. I think this is close to what you want. Do you really need to have the e-mails exported as text files, or would HTML be OK? I want to use HTML so that I can include the (rich text) notes and include links to the attachments and the containing folder.
I haven’t given it much thought as to HTML vs text. I like text because it is cleaner and simpler, but it is also just as easy to bulk post-process to strip out tags (I do love BBEdit) – and – HTML can be googled just as easily so if the output files were to end up on the web, the format difference wouldn’t matter to me.
Heck, a lot of the email this client wants to archive has HTML thick through it. From the yucky Outlook-I’ll-shove-styles-down-your-throat to simpler colored text to text-only – it’s all likely in the current user’s folder. Yes, I can see where exported-as-HTML would be just as useful.
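For what it's worth, the "bulk post-process to strip out tags" step could look something like the following sketch, which uses only Python's standard `html.parser`. The class and function names are made up for illustration, and the list of block-level tags is an assumption; heavily styled Outlook mail may need a longer list or a dedicated library.

```python
from html.parser import HTMLParser


class TextExtractor(HTMLParser):
    """Collect visible text, turning block-level tags into line breaks."""

    BLOCK_TAGS = {"p", "div", "br", "li", "tr", "h1", "h2", "h3", "h4", "h5", "h6"}
    SKIP_TAGS = {"script", "style"}  # their contents are not visible text

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.parts = []
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIP_TAGS:
            self._skip_depth += 1
        elif tag in self.BLOCK_TAGS:
            self.parts.append("\n")

    def handle_endtag(self, tag):
        if tag in self.SKIP_TAGS and self._skip_depth:
            self._skip_depth -= 1
        elif tag in self.BLOCK_TAGS and tag != "br":
            self.parts.append("\n")

    def handle_data(self, data):
        if not self._skip_depth:
            self.parts.append(data)


def html_to_text(html):
    """Return the text of an HTML fragment, one line per block or <br>."""
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    # Collapse runs of whitespace within each line and drop empty lines.
    lines = [" ".join(line.split()) for line in "".join(parser.parts).splitlines()]
    return "\n".join(line for line in lines if line)
```

This keeps the words and the line breaks and throws away bold, colors, and fonts. Dropping empty lines also removes blank-line paragraph spacing, which may or may not be what an archive wants.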
After thinking about this, and trying out some of the alternatives, I think I’d go back to my original proposition of having an option for text file output. Not only would I use EagleFiler for email archiving, but there may be libraries of code, text copy from books and articles and a whole lot more I’d rather see in its original format, rather than altered to HTML.
I’m not saying that HTML as an additional option isn’t a great idea – it is. But preserving the integrity of the original in its original format would be the best workable option. My client has somewhat forgotten that we need to do this, but that will come around again. Right now, EagleFiler is my choice, but they were leaning toward MailSteward for its output option (and splitting the delimited file output as needed).
Just some thoughts for this if it makes it into a future release candidate.
Oh, certainly the original files would be preserved and linked in, but the question is how to display e-mails. For them, the original format is either proprietary or mbox, neither of which can be viewed directly in a browser. For viewing (or indexing), you wouldn’t want the raw RFC 822 source; you’d either want a text representation or HTML. With the mbox files still available, is there a reason that you would still prefer text?
There may be times when having the option to export any email in a simple ‘just text’ format would be preferable. If the email came as simple text, it’s obvious. But for emails that contain a lot of (unnecessary) formatting, the content is key and not the prettiness of it all.
For example, lawyers don’t always give a damn about what is bold or italic or in bright pink Verdana. They just want the words with little more than accurate retention of line breaks for readability. For text retrieval, this matters far more than a mix of HTML and text-only solutions, especially if they employ an in-house index that may not be as smart as a Google device.
I can see if they take adopting EagleFiler further (than just as an email archive), they would have a mix of other text copy stored on the system, perhaps from a wide variety of sources, and want to have the option of exporting case notes, source reference materials, scanned/OCR’d prior judgement info, etc. for storing and possible text export.
So this tool begins to go beyond just text email vs. HTML email, and a lot of their preferred reading tools might not be browser-based; the files may have to be read down the road into WordPerfect, NeoOffice, etc. Stripping out ‘added’ HTML tags would then become a significant pain.
I know I originally painted a picture of just exporting so that a private LAN web archive could be built from email records. This was in part to comply with a facility’s needs for such silliness as the proposed ‘SAFETY’ act and real legal requirements at the state and federal level to retain any electronic communications generated by public institutions. Now the potential usefulness of this technique for other aspects of record retention comes into the picture: records in a database AND individual file copies of the same in a readable format, for redundancy as well as local LAN googling. I have more than one client who likes the idea, but no single tool that does it well. I can use some of the email-to-FileMaker solutions, then export FileMaker records, then split. Or use that other mentioned product. Neither of these is close to ideal for a point-n-click universe. Your product does export individually (though as emlx for the test emails I’ve used).
Kinda crazy where this could all lead, but enticing options for EagleFiler.