C-Command Software Forum

How to go from S510M to EagleFiler without me

I’d like to scan in bills and such and have them show up in EagleFiler OCR’d without me having to do anything (after setting up the system) and without having to wait between scans.

Does anyone have a system like this and is willing to share the details on how to set it up?

What OCR software are you using?

I set my ScanSnap to save the scans into EagleFiler’s “To Import” folder. That way there is no waiting between scans, and the PDF files are automatically imported when I open EagleFiler. Later, when I’m reading/processing the files, I open them in PDFpen, run the OCR, and save them in place. It looks like it will be possible to automate this soon.

I have ReadIris and Abbyy FineReader that came with the ScanSnap. I also have Acrobat.

I’ll use anything, and even spend a bit of money to make this completely automated.

My understanding is that with the S510M there’s an option in the ScanSnap software to OCR with ABBYY after scanning. That might be the easiest way for you.

With the software for my S500M, I can choose “Scan to Searchable PDF” as the application. I set the “FineReader for ScanSnap Preferences” to “Delete scanned images after recognition.” This results in OCRed PDFs automatically saved into EagleFiler’s “To Import” folder. The catch for doing it this way is that you don’t want to have the library open in EagleFiler while you’re scanning, because otherwise it might import the initial PDF before FineReader has created the new OCRed one.
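One way around that catch, as a rough sketch rather than anything EagleFiler or the ScanSnap software provides: scan and OCR into a separate staging folder, then move a PDF into “To Import” only after it has been left untouched for a while. The function name, the folder paths, and the 30-second settle time below are all assumptions for illustration.

```python
import shutil
import time
from pathlib import Path


def move_settled_pdfs(staging, to_import, settle_seconds=30.0, now=None):
    """Move PDFs from staging to to_import once they have stopped changing.

    A file counts as settled when its modification time is at least
    settle_seconds in the past. Returns the list of destination paths.
    """
    staging = Path(staging)
    to_import = Path(to_import)
    now = time.time() if now is None else now
    moved = []
    for pdf in sorted(staging.glob("*.pdf")):
        age = now - pdf.stat().st_mtime
        if age < settle_seconds:
            continue  # modified recently, the OCR step may still be writing it
        target = to_import / pdf.name
        if target.exists():
            continue  # never overwrite a file that is already queued
        shutil.move(str(pdf), str(target))
        moved.append(target)
    return moved
```

You would call it periodically (for example from a scheduled job) with your own staging folder and the library’s “To Import” folder. A modification-time check is only a heuristic; if the OCR takes longer than the settle time between writes, raise the value.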

I wrote a script to scan with an S510M, OCR with Acrobat, and add to EagleFiler. It does prompt for a filename, which means it’s not unattended, but it should be pretty easily adaptable. It’s attached.

There are two issues involved: naming the files, and OCR’ing without waiting for each one to finish before scanning the next sheet.

I use the S500M, scan to file in a desktop folder (I hadn’t thought of scanning to the import folder for EF) using the auto filenaming scheme of the S500M. This way I can scan in many bills, etc. without doing anything other than feeding the paper into the scanner.

Once done, I have the manual part: in a browser, I change the name of each file to correspond to my “template” (e.g. 09-12-30 BuyCom Visa). Then I open Acrobat Pro and do a mass OCR on all the files at once. I used to scan-rename-OCR, and that slowed me down due to the OCR, but with the mass OCR I can just sit back while all the files are processed.

Due to my file naming scheme, I don’t see a way to automate this step. For the mass OCR, that does involve talking to Acrobat Pro.
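The vendor part of the name still has to come from a person, but the date prefix and the renaming itself could be scripted. A minimal sketch, assuming the template is a two-digit year-month-day date followed by a free-form label, as in the “09-12-30 BuyCom Visa” example; the function names are made up for illustration.

```python
from datetime import date
from pathlib import Path


def template_name(scan_date, label, suffix=".pdf"):
    """Build a name like '09-12-30 BuyCom Visa.pdf' from a date and a label."""
    return f"{scan_date:%y-%m-%d} {label.strip()}{suffix}"


def rename_to_template(path, scan_date, label):
    """Rename a scanned file in place; refuse to clobber an existing file."""
    path = Path(path)
    target = path.with_name(template_name(scan_date, label, path.suffix))
    if target.exists():
        raise FileExistsError(target)
    return path.rename(target)
```

For example, `template_name(date(2009, 12, 30), "BuyCom Visa")` gives `09-12-30 BuyCom Visa.pdf`.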

I just tried scanning directly into the “To Import…” folder of the desired EF database with my scanner and got two files scanned and then OCR’ed with Acrobat. When I opened the database, I only got one “unread” file. I am unable to have EF find the second file in the Import folder using the smart folder Recently Added, or untagged. A search of the full records doesn’t find it either.

I’ve only used the To Import folder a few times in the past, and EF has always found the files added there. But not this time. Am I missing anything? I can see the file in the Finder in that import folder.

INTERESTING: I just added another file to the import folder while EF was up and running with the database, and within a second or so the new file appeared under the “unread” tag. Still, the other file is not found. I thought perhaps the file name was the problem: it’s “09-12-28 Verizon.pdf”. But I changed it, and that didn’t help. I then moved it to the desktop and later back to the import folder. Still not found. It’s 1.5 MB, so perhaps the size is the problem. So I moved the file back to the desktop and did a manual import, and that worked!

I have no idea why EF was not able to do an auto import of the file from the “To Import (FilingCabinet)” folder in the database folder.

EagleFiler ignores a file in the “To Import” folder if it detects that another application has it open. This is to prevent importing a file that’s in an incomplete state.
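If you want to check for yourself whether something still has a file open, here is a rough sketch that asks the `lsof` command-line tool. This is only a diagnostic aid and an assumption on my part; it is not necessarily how EagleFiler itself makes the check. The `run` parameter exists only so the command runner can be swapped out.

```python
import subprocess


def is_open_elsewhere(path, run=subprocess.run):
    """Return True if lsof reports at least one process with the file open.

    lsof exits with status 0 when it finds an open instance of the file and
    with a non-zero status otherwise (including when the file is missing).
    """
    result = run(
        ["lsof", "--", str(path)],
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    return result.returncode == 0
```

If this returns True for a file sitting in “To Import”, quitting the application that still holds it (or waiting for it to finish) should let the import go through.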

Hmm… I had two files placed in the import folder at the same time using the same method; one was imported by EF, the other was not. I’ll have to keep an eye on this to see if it happens again.

Try enabling “allow duplicates:” in EF’s preferences.

That has been activated since I first began using EF.

I’ve posted an OCR With PDFpen script.