C-Command Software Forum

OCR script with ABBYY FineReader

Hello,

Just received a new ScanSnap scanner and would prefer to use my EagleFiler application rather than Evernote. I was wondering if anybody out there has written an AppleScript for ABBYY FineReader that imports automatically into an EagleFiler database?

I read the FAQ and saw the PDFpen code.

I’m really a newbie when it comes to AppleScript. Any help would be greatly appreciated.

Thanks in advance

Hi there. Still no response so I’m thinking nobody has written a script at this point.

So here is what I have accomplished so far and what I am trying to do.
I want to make scanned PDFs from my S1300 searchable in EagleFiler.
In other words, I want the text layer added to the PDF so that I can search the scanned text. I’m using the supplied ABBYY FineReader. However, when I scan a document into the import folder, the text is not searchable. Here are my settings:

In the ScanSnap Manager “Settings” field, I have “Scan to Searchable PDF” selected as the application. On the “Save” tab I have “Documents/EagleFiler Library/To Import (xxx)” selected. My thought is that EagleFiler will automatically import these scanned documents when it’s restarted.

Is there something I’m doing incorrectly? Any help would be greatly appreciated!

EagleFiler checks the “To Import” folder often, not just when it’s restarted. I suppose there are two possibilities:

  1. EagleFiler is importing the file after it’s saved in that folder but before ABBYY has added the text layer.
  2. ABBYY is not adding the text layer.

So why don’t you try this when the library isn’t open in EagleFiler, look at the PDFs, and then you’ll be able to see whether ABBYY is doing its job.
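If you'd rather check from Terminal than open each PDF, here is a rough sketch. It assumes that a PDF with an OCR text layer contains the string "/Font" somewhere in the file, while an image-only scan does not. That is only a heuristic (some PDFs compress the parts where it would appear), and the function names `has_text_layer` and `report_folder` are made up for this example.

```shell
#!/bin/bash

# Heuristic check: succeed if the PDF mentions a font resource.
# -a treats the binary PDF as text, -q suppresses grep's output.
has_text_layer() {
    LC_ALL=C grep -a -q '/Font' "$1"
}

# Hypothetical helper: print the status of every PDF in a folder.
report_folder() {
    local f
    for f in "$1"/*.pdf; do
        [ -e "$f" ] || continue   # folder contains no PDFs
        if has_text_layer "$f"; then
            echo "text layer:  $f"
        else
            echo "image only:  $f"
        fi
    done
}

# Example (adjust the path to wherever ScanSnap Manager saves your scans):
# report_folder "$HOME/Documents/EagleFiler Library/To Import"
```

If every scan shows up as "image only", that points to the second possibility above.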

Thanks for a response. Much appreciated!

When I selected a bit of text to search for, I had “Search within FILENAME” selected instead of ANYWHERE.

Michael, I’m sorry about this. It was my mistake.

As for your question re: the differences between FineReader Express, I’m not sure.
The ScanSnap S1300 ships with FineReader V4.1 now. Its character recognition is truly amazing. Perhaps as people continue to buy the ScanSnaps, it might make it worth your while to write an AppleScript for this. HINT HINT :)

If there is anything I can do to assist you, I would be more than willing.

Thanks again for your help.

If the search is already working (indicating that the PDF contains a text layer), what is it that you need an AppleScript to do?

I guess my understanding is that the AppleScript would automatically add the files to the library, whereas right now “scan to folder” adds them to the folder to be imported. They then get added either when EagleFiler is started or whenever it next checks the “To Import” folder?

No real difference, I guess. Correct?

Yeah, not much difference for this particular workflow. Actually, it’s probably better to target a folder inside the “Files” folder rather than the “To Import” folder. This way you have control over when it gets imported, so you avoid the possibility of EagleFiler trying to read the file while ABBYY is working on it.
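A variation on the same idea, in case anyone wants to keep using “To Import”: scan into a separate staging folder, and move the PDF into “To Import” only once it has stopped changing. The sketch below assumes that a file whose size is the same on two checks in a row is finished, which is a guess about how ABBYY writes its output, not something I have confirmed. The function names and the staging folder are hypothetical.

```shell
#!/bin/bash

# Print the file's size in bytes (wc -c works on both macOS and Linux).
file_size() {
    wc -c < "$1" | tr -d ' '
}

# Block until the size of the file is unchanged between two checks.
# $1 = file, $2 = seconds between checks (default 5).
wait_until_stable() {
    local file=$1 interval=${2:-5} prev=-1 cur
    while :; do
        cur=$(file_size "$file")
        [ "$cur" = "$prev" ] && break
        prev=$cur
        sleep "$interval"
    done
}

# Move a finished PDF into the folder EagleFiler watches.
# $1 = file, $2 = import folder, $3 = seconds between checks (default 5).
import_when_ready() {
    local file=$1 import_dir=$2
    wait_until_stable "$file" "${3:-5}"
    mv "$file" "$import_dir"/
}

# Example (both paths are placeholders):
# import_when_ready "$HOME/Scans/scan.pdf" "$HOME/Documents/EagleFiler Library/To Import"
```

A longer interval makes it less likely that a pause in writing is mistaken for a finished file.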

When I have a chance I’ll take a look at FineReader and see if there’s a way to script it to do stuff like this.

A script to use ABBYY FineReader to OCR PDFs from your ScanSnap
In case it helps, I developed a little bash script to automate OCRing a PDF PRIOR to placing it into EagleFiler. I’m sure you can modify it to suit your needs AFTER the file has been imported into EagleFiler. Sadly, ABBYY FineReader (I’m running version 4.1) is not directly scriptable (it does not support AppleScript).


#!/bin/bash

# First test to see if the document has already been OCR'd
# (an OCR'd PDF should mention "Font"; -q keeps grep from printing anything)
if ! grep -q Font "$1"
then

	# Open ABBYY and start to OCR the PDF
	open -a 'Scan to Searchable PDF.app' "$1"
	
	# Wait for the completed file to appear in the Finder
	while [ ! -e "${1%.pdf} processed by FineReader.pdf" ]; do
		sleep 5
	done
	# Presumably gives ABBYY a few more seconds to finish writing the file
	sleep 5
	
	# Remove " processed by FineReader" from the file's name
	mv -f "${1%.pdf} processed by FineReader.pdf" "$1"
	
	# Tell System Events to hide ABBYY FineReader using AppleScript
	osascript -e 'tell application "System Events" to set visible of process "Scan to Searchable PDF" to false'

fi
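One thing to watch: the wait loop in the script never gives up, so if ABBYY fails or is closed, the script hangs. A possible tweak is to wait with a timeout. This is only a sketch; `wait_for_file` is a name I made up, and the default of 300 seconds is arbitrary.

```shell
#!/bin/bash

# wait_for_file PATH [TIMEOUT] [INTERVAL]
# Returns 0 once PATH exists, or 1 if about TIMEOUT seconds pass first.
wait_for_file() {
    local path=$1 timeout=${2:-300} interval=${3:-5} waited=0
    while [ ! -e "$path" ]; do
        if [ "$waited" -ge "$timeout" ]; then
            return 1
        fi
        sleep "$interval"
        waited=$((waited + interval))
    done
    return 0
}

# Example, as a replacement for the while loop in the script:
# wait_for_file "${1%.pdf} processed by FineReader.pdf" 600 || exit 1
```

With this, the script exits with an error instead of waiting forever, and the original PDF is left untouched.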

If the PDF did not originate from your ScanSnap, ABBYY FineReader will not be able to OCR it; the “Creator” of the PDF needs to be “ScanSnap Manager”. You’ll need to perform an additional trick upfront if you’d like to OCR PDFs that did not originate from your ScanSnap. Let me know if you’d like to know how that’s done.