Just received a new ScanSnap scanner and would prefer to use my EagleFiler application rather than Evernote. Has anybody out there written an AppleScript for ABBYY FineReader that imports automatically into an EagleFiler database?
I read the FAQ and saw the PDFpen script.
I’m really a newbie when it comes to AppleScript. Any help would be greatly appreciated.
Hi there. Still no response so I’m thinking nobody has written a script at this point.
So here is what I have accomplished so far and what I am trying to do.
I want to make scanned PDFs from my S1300 searchable in EagleFiler.
In other words, I want to add a text layer to the PDF so that I can search the scanned text. I’m using the supplied ABBYY FineReader. However, when I scan a document into the import folder, the text is not searchable. Here are my settings:
In the ScanSnap Manager “Settings” field, I have “Scan to Searchable PDF” selected as the application. On the “Save” tab I have “Documents/EagleFiler Library/To Import (xxx)” selected. My thought is that EagleFiler will automatically import these scanned documents when it’s restarted.
Is there something I’m doing incorrectly? Any help would be greatly appreciated!
When I searched for a bit of text, I had “Search within FILENAME” selected instead of “ANYWHERE”.
Michael, I’m sorry about this. It was my mistake.
As for your question re: the differences between FineReader Express, I’m not sure.
The ScanSnap S1300 ships with FineReader V4.1 now. Its character recognition is truly amazing. Perhaps as people continue to buy the ScanSnaps, it might make it worth your while to write an AppleScript for this. HINT HINT
If there is anything I can do to assist you, I would be more than willing.
I guess my understanding is that the AppleScript would automatically add the files to the library, whereas right now “scan to folder” adds them to the folder to be imported. They then get added either when EagleFiler is started or whenever it scans the “To Import” folder?
Yeah, not much difference for this particular workflow. Actually, it’s probably better to target a folder inside the “Files” folder rather than the “To Import” folder. This way you have control over when it gets imported, so you avoid the possibility of EagleFiler trying to read the file while ABBYY is working on it.
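One way to act on this is to hold a scan back until it has stopped changing before moving it anywhere EagleFiler will read it. Here is a minimal sketch, assuming that a file whose size is the same on two consecutive checks is finished. The function name wait_until_stable and the default interval are made up for illustration; they are not part of EagleFiler or ABBYY.

```shell
#!/bin/bash
# Hypothetical helper: block until a file's size stops changing.
# Usage: wait_until_stable FILE [INTERVAL_SECONDS]
wait_until_stable() {
    local file=$1 interval=${2:-5} previous=-1 current
    while :; do
        # Give up if the file is missing (or disappears while we wait)
        [ -e "$file" ] || return 1
        # wc -c reports the size in bytes; tr strips any padding wc adds
        current=$(wc -c < "$file" | tr -d '[:space:]')
        # Same size on two consecutive checks: assume the writer is done
        if [ "$current" = "$previous" ]; then
            return 0
        fi
        previous=$current
        sleep "$interval"
    done
}
```

You could then run something like wait_until_stable scan.pdf && mv scan.pdf "$import_dir"/ where import_dir is whichever folder you want EagleFiler to pick the file up from. An unchanged size is only a heuristic, so a longer interval is safer for slow OCR jobs.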
When I have a chance I’ll take a look at FineReader and see if there’s a way to script it to do stuff like this.
A script to use ABBYY FineReader to OCR PDFs from your ScanSnap
In case it helps, I developed a little bash script to automate OCRing a PDF PRIOR to placing it into EagleFiler. I’m sure you can modify it to suit your needs AFTER the file has been imported into EagleFiler. Sadly, ABBYY FineReader (I’m running version 4.1) is not directly scriptable (it does not support AppleScript).
#!/bin/bash
# First test to see if the document has already been OCR'd
# (-q keeps grep from printing anything; only its exit status is used)
if ! grep -q Font "$1"
then
    # Open ABBYY and start to OCR the PDF
    open -a 'Scan to Searchable PDF.app' "$1"
    # Wait for the completed file to appear in the Finder
    while [ ! -e "${1%.pdf} processed by FineReader.pdf" ]; do
        sleep 5
    done
    sleep 5
    # Remove " processed by FineReader" from the file's name
    mv -f "${1%.pdf} processed by FineReader.pdf" "$1"
    # Hide ABBYY FineReader via System Events using AppleScript
    osascript -e 'tell application "System Events" to set visible of process "Scan to Searchable PDF" to false'
fi
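The wait loop in the script above never gives up, so it will hang if FineReader fails to produce its output file. A bounded version might look like the sketch below. The function name wait_for_file and the default timeout and interval are hypothetical choices, not anything ABBYY provides.

```shell
#!/bin/bash
# Hypothetical helper: wait for a file to appear, giving up after a timeout.
# Usage: wait_for_file FILE [TIMEOUT_SECONDS] [INTERVAL_SECONDS]
# Both numbers must be whole seconds because of the shell arithmetic below.
wait_for_file() {
    local file=$1 timeout=${2:-300} interval=${3:-5} waited=0
    while [ ! -e "$file" ]; do
        if [ "$waited" -ge "$timeout" ]; then
            echo "Gave up waiting for: $file" >&2
            return 1
        fi
        sleep "$interval"
        waited=$((waited + interval))
    done
    return 0
}
```

In the script, the while loop could then be replaced with wait_for_file "${1%.pdf} processed by FineReader.pdf" || exit 1, so that the later mv is never run against a file that does not exist.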
If the PDF did not originate from your ScanSnap, ABBYY FineReader will not be able to OCR it; the “Creator” of the PDF needs to be “ScanSnap Manager”. You’ll need to perform an additional trick upfront if you’d like to OCR PDFs that did not originate from your ScanSnap. Let me know if you’d like to know how that’s done.
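To tell in advance whether a given PDF will be accepted, you can look for the Creator entry before handing the file to FineReader. This is only a rough sketch: it assumes the PDF stores its document information uncompressed in the form /Creator (ScanSnap Manager ...), which is not true of every PDF. The function name is_scansnap_pdf is made up for illustration.

```shell
#!/bin/bash
# Hypothetical helper: succeed if the PDF's Creator entry names ScanSnap Manager.
# Assumption: the Info dictionary is stored as plain text, e.g.
#   /Creator (ScanSnap Manager ...)
# PDFs that compress or encode this entry differently will not match.
is_scansnap_pdf() {
    # -a treats the binary PDF as text; -q suppresses output
    grep -a -q '/Creator *(ScanSnap Manager' "$1"
}
```

For example, is_scansnap_pdf scan.pdf || echo "scan.pdf was not created by ScanSnap Manager" would flag files that need the extra step first.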
Found this old thread and managed to get this working within EagleFiler for ABBYY FineReader using a combination of the EagleFiler instructions for PDFpen and the Hazel instructions for FineReader.
This script does the trick for me. Add it to your Scripts folder and execute it from within EagleFiler.
on run
    tell application "EagleFiler"
        set _records to selected records of browser window 1
        repeat with _record in _records
            set _file to _record's file
            my ocr(_file)
            tell _record to update checksum
            my removeTag(_record, "NeedsOCR")
        end repeat
    end tell
end run

on open _files
    my ocrAndImport(_files)
end open

on adding folder items to _folder after receiving _files
    my ocrAndImport(_files)
end adding folder items to

on ocrAndImport(_files)
    repeat with _file in _files
        my ocr(_file)
    end repeat
    tell application "EagleFiler"
        import files _files
    end tell
end ocrAndImport
on ocr(_file)
    # Export settings. As posted, these are defined but not passed to the export command below.
    using terms from application "FineReader"
        set langList to {German, English}
        set saveType to single file
        set exportmodepdflayout to "text over the page image"
        set keepPageNumberHeadersAndFootersBoolean to yes
        set keepImageBoolean to yes
        set imageOptionsImageQualityEnum to balanced quality
        set usemrcboolean to no
        set makepdfaboolean to yes
        set pageSizePageSizeEnum to automatic
        set increasePaperSizeToFitContentBoolean to yes
    end using terms from
    tell application "FineReader"
        # open _file as alias
        # tell document 1
        export to pdf _file from file _file
        # end tell
    end tell
    # Wait for the export to finish before quitting FineReader.
    # WaitWhileBusy is not defined in this post; it has to be supplied as a handler in this script.
    my WaitWhileBusy()
    tell application "FineReader"
        quit
    end tell
end ocr
on removeTag(_record, _tagName)
    tell application "EagleFiler"
        set _tags to _record's assigned tags
        set _newTags to {}
        repeat with _tag in _tags
            if _tag's name is not _tagName then
                copy _tag to end of _newTags
            end if
        end repeat
        set _record's assigned tags to _newTags
    end tell
end removeTag