C-Command Software Forum

How to convert webarchives to PDF (Single Page)

Problem: For research purposes, I would like to use EagleFiler to back up page copies of a list of URLs (browser bookmarks) in two formats: webarchive and PDF (Single Page).

Question: Is there an easy way to modify this Web Archive to PDF script so that it outputs a PDF (Single Page) rather than a paginated PDF? I can see by reading the script that it is using something called efweb2pdf and that it is part of the “Wash Framework,” but I’m not sure what parameters it takes or whether this is easy to change.

Background:

Right now, I can work around this by importing a bookmark list twice: first by setting EagleFiler preferences to “Web page format > Web Archive” and using “File > Import Bookmarks”, then again after changing preferences to “Web page format > PDF (Single Page)” and running “File > Import Bookmarks”. This works, but it assumes that my list is reasonably short and that not much has changed on my pages in the few minutes or hours between imports. For online conversations, news sites, feeds, and sites with rotating content, this process creates two documents with substantial differences.

My attempts so far: Download webarchives, then make a copy to PDF using AppleScript. I’ve tried modifying the excellent Web Archive to PDF script. I’ve already made a trivial change so that it doesn’t trash the original, but the results don’t meet my needs: the script outputs a paginated PDF, which is no good, because the content I’m archiving must be in long single-page PDFs.

Is there a way to modify it to output single-page PDFs? Or perhaps I should be approaching the problem in some other way? If you were to make my problem generic, it would be “importing to two formats”, which I’m assuming is outside the scope of how EagleFiler works and needs to be scripted, but perhaps I’m wrong and it is already supported by the capture framework.

Thank you for any suggestions.

– Jeremy


Yes, you could change the line that says:

set _script to _web2pdf's quoted form & " print "

to:

set _script to _web2pdf's quoted form & " screen "
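
As a minimal sketch of that change, the mode could be pulled into a variable so it is easy to switch back later. The `_web2pdf` path below is a hypothetical placeholder (the real script works out the tool's location itself), `_mode` is a name introduced only for illustration, and `print` and `screen` are the only values assumed here, since they are the only ones mentioned in this thread.

```applescript
-- Hypothetical placeholder path; the actual script finds efweb2pdf on its own.
set _web2pdf to "/path/to/efweb2pdf"

-- "screen" should give one long single-page PDF; "print" gives a paginated PDF.
set _mode to "screen"

-- Same command prefix as in the original line, with the mode substituted in.
set _script to _web2pdf's quoted form & " " & _mode & " "
```

Whatever the script appends to `_script` after this line would stay as it is.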

Thank you so much. I’ve incorporated this change into several modified scripts and they work perfectly.