I’m trying to write a bit of code to automatically import Pinboard bookmarks tagged ‘work’ to EagleFiler. This seems to be working all right:
import pinboard
import datetime
import subprocess

pb = pinboard.Pinboard('MY PINBOARD API')  # your Pinboard API token
work = pb.posts.all(tag=["work"])

for bm in work:
    # Quote each tag so the result can be dropped into an AppleScript list
    tagnames = ', '.join(map(lambda x: '"' + str(x) + '"', bm.tags))
    script = '''
    tell application "EagleFiler"
        try -- ignore duplicates
            set {{_record}} to import URLs "{url}" Web page format Web archive format
            set the title of the _record to "{title}"
            set the basename of the _record to "{title}"
            set the note text of the _record to "{description}"
            set the assigned tag names of the _record to {{{tags}}}
            set the creation date of the _record to date "{date}"
        end try
    end tell
    '''
    # '-' makes osascript read the script from stdin
    proc = subprocess.Popen(['osascript', '-'],
                            stdin=subprocess.PIPE,
                            stdout=subprocess.PIPE,
                            universal_newlines=True)
    # %-d is the day of the month (the %-m used before is the month number);
    # %I is the 12-hour clock that goes with %p
    stdout_output = proc.communicate(script.format(url=bm.url,
        title=bm.description, description=bm.extended, tags=tagnames,
        date=bm.time.strftime('%A, %B %-d, %Y at %I:%M:%S %p')))
The idea is that I would run this code from time to time to import my work bookmarks. One issue with the above is that I’d prefer to use Web archives for the import, but I noticed that after repeated runs EagleFiler does not detect duplicates. This is not a problem if I choose bookmark as the import format.
Yes, EagleFiler determines duplicates by file content. So a bookmark with the same title and URL will be considered equal (since that’s what the file stores). Web archives will always be considered unique (since they contain HTTP request data that is different each time).
In the future, I will probably make it an option to determine duplicates by URL. For now, perhaps you could adapt your script to check the source URLs of the existing records before importing.
Thanks @Michael_Tsai, the code below seems to be working. I’m not an expert in either Python or AppleScript, so corrections are welcome:
import pinboard
import datetime
import subprocess

pb = pinboard.Pinboard('MY PINBOARD API')
bookmarks = pb.posts.all(tag=["work"])

# Ask EagleFiler for the source URLs of everything already in the library
script = '''
tell application "EagleFiler"
    tell current library document
        set _urls to source URL of every library record
    end tell
    return _urls
end tell
'''
proc = subprocess.Popen(['osascript', '-'],
                        stdin=subprocess.PIPE,
                        stdout=subprocess.PIPE,
                        universal_newlines=True)
urls = proc.communicate(script)
# strip() drops the trailing newline that osascript prints after the list
urls = urls[0].strip().split(', ')

for bm in bookmarks:
    if bm.url not in urls:
        tagnames = ', '.join(map(lambda x: '"' + str(x) + '"', bm.tags))
        script = '''
        tell application "EagleFiler"
            try -- ignore duplicates
                set {{_record}} to import URLs "{url}" Web page format Web archive format
                set the title of the _record to "{title}"
                set the basename of the _record to "{title}"
                set the note text of the _record to "{description}"
                set the assigned tag names of the _record to {{{tags}}}
                set the creation date of the _record to date "{date}"
            end try
        end tell
        '''
        proc = subprocess.Popen(['osascript', '-'],
                                stdin=subprocess.PIPE,
                                stdout=subprocess.PIPE,
                                universal_newlines=True)
        # %-d is the day of the month; %I is the 12-hour clock that goes with %p
        stdout_output = proc.communicate(script.format(url=bm.url,
            title=bm.description, description=bm.extended, tags=tagnames,
            date=bm.time.strftime('%A, %B %-d, %Y at %I:%M:%S %p')))
I don’t think you should rely on parsing the script output using a comma. I would suggest something like this:
tell application "EagleFiler"
    tell current library document
        set _urls to source URL of every library record
    end tell
    set AppleScript's text item delimiters to return
    return _urls as Unicode text
end tell
and then splitting the lines.
The Python part could be slow if there are a lot of URLs. I would do something like:
urls = set(urls)
so that lookups are faster.
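Putting those two suggestions together, a minimal Python sketch might look like the following. The function names parse_url_lines and new_bookmarks are made up for illustration, and the sample string only stands in for what osascript would print; real output may need extra handling (for example, records that have no source URL).

```python
def parse_url_lines(osascript_output):
    """Return the set of non-empty lines in the text printed by osascript."""
    # str.splitlines() treats "\r", "\n" and "\r\n" as line breaks, so it
    # copes with AppleScript's `return` (a carriage return) as the delimiter.
    return {line.strip() for line in osascript_output.splitlines() if line.strip()}


def new_bookmarks(bookmark_urls, existing_urls):
    """Keep the URLs that are not in the library yet, preserving their order."""
    # existing_urls is a set, so each membership test is a hash lookup
    # rather than a scan through a list.
    return [url for url in bookmark_urls if url not in existing_urls]


# Stand-in for the script output: two URLs separated by a carriage return,
# followed by the newline that osascript appends.
sample_output = "https://example.com/a,b\rhttps://example.org/\n"
existing = parse_url_lines(sample_output)
to_import = new_bookmarks(
    ["https://example.org/", "https://example.net/new"], existing)
```

Note that the first sample URL contains a comma and still comes back in one piece, which is the point of not splitting on commas.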
Lastly, I think you might run into trouble if any of the parameters that you are interpolating into the generated script contain " or \. It might be better to pass them into osascript as arguments, and then they would be received by the run handler.
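As a rough sketch of that idea (not the code from the original reply), the Python side could hand the values to osascript as extra command-line arguments and let an `on run argv` handler pick them up, so nothing has to be escaped inside the script text. The names IMPORT_SCRIPT, osascript_command and import_bookmark are hypothetical, and the sketch only sets the title and note to keep it short.

```python
import subprocess

# The script text is constant; the values arrive in argv, so quotes and
# backslashes in a title or note never become part of the AppleScript source.
IMPORT_SCRIPT = '''
on run argv
    set {_url, _title, _note} to argv
    tell application "EagleFiler"
        set {_record} to import URLs {_url} Web page format Web archive format
        set the title of _record to _title
        set the note text of _record to _note
    end tell
end run
'''


def osascript_command(*args):
    """Build the argv list: '-' reads the script from stdin, the rest go to argv."""
    return ['osascript', '-'] + [str(a) for a in args]


def import_bookmark(url, title, note):
    """Run IMPORT_SCRIPT for one bookmark and return whatever osascript prints."""
    proc = subprocess.Popen(osascript_command(url, title, note),
                            stdin=subprocess.PIPE,
                            stdout=subprocess.PIPE,
                            universal_newlines=True)
    output, _ = proc.communicate(IMPORT_SCRIPT)
    return output
```

In the loop from the earlier post, this would be called as import_bookmark(bm.url, bm.description, bm.extended). Because Popen receives a list and no shell is involved, each value reaches osascript as a single argument regardless of the characters it contains.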
Hi @Michael_Tsai thanks for the feedback, very useful. I didn’t know you could change how AppleScript returns a list, and the Python advice is well taken.
It affects the way AppleScript converts a list to/from text. Before, you were getting an implicit conversion from the list by osascript with the default delimiter (or whatever it was last set to). The modified version makes both explicit.
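A small Python analogy (not AppleScript, and with made-up URLs) may make the difference concrete: joining a list with ", " and splitting it again is not a safe round trip if an item can contain that separator, whereas a delimiter that cannot occur inside a URL is.

```python
def round_trips(items, delimiter):
    """True if joining items with delimiter and splitting again gives items back."""
    return delimiter.join(items).split(delimiter) == items


# Hypothetical URLs; the first one contains ", " itself.
urls = ["https://example.com/search?q=a, b", "https://example.org/"]

comma_ok = round_trips(urls, ", ")    # the first URL is split into two pieces
return_ok = round_trips(urls, "\r")   # a carriage return never appears in a URL
```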
set pinboardAuth to "MY PINBOARD API" -- your Pinboard API token
set pinboardUrl to "https://api.pinboard.in/v1/posts/recent?auth_token=" & pinboardAuth
-- quoted form keeps the shell from interpreting ? and other characters in the URL
set sourceXml to do shell script "curl " & quoted form of pinboardUrl
set theUrls to {}
tell application "System Events"
    set _xmlData to make new XML data with properties {text:sourceXml}
    tell XML element "posts" of _xmlData
        repeat with _post in XML elements
            tell _post
                copy {href:value of XML attribute "href", theTags:value of XML attribute "tag"} to end of theUrls
            end tell
        end repeat
    end tell
end tell
tell application "EagleFiler"
    tell current library document
        set _existingUrls to source URL of every library record
        repeat with _theUrl in theUrls
            if _existingUrls does not contain href of _theUrl then
                try -- matches the end try below, which had no opening try
                    set {_record} to import URLs {href of _theUrl}
                    set _newTags to {}
                    repeat with _theTag in words of theTags of _theUrl
                        copy tag _theTag to end of _newTags
                    end repeat
                    set _record's assigned tags to _newTags
                end try
            end if
        end repeat
    end tell
end tell