As part of analyzing my 18 months as a nomad, I wanted to look at how many pictures I took and analyze some of the metadata to determine photos by state, photos by trip, and more.
I assumed this was going to be an easy task: all I wanted to do was download Flickr metadata to a CSV file. I figured I would just use some type of export feature in Flickr. It turns out it’s not so simple, mostly because Flickr doesn’t include a feature to export metadata to a CSV file (why, Flickr? why????).
I did a lot of Google searching (and soul searching to see how important it was to have this data) and eventually came upon this post by Joshua at HunterTrek, who created a Python script for getting metadata using the Flickr API.
Part of me felt like giving up, as there are quite a few steps involved in getting it to work, namely downloading a few programs so you can run a Python script. Ultimately the allure of data was too much, and I followed their steps.
So here is how I exported Flickr metadata to a CSV file:
- Downloaded ActivePython for Mac.
- Downloaded the Flickr API for Python 2.7.
- Installed the Flickr API following these instructions.
- Downloaded Flickr Metadata Python script from Joshua.
- Requested a Flickr API Key for non-commercial use and got one.
- Determined my Flickr ID (top right corner).
- Opened the Python script and added my API key, secret, and Flickr ID.
- Ran the Python script and got an error, “flickr.get.token.part.one”.
- Googled and found this solution to the error, “flickr.get.token.part.one”.
- Ran the Python script successfully.
- Exported the database to CSV using the terminal command flickr_photo_metadata_download.py -export.
- Opened the CSV in Excel.
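For the curious, the export step amounts to reading rows out of the script's SQLite database and writing them out with Python's csv module. Here is a minimal sketch of that idea, assuming Python 3. It is not Joshua's actual code: the function name export_photos_to_csv, the trimmed-down three-column table, and the sample rows are all made up for illustration.

```python
import csv
import io
import sqlite3


def export_photos_to_csv(db, out_file):
    """Write every row of the photos table to out_file as CSV; return the row count."""
    cursor = db.execute("SELECT id, photo_title, photo_url FROM photos ORDER BY id")
    writer = csv.writer(out_file)
    # Header names are illustrative, loosely following the script's export header.
    writer.writerow(["PhotoID", "FileName", "URL"])
    count = 0
    for row in cursor:
        writer.writerow(row)  # the csv module quotes values containing commas
        count += 1
    return count


# Demo with an in-memory database so nothing touches disk (hypothetical data).
demo_db = sqlite3.connect(":memory:")
demo_db.execute("CREATE TABLE photos (id int, photo_title text, photo_url text)")
demo_db.executemany(
    "INSERT INTO photos VALUES (?,?,?)",
    [
        (1, "Sunset, Utah", "https://example.com/1"),
        (2, "Camp", "https://example.com/2"),
    ],
)
buffer = io.StringIO()
rows_written = export_photos_to_csv(demo_db, buffer)
csv_text = buffer.getvalue()
```

To write a real file instead of an in-memory buffer, pass a file opened with open("photos.csv", "w", newline="") in place of the StringIO object.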
Now ideally this is where the story would end. And it almost did, but unfortunately the data exported from the Python script does not include Album (aka Photoset) information. That’s one of the pieces I wanted to analyze. So I looked for a way to modify the script and (eventually) was successful.
A bit more Googling helped me determine how to grab photoset information from the API. Weirdly, it’s not returned by the flickr.photos.getInfo call like everything else is, but by a separate call, flickr.photos.getAllContexts.
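To give a feel for what that call returns, here is a small sketch that pulls album titles out of a getAllContexts-style response using Python's standard ElementTree module. The sample XML is hand-written to mimic the response shape (set elements for albums, pool elements for groups), with made-up IDs and titles, and the helper names album_titles and first_album_title are hypothetical, not part of the Flickr API library.

```python
import xml.etree.ElementTree as ET

# Hand-written sample in the shape of a getAllContexts response (made-up values).
SAMPLE_RESPONSE = """
<rsp stat="ok">
  <set id="111" title="Utah Road Trip" />
  <set id="222" title="Best of 2014" />
  <pool id="333@N01" title="Some Group" />
</rsp>
""".strip()


def album_titles(response_xml):
    """Return the titles of every album (set) listed in the response."""
    root = ET.fromstring(response_xml)
    # Only <set> elements are albums; <pool> elements are groups and are skipped.
    return [element.attrib["title"] for element in root.findall("set")]


def first_album_title(response_xml, default=""):
    """Return the first album title, or default if the photo is in no album."""
    titles = album_titles(response_xml)
    return titles[0] if titles else default


titles = album_titles(SAMPLE_RESPONSE)
first = first_album_title(SAMPLE_RESPONSE)
```

A photo can belong to several albums, so collecting every set title is an option if the first one alone is not enough.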
Through trial and error, I got it working by changing / adding the following lines to the Python Script:
In dedup_photos, I added photo_album text to the end of the column list:
db.execute("CREATE TABLE temptable (id int, photo_title text, photo_origformat text, photo_media text, photo_description text, photo_date_posted text, photo_date_taken text, photo_url text, photo_album text)")
In export, I added 'PhotoSets' to the header row:
outputwriter.writerow(['PhotoID', 'FileName', 'FileFormat', 'MediaType', 'Description', 'UploadDateTime', 'CreatedDateTime', 'URL', 'PhotoSets', 'Tags'])
Where the script connects to the database, I added photo_album text to the end of the column list:
db.execute("CREATE TABLE IF NOT EXISTS photos(id int, photo_title text, photo_origformat text, photo_media text, photo_description text, photo_date_posted text, photo_date_taken text, photo_url text, photo_album text)")
Where the script queries Flickr for photo metadata, I added the following lines after photo_url:
photoalbuminfo = flickr.photos_getAllContexts(photo_id=id)
photo_album = photoalbuminfo.find('set').attrib['title']  # title of the first album (set) only; raises AttributeError if the photo is in no album
And I added photo_album to the end of the tuple:
photo_all_info = (id, photo_title, photo_origformat, photo_media, photo_description, photo_date_posted, photo_date_taken, photo_url, photo_album)
And an extra question mark (nine in total, one per column) in
db.execute("INSERT INTO photos values (?,?,?,?,?,?,?,?,?)", photo_all_info)
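The question-mark count has to match the number of columns, which is easy to get wrong when adding a field. Below is a small sketch, not part of Joshua's script, that builds the placeholders from a column list so the two cannot drift apart. The names COLUMNS and insert_photo are hypothetical, and the sample row is made up.

```python
import sqlite3

# Column names taken from the CREATE TABLE statement, including the new photo_album.
COLUMNS = [
    "id", "photo_title", "photo_origformat", "photo_media", "photo_description",
    "photo_date_posted", "photo_date_taken", "photo_url", "photo_album",
]


def insert_photo(db, photo_all_info):
    """Insert one photo row, checking the value count against the column list."""
    if len(photo_all_info) != len(COLUMNS):
        raise ValueError(
            "expected %d values, got %d" % (len(COLUMNS), len(photo_all_info))
        )
    # One "?" per column, e.g. "?,?,?,?,?,?,?,?,?" for nine columns.
    placeholders = ",".join("?" * len(COLUMNS))
    db.execute("INSERT INTO photos VALUES (%s)" % placeholders, photo_all_info)


# Demo against an in-memory database (hypothetical sample data).
db = sqlite3.connect(":memory:")
db.execute(
    "CREATE TABLE IF NOT EXISTS photos(id int, photo_title text, "
    "photo_origformat text, photo_media text, photo_description text, "
    "photo_date_posted text, photo_date_taken text, photo_url text, "
    "photo_album text)"
)
insert_photo(db, (1, "Sunset", "jpg", "photo", "", "2015-01-01",
                  "2014-12-31", "https://example.com/1", "Utah Road Trip"))
stored_album = db.execute(
    "SELECT photo_album FROM photos WHERE id = 1"
).fetchone()[0]
```

One thing to watch: CREATE TABLE IF NOT EXISTS leaves an existing table untouched, so a database file created by the original eight-column script would presumably need to be deleted or rebuilt before the nine-value insert works.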
If you want to do the same thing, you can download an updated copy of Joshua’s Python script here: Flickr Download Photo Metadata with Photoset Information
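Once the CSV includes the PhotoSets column, counting photos per album takes only a few lines if you would rather not use Excel. This is a rough sketch, assuming Python 3. The function name photos_per_album and the sample rows are made up, and only the PhotoSets header comes from the export above.

```python
import csv
import io
from collections import Counter


def photos_per_album(csv_file, column="PhotoSets"):
    """Count rows per album name; rows with an empty album are grouped together."""
    reader = csv.DictReader(csv_file)
    return Counter(row[column] or "(no album)" for row in reader)


# Made-up sample standing in for the exported file.
sample = io.StringIO(
    "PhotoID,FileName,PhotoSets\n"
    "1,a.jpg,Utah\n"
    "2,b.jpg,Utah\n"
    "3,c.jpg,\n"
)
counts = photos_per_album(sample)
```

With a real export, open the CSV file with open(path, newline="") and pass the file object in place of the StringIO sample.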