Publishing Org Documents to Google Drive
I recently had to write a couple of company policies and playbooks at work, where we use Google Drive extensively.
I started writing the documents directly in Google Docs, but quickly found out how painful that was, at least compared to writing documents in Orgmode.
So I switched to writing them in Orgmode, thinking that publishing and uploading to Google Drive would be straightforward. I thought I could just publish the documents as HTML and upload them to Google Drive. Unfortunately it wasn’t quite as simple as that.
Publishing
My first attempt was to publish the Org files with the HTML backend, upload the results to Google Drive, and convert them to Google Docs. But the resulting Google Docs were not very good: the links in the table of contents did not work at all, the formatting was off, and so on.
My next attempt was to export as ODT files. This time the conversion to Google Docs was much better and had none of the issues the HTML files had. Satisfied with the conversion result, I quickly got to configuring my Orgmode project to use the ODT backend for publishing. But to my annoyance, the ox-odt.el package did not have a publishing function usable with org-publish-project-alist at all.
But with a little bit of digging and a lot of trial and error, I ended up with the following implementation of a publishing function for ODT:
    (require 'ox-odt)
    (require 'ox-publish)

    ;;;###autoload
    (defun org-odt-publish-to-odt (plist filename pub-dir)
      "Publish an org file to ODT.

    FILENAME is the filename of the Org file to be published.  PLIST
    is the property list of the given project.  PUB-DIR is the publishing
    directory.

    Return output file name."
      (unless (or (not pub-dir) (file-exists-p pub-dir))
        (make-directory pub-dir t))
      ;; Check if a buffer visiting FILENAME is already open.
      (let* ((org-inhibit-startup t)
             (visiting (find-buffer-visiting filename))
             (work-buffer (or visiting (find-file-noselect filename))))
        (unwind-protect
            (with-current-buffer work-buffer
              (let ((outfile (org-export-output-file-name ".odt" nil pub-dir)))
                (org-odt--export-wrap
                 outfile
                 (let* ((org-odt-embedded-images-count 0)
                        (org-odt-embedded-formulas-count 0)
                        (org-odt-object-counters nil)
                        (hfy-user-sheet-assoc nil))
                   ;; Export with the project's PLIST, then write the result
                   ;; into content.xml inside the temporary ODT directory.
                   (let ((output (org-export-as
                                  'odt nil nil nil
                                  (org-combine-plists
                                   plist
                                   `(:crossrefs
                                     ,(org-publish-cache-get-file-property
                                       (expand-file-name filename) :crossrefs nil t)
                                     :filter-final-output
                                     (org-publish--store-crossrefs
                                      org-publish-collect-index
                                      ,@(plist-get plist :filter-final-output))))))
                         (out-buf (progn
                                    (require 'nxml-mode)
                                    (let ((nxml-auto-insert-xml-declaration-flag nil))
                                      (find-file-noselect
                                       (concat org-odt-zip-dir "content.xml") t)))))
                     (with-current-buffer out-buf
                       (erase-buffer)
                       (insert output)))))))
          ;; Cleanup form: remove the buffer opened in the process, even if
          ;; the export signals an error.
          (unless visiting (kill-buffer work-buffer)))))
It’s a combination of org-publish-org-to, which is what ox-html and ox-latex use for publishing as HTML and LaTeX/PDF respectively, and org-odt-export-to-odt, the function for exporting individual Org files as ODT.
Uploading
The next step is to upload the ODT files to Google Drive and convert them to Google Docs.
For this I decided to use Python, since Google provides a great client library for their APIs.
Below is the entire script. It is quite simple and not very flexible, but it works well for my purposes. One prerequisite is that the directory structure in Google Drive must be pre-created manually to match the directory structure of the publish directory.
The script is configured through four constants at the top:
- CREDENTIALS_PATH: path to where the credentials data will be stored
- SECRETS_PATH: path to the client secrets file downloaded from the Google Cloud Console project
- PUBLISH_DIR: path to the :publishing-directory property
- DRIVE_FOLDER_ROOT: the ID of the root Google Drive folder where the documents should be uploaded
    import logging
    import os
    import webbrowser

    import httplib2
    import googleapiclient.discovery
    import oauth2client.client
    import oauth2client.file
    from googleapiclient.http import MediaFileUpload


    CREDENTIALS_PATH = 'credentials.json'
    SECRETS_PATH = 'secrets.json'
    PUBLISH_DIR = 'published'
    DRIVE_FOLDER_ROOT = '0000AAAABBBBCCCC'

    logging.getLogger('googleapiclient').setLevel(logging.WARNING)
    logging.basicConfig(level=logging.INFO, format='%(message)s')


    def get_auth():
        """
        Load credentials from file or otherwise authorize for new credentials.
        """
        storage = oauth2client.file.Storage(CREDENTIALS_PATH)
        credentials = storage.get()
        if credentials is None:
            flow = oauth2client.client.flow_from_clientsecrets(
                SECRETS_PATH,
                scope='https://www.googleapis.com/auth/drive',
                redirect_uri='urn:ietf:wg:oauth:2.0:oob')
            auth_uri = flow.step1_get_authorize_url()
            webbrowser.open(auth_uri)
            # Python 3: input() replaces Python 2's raw_input().
            auth_code = input('Enter the auth code: ')
            credentials = flow.step2_exchange(auth_code)
            storage.put(credentials)
        return credentials


    def find_folder_id(client, odt_file):
        """
        Find the correct folder to upload the odt_file to in Google Drive.
        """
        folder_path = os.path.dirname(odt_file).split(os.path.sep)
        # Start at the root folder; files directly in PUBLISH_DIR stay there.
        drive_folder_id = DRIVE_FOLDER_ROOT
        q = "mimeType='application/vnd.google-apps.folder' and '{id}' in parents"
        for path in folder_path:
            if path == PUBLISH_DIR:
                continue
            resp = client.files().list(
                corpora='user',
                q=q.format(id=drive_folder_id)).execute(num_retries=2)
            for f in resp.get('files', []):
                if f['name'] == path:
                    drive_folder_id = f['id']
                    break
            else:
                # No folder with a matching name under the current parent.
                raise ValueError('Failed to find folder "%s" in Google Drive. '
                                 'Make sure it already exists.' % path)
        return drive_folder_id


    def find_existing_file_id(client, folder_id, drive_file_name):
        """
        Find an existing Google Doc file with the same drive_file_name in folder_id.
        """
        q = "mimeType='application/vnd.google-apps.document' and '{id}' in parents and name='{name}'"
        # Single quotes inside a query string literal must be escaped as \'.
        name = drive_file_name.replace("'", "\\'")
        resp = client.files().list(
            corpora='user',
            q=q.format(id=folder_id, name=name)).execute(num_retries=2)
        if resp.get('files'):
            return resp['files'][0]['id']
        return None


    def upload(odt_file):
        """
        Upload an individual odt_file to Google Drive.
        """
        if not odt_file.endswith('.odt'):
            return
        credentials = get_auth()
        http = credentials.authorize(httplib2.Http())
        client = googleapiclient.discovery.build('drive', 'v3', http=http,
                                                 cache_discovery=False)
        folder_id = find_folder_id(client, odt_file)
        # "code_of_conduct.odt" becomes the document title "Code Of Conduct".
        stem = os.path.splitext(os.path.basename(odt_file))[0]
        drive_file_name = ' '.join(s.title() for s in stem.split('_'))
        existing_file_id = find_existing_file_id(client, folder_id, drive_file_name)

        # Asking for the Google Docs MIME type makes Drive convert the ODT.
        file_metadata = {
            'name': drive_file_name,
            'mimeType': 'application/vnd.google-apps.document',
        }
        media = MediaFileUpload(odt_file,
                                mimetype='application/vnd.oasis.opendocument.text',
                                resumable=True)
        if not existing_file_id:
            file_metadata['parents'] = [folder_id]
            logging.info('Creating "%s" from %s...', drive_file_name, odt_file)
            client.files().create(body=file_metadata,
                                  media_body=media,
                                  fields='id').execute()
        else:
            logging.info('Updating "%s" from %s...', drive_file_name, odt_file)
            client.files().update(fileId=existing_file_id,
                                  body=file_metadata,
                                  media_body=media,
                                  fields='id').execute()


    def upload_directory(directory):
        """
        Recursively upload all odt files in directory to Google Drive.
        """
        # os.walk already descends into subdirectories, so no explicit
        # recursion is needed.
        for root, dirs, files in os.walk(directory):
            for f in files:
                upload(os.path.join(root, f))


    if __name__ == '__main__':
        upload_directory(PUBLISH_DIR)
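The way the script maps a published file to a Drive title and a chain of folder names can be looked at in isolation. The sketch below is not part of the script: the helper names drive_title, folder_components and escape_query_value are hypothetical and chosen only for illustration. It also shows the escaping that Drive's query syntax expects for single quotes inside a string literal; escaping backslashes as well is an assumption on my part.

```python
import os


def drive_title(odt_file):
    """Derive the document title from the file name, e.g. code_of_conduct.odt."""
    stem = os.path.splitext(os.path.basename(odt_file))[0]
    return ' '.join(word.title() for word in stem.split('_'))


def folder_components(odt_file, publish_dir):
    """Return the folder names between publish_dir and the file, outermost first."""
    rel_dir = os.path.dirname(os.path.relpath(odt_file, publish_dir))
    # A file directly inside publish_dir has no intermediate folders.
    return [part for part in rel_dir.split(os.path.sep) if part]


def escape_query_value(value):
    """Escape a value for use inside a single-quoted Drive query literal."""
    # Backslash first (assumption), then the single quote.
    return value.replace('\\', '\\\\').replace("'", "\\'")


if __name__ == '__main__':
    path = os.path.join('published', 'policies', 'code_of_conduct.odt')
    print(drive_title(path))
    print(folder_components(path, 'published'))
    print(escape_query_value("Valentine's Day"))
```

Each name returned by folder_components has to exist in Google Drive already, nested in the same order under the root folder, which is the prerequisite mentioned earlier.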
The script recursively uploads all ODT documents in the publish directory to the specified folder in Google Drive, following the same folder structure. One possible improvement would be to upload only the recently modified files.
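As a rough sketch of that improvement, one option is to compare each file's modification time against a stamp file that is touched after every successful run. This is only one possible approach, not something the script does; the names modified_since and touch, and the idea of a stamp file, are assumptions made for illustration.

```python
import os


def modified_since(directory, stamp_file):
    """Yield .odt paths under directory that are newer than stamp_file."""
    try:
        last_run = os.path.getmtime(stamp_file)
    except OSError:
        last_run = 0.0  # no stamp yet: treat every file as modified
    for root, _dirs, files in os.walk(directory):
        for name in sorted(files):
            path = os.path.join(root, name)
            if name.endswith('.odt') and os.path.getmtime(path) > last_run:
                yield path


def touch(stamp_file):
    """Create stamp_file if needed and set its modification time to now."""
    with open(stamp_file, 'a'):
        pass
    os.utime(stamp_file, None)
```

With these in place, upload_directory could loop over modified_since(PUBLISH_DIR, stamp_file) instead of walking every file, and call touch(stamp_file) once all uploads have succeeded.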
So with these two pieces of code in place, I can now publish and upload to Google Drive with the following sequence of commands from Emacs:
- C-c C-e P p to publish
- M-! python upload2drive.py to upload to Google Drive