Convert any file with Apps Script

The Drive API offers a whole range of conversions between mimeTypes, but it’s a little fiddly to figure out exactly how. This library takes a file and a desired output format and converts it for you. Sometimes there’s no direct route: for example, to convert a Word file to a PDF, it first needs to be converted to a Google Doc, then to a PDF. This library automatically works out and performs any intermediate conversions required.


Any file ?

Not any file conversion of course, but any file conversion that can be done using the Drive API. I won’t list them all here, but Drive knows about 53 different mimeTypes, can do 42 different types of imports, and 26 different kinds of exports – so there are many different combinations of input and outputs that are supported.

Getting started

You’ll need this library

bmWeConvertAnyFile – 1dZjyNnLCMS_2oEcFRjOf9oYu0qNHkfkfQovycSytn1FhsABTXy0Wnp4z (github)

Test folder

I put a series of test files in folder id ‘1L-b0Ma8M8jRQtGWCQLf2U-yDLXfEMEWr’, which is public and can be used for testing if required.

Exports script

I like to use an Exports file to organize scripts and abstract the library source – you can copy and use this one in your project which is already set up to use the conversion library.

var Exports = {


  get libExports() {
    return bmWeConvertAnyFile.Exports
  },

  get Deps() {
    return this.libExports.Deps
  },

  get AppStore () {
    return this.libExports.AppStore
  },

  
  /**
   * RouteFinder instance with validation
   * @param {...*} args
   * @return {RouteFinder} a proxied instance of RouteFinder with property checking enabled
   */
  newRouteFinder(...args) {
    return this.libExports.newRouteFinder(...args)
  },

  /**
   * Drv instance with validation
   * @param {...*} args
   * @return {Drv} a proxied instance of Drv with property checking enabled
   */
  newDrv(...args) {
    return this.libExports.newDrv(...args)
  },

  /**
   * Utils namespace
   * @return {Utils} 
   */
  get Utils() {
    return this.libExports.Utils
  },
  
  
}

exports

Dependencies

Since the library is dependency free, you’ll need to pass it your fetch and token services, like this, before any other code runs. Note also the comment that mentions DriveApp. It is never executed, but its presence causes Apps Script to provoke an auth dialog. You’ll also need to enable the Drive API. I find the easiest way to do this is to run your script once to let it fail, then follow the link given in the error message to enable the API.

const setup = () => {

  // provoke a drive auth dialog DriveApp.getRootFolder()

  // set up dependencies for libraries
  Exports.Deps.init({
    tokenService: ScriptApp.getOAuthToken,
    fetch: UrlFetchApp.fetch
  })

}

setup
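To see why the library needs these two services, here is a minimal sketch of the dependency injection pattern that runs outside Apps Script. Everything in it (FakeDeps, authorizedGet, fakeTokenService, fakeFetch) is a hypothetical stand-in and not the library's actual implementation. It only shows how a dependency-free module can call whatever fetch and token functions you hand it.

```javascript
// A hypothetical stand-in for a dependency registry - not the library's real code
const FakeDeps = {
  services: null,
  init({ tokenService, fetch }) {
    if (typeof tokenService !== 'function' || typeof fetch !== 'function') {
      throw new Error('tokenService and fetch must both be functions')
    }
    this.services = { tokenService, fetch }
  }
}

// a module that only talks to the outside world through the injected services
const authorizedGet = (url) => {
  if (!FakeDeps.services) throw new Error('call FakeDeps.init() first')
  const { tokenService, fetch } = FakeDeps.services
  return fetch(url, {
    method: 'get',
    headers: { Authorization: `Bearer ${tokenService()}` }
  })
}

// stand-ins for ScriptApp.getOAuthToken and UrlFetchApp.fetch
const fakeTokenService = () => 'fake-token'
const fakeFetch = (url, options) => ({ url, options })

FakeDeps.init({ tokenService: fakeTokenService, fetch: fakeFetch })
const result = authorizedGet('https://example.com/files')
console.log(result.options.headers.Authorization)
```

Because the services are injected, the same module can be exercised with fakes like these in a test, and with the real Apps Script services when deployed.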

Examples init

All the examples start with this – if the output folder (specified by path) doesn’t exist it will create it.

  // create dependencies
  setup()

  // get a drv instance
  const drv = Exports.newDrv()

  // the shared folder
  const inputFolderId = '1L-b0Ma8M8jRQtGWCQLf2U-yDLXfEMEWr'

  // the path to the output folder - where you want to put the results
  const outputFolderPath = '/test/conversions/output'

  // get the  outputfolder from its path
  const outputFolder  = drv.getFolder({ 
    path: outputFolderPath, 
    createIfMissing: true 
  }).data

setup

Drv API

The library also provides access to my bmApiCentral library which contains a collection of API wrappers, including one to the Drive REST API, to do stuff that isn’t easily available (or available at all) through a built in Apps Script service. This library supports using either Drive Ids or folder paths to specify target files. You don’t need to add it to your project explicitly as it’s exposed automatically via the conversion library, which is the main subject of this article.

Example

Some of the conversions you might want to do could be

Anything to PDF (for example a word document, or a google slide)
A Google Sheet to Excel
A PDF to a word document
A PNG image to a JPEG
A PNG image to a word document – OCR is automatic
…etc

Word to pdf

This will find all the Word files (application/msword), convert them to PDFs (application/pdf), and put the results in /test/conversions/output.

  // get all the word files in the folder and convert them to pdf
  const conversions = drv.list ({
      id: inputFolderId, 
      query: `mimeType = 'application/msword'`
    })
    .data
    .files
    .map(file=>Exports.AppStore.convert({
       outputFolder, 
       to: 'application/pdf', 
       file
    }))

word to pdf

Google Sheet to Excel

As above, but this time convert all the sheets in a given folder to excel.

    drv.list ({
      id: inputFolderId, 
      query: `mimeType = 'application/vnd.google-apps.spreadsheet'`
    })
    .data
    .files
    .map(file=>Exports.AppStore.convert({
      outputFolder, 
      to: 'application/vnd.openxmlformats-officedocument.spreadsheetml.sheet', 
      file
    }))

sheets to excel

PDF to word

As above, but convert all the PDFs to Word.

    drv.list ({
      id: inputFolderId, 
      query: `mimeType = 'application/pdf'`
    })
    .data
    .files
    .map(file=>Exports.AppStore.convert({
      outputFolder, 
      to: 'application/vnd.openxmlformats-officedocument.wordprocessingml.document', 
      file
    }))

pdf to word

Png to jpeg

As above but convert png to jpeg

    drv.list ({
      id: inputFolderId, 
      query: `mimeType = 'image/png'`
    })
    .data
    .files
    .map(file=>Exports.AppStore.convert({
      outputFolder, 
      to: 'image/jpeg', 
      file
    }))

png to jpeg

PNG to Word

Converting images to non-image formats (for example PDF, Google Doc or Word) automatically applies OCR to attempt to convert any text in the image to editable text in the target document. The code is exactly the same as in the other examples.

  const conversions = drv.list ({
      id: inputFolderId, 
      query: `mimeType = 'image/png'`
    })
    .data
    .files
    .map(file=>Exports.AppStore.convert({
       outputFolder, 
       to: 'application/vnd.openxmlformats-officedocument.wordprocessingml.document', 
       file
    }))

Conversions with auto OCR
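The examples above repeat some long mimeType strings, so a small lookup table can make them easier to read. This is only a sketch: MimeTypes and mimeQuery are hypothetical names of my own and are not part of the library. The query string it builds has the same shape as the ones passed to drv.list above.

```javascript
// short aliases for the mimeTypes used in the examples above
const MimeTypes = Object.freeze({
  doc: 'application/msword',
  docx: 'application/vnd.openxmlformats-officedocument.wordprocessingml.document',
  xlsx: 'application/vnd.openxmlformats-officedocument.spreadsheetml.sheet',
  pdf: 'application/pdf',
  png: 'image/png',
  jpeg: 'image/jpeg',
  googleSheet: 'application/vnd.google-apps.spreadsheet'
})

// build a query string of the kind used with drv.list in the examples
const mimeQuery = (alias) => {
  const known = Object.prototype.hasOwnProperty.call(MimeTypes, alias)
  if (!known) throw new Error(`unknown mimeType alias: ${alias}`)
  return `mimeType = '${MimeTypes[alias]}'`
}

console.log(mimeQuery('png'))
```

With this in place, a call could use query: mimeQuery('png') and to: MimeTypes.docx instead of the literal strings.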

The file object

In all the examples above, the file and outputFolder are passed to the convert method. They contain a number of properties which are generated by drv. In case you generate these in some other way, you can pass the id properties specifically. Here is an example call to the convert method where you have obtained the ids elsewhere.

    const conversion = Exports.AppStore.convert({
       outputFolder: {
        id: "1yKfj31A59pQG4ZfGogWgDIYkQWjexB7m"
       },
       to: 'application/vnd.openxmlformats-officedocument.wordprocessingml.document', 
       file: {
        id: "1L5ZEYbMv3ZnADLUVQqe3NvloHh6IbkFo"
       }
    })

example direct single call to convert
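If your ids come from a mixture of sources, a small normalizer can save some repetition. The sketch below assumes, as the example above suggests, that convert only needs an id property on file and outputFolder. The helpers toFileRef and convertArgs are hypothetical and not part of the library.

```javascript
// hypothetical helper: accept either a bare Drive id or an object that already has one
const toFileRef = (idOrFile) => {
  if (typeof idOrFile === 'string' && idOrFile) return { id: idOrFile }
  if (idOrFile && typeof idOrFile.id === 'string' && idOrFile.id) return idOrFile
  throw new Error('expected a Drive id or an object with an id property')
}

// build the argument object for a convert call from ids obtained elsewhere
const convertArgs = ({ file, outputFolder, to }) => ({
  file: toFileRef(file),
  outputFolder: toFileRef(outputFolder),
  to
})

const args = convertArgs({
  file: '1L5ZEYbMv3ZnADLUVQqe3NvloHh6IbkFo',
  outputFolder: { id: '1yKfj31A59pQG4ZfGogWgDIYkQWjexB7m' },
  to: 'application/pdf'
})
console.log(JSON.stringify(args))
```

The resulting object could then be passed straight to Exports.AppStore.convert(args).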

Response

In addition to writing the target file, a conversion also returns a response in case you want to do some further processing, for example emailing the converted file. The response for the png->word example looks like this. You can see the details of the files created at each stage of the conversion. The deletions property contains any temporary files created during the conversion process. They are automatically deleted after conversion, but if you happen to need them for any reason you can of course retrieve them from the bin using the given ids.

[{
	"file": {
		"kind": "drive#file",
		"id": "1eykJl63zx6wvF-irZqcuYZyC1huk1HU5",
		"name": "ocrexample.docx",
		"mimeType": "application/vnd.openxmlformats-officedocument.wordprocessingml.document",
		"parents": ["1yKfj31A59pQG4ZfGogWgDIYkQWjexB7m"],
		"md5Checksum": "382293a9904dad1d20ca19f9c744995b",
		"size": "110888"
	},
	"route": [{
		"from": "image/png",
		"to": "application/vnd.google-apps.document",
		"action": "import",
		"files": {
			"input": {
				"kind": "drive#file",
				"mimeType": "image/png",
				"id": "1L5ZEYbMv3ZnADLUVQqe3NvloHh6IbkFo",
				"name": "ocrexample.png"
			},
			"output": {
				"kind": "drive#file",
				"id": "1GvTn_LuIy2G9_Px08d669FsKIvuyA4EzXlT35jxkguA",
				"name": "__canbedeleted__7a65a98a94f43e7d1a73829b77317da7",
				"mimeType": "application/vnd.google-apps.document",
				"parents": ["0AN2ExLh4POiZUk9PVA"],
				"size": "1"
			}
		}
	}, {
		"from": "application/vnd.google-apps.document",
		"action": "export",
		"to": "application/vnd.openxmlformats-officedocument.wordprocessingml.document",
		"files": {
			"input": {
				"kind": "drive#file",
				"id": "1GvTn_LuIy2G9_Px08d669FsKIvuyA4EzXlT35jxkguA",
				"name": "__canbedeleted__7a65a98a94f43e7d1a73829b77317da7",
				"mimeType": "application/vnd.google-apps.document",
				"parents": ["0AN2ExLh4POiZUk9PVA"],
				"size": "1"
			},
			"output": {
				"kind": "drive#file",
				"id": "1eykJl63zx6wvF-irZqcuYZyC1huk1HU5",
				"name": "__canbedeleted__382293a9904dad1d20ca19f9c744995b",
				"mimeType": "application/vnd.openxmlformats-officedocument.wordprocessingml.document",
				"parents": ["0AN2ExLh4POiZUk9PVA"],
				"md5Checksum": "382293a9904dad1d20ca19f9c744995b",
				"size": "110888"
			}
		}
	}],
	"deletions": [{
		"kind": "drive#file",
		"id": "1GvTn_LuIy2G9_Px08d669FsKIvuyA4EzXlT35jxkguA",
		"name": "__canbedeleted__7a65a98a94f43e7d1a73829b77317da7",
		"mimeType": "application/vnd.google-apps.document",
		"parents": ["0AN2ExLh4POiZUk9PVA"],
		"size": "114639"
	}]
}]

example response
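For follow-up processing you usually only need a few fields from each response. Here is a sketch that reduces the array produced by the map in the examples to the final file, the actions taken and the ids of the binned temporary files. It uses a trimmed copy of the response above as sample data, and summarize is a hypothetical helper, not a library method.

```javascript
// a trimmed copy of the response shown above (only the fields used here)
const sampleResponse = [{
  file: { id: '1eykJl63zx6wvF-irZqcuYZyC1huk1HU5', name: 'ocrexample.docx' },
  route: [
    { from: 'image/png', to: 'application/vnd.google-apps.document', action: 'import' },
    {
      from: 'application/vnd.google-apps.document',
      to: 'application/vnd.openxmlformats-officedocument.wordprocessingml.document',
      action: 'export'
    }
  ],
  deletions: [{ id: '1GvTn_LuIy2G9_Px08d669FsKIvuyA4EzXlT35jxkguA' }]
}]

// hypothetical helper: keep just the parts needed for follow-up work
const summarize = (conversions) => conversions.map(({ file, route = [], deletions = [] }) => ({
  id: file.id,
  name: file.name,
  actions: route.map(step => step.action),
  binnedIds: deletions.map(d => d.id)
}))

const summary = summarize(sampleResponse)
console.log(summary[0].name, summary[0].actions.join(' > '))
```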

How does it work ?

Each of the examples above has exactly the same code (other than the from/to mimeTypes), but the conversion method varies behind the scenes. Some conversions pass through more than 1 step (for example word->pdf is first imported into a Google Doc, then exported to a PDF).

RouteFinder

There are 4 stages of conversions (actions) that the API might choose to combine.

Importing a file (eg a Word file) to a Google Document
Exporting a file (eg a PDF) from a Google Document
Converting (blobVerting) a blob (eg 1 image type to another)
Copying a file – (pdf to pdf)

The library features a RouteFinder class which is at the heart of the conversion logic, and which is used by the convert method to figure out how to convert from one format to another. You don’t need to use it directly to convert a file, but you may want to in order to find out what conversions are possible, and how they work.

You can get an instance of RouteFinder like this, if you are interested in digging into the detail.

const rf = Exports.newRouteFinder()

get a routefinder instance
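To give a feel for the idea of route finding, here is a minimal sketch that treats each possible import or export as an edge in a graph and uses a breadth-first search to find the shortest chain of steps. This is my illustration of the general technique, not the library's actual algorithm. The tables are tiny hand-picked extracts, blob conversions are left out, and the 'copy' action name is an assumption.

```javascript
// small hand-picked extracts of import and export tables (not the full Drive lists)
const importTypes = {
  'application/vnd.openxmlformats-officedocument.wordprocessingml.document': ['application/vnd.google-apps.document'],
  'text/tab-separated-values': ['application/vnd.google-apps.spreadsheet']
}
const exportTypes = {
  'application/vnd.google-apps.document': ['application/pdf', 'text/plain'],
  'application/vnd.google-apps.spreadsheet': ['application/pdf', 'text/csv']
}

// flatten a { from: [to, ...] } table into edges tagged with an action
const toEdges = (table, action) => Object.entries(table)
  .flatMap(([from, tos]) => tos.map(to => ({ from, to, action })))

const edges = [...toEdges(importTypes, 'import'), ...toEdges(exportTypes, 'export')]

// breadth-first search: returns the shortest list of steps, or null if there is none
const findRoute = ({ from, to }) => {
  // 'copy' is an assumed action name for same-type conversions
  if (from === to) return [{ from, to, action: 'copy' }]
  const queue = [{ type: from, steps: [] }]
  const seen = new Set([from])
  while (queue.length) {
    const { type, steps } = queue.shift()
    for (const edge of edges.filter(e => e.from === type)) {
      const next = [...steps, edge]
      if (edge.to === to) return next
      if (!seen.has(edge.to)) {
        seen.add(edge.to)
        queue.push({ type: edge.to, steps: next })
      }
    }
  }
  return null
}

const wordToPdf = findRoute({
  from: 'application/vnd.openxmlformats-officedocument.wordprocessingml.document',
  to: 'application/pdf'
})
console.log(wordToPdf.map(step => step.action))
```

With these cut-down tables, a route that exists in the real library (such as PDF to Word) comes back as null, simply because the sketch does not know about it.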

Known types.

rf.knownTypes returns all the known mimeTypes as a list like the one below. Since RouteFinder uses the Drive API itself to discover which conversions are supported, if Google happens to add any more, it will automatically detect and use those routes.

console.log (rf.knownTypes)

/*
[ 'application/epub+zip',
  'application/msword',
  'application/pdf',
  'application/rtf',
  'application/vnd.google-apps.document',
  'application/vnd.google-apps.drawing',
  'application/vnd.google-apps.form',
  'application/vnd.google-apps.jam',
  'application/vnd.google-apps.presentation',
  'application/vnd.google-apps.script',
  'application/vnd.google-apps.script+json',
  'application/vnd.google-apps.script+text/plain',
  'application/vnd.google-apps.site',
  'application/vnd.google-apps.spreadsheet',
  'application/vnd.ms-excel',

  etc...
*/

all known types

exportTypes

Lists all the types that can be materialized (exported). It produces a map like the one below, showing each Google mimeType and the formats it can be exported as.

console.log(rf.exportTypes)

/*
{ 'application/vnd.google-apps.document': 
   [ 'application/rtf',
     'application/vnd.oasis.opendocument.text',
     'text/html',
     'application/pdf',
     'application/epub+zip',
     'application/zip',
     'application/vnd.openxmlformats-officedocument.wordprocessingml.document',
     'text/plain' ],
  'application/vnd.google-apps.spreadsheet': 
   [ 'application/x-vnd.oasis.opendocument.spreadsheet',
     'text/tab-separated-values',
     'application/pdf',
     'application/vnd.openxmlformats-officedocument.spreadsheetml.sheet',
     'text/csv',
     'application/zip',
     'application/vnd.oasis.opendocument.spreadsheet' ],
  'application/vnd.google-apps.jam': [ 'application/pdf' ],
  'application/vnd.google-apps.script': [ 'application/vnd.google-apps.script+json' ],
  'application/vnd.google-apps.presentation': 
   [ 'application/vnd.oasis.opendocument.presentation',
     'application/pdf',
     'application/vnd.openxmlformats-officedocument.presentationml.presentation',
     'text/plain' ],
etc..
*/

exportTypes

importTypes

Gets a list of all the mimeTypes that can be imported into a Google document type, as below.

console.log(rf.importTypes)
/*
{ 'application/x-vnd.oasis.opendocument.presentation': [ 'application/vnd.google-apps.presentation' ],
  'text/tab-separated-values': [ 'application/vnd.google-apps.spreadsheet' ],
  'application/vnd.ms-excel.sheet.macroenabled.12': [ 'application/vnd.google-apps.spreadsheet' ],
  'application/vnd.openxmlformats-officedocument.wordprocessingml.template': [ 'application/vnd.google-apps.document' ],
  'application/vnd.ms-powerpoint.presentation.macroenabled.12': [ 'application/vnd.google-apps.presentation' ],
  'application/vnd.ms-word.template.macroenabled.12': [ 'application/vnd.google-apps.document' ],
  'application/vnd.openxmlformats-officedocument.wordprocessingml.document': [ 'application/vnd.google-apps.document' ],
  ...etc
  */

importTypes
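Since the import map is keyed by source type, answering the reverse question ("what can I turn into a Google Doc?") means scanning it. Here is a sketch using an extract of the output above as sample data. The helper importableInto is hypothetical, and in a real script you would pass it rf.importTypes rather than a literal.

```javascript
// an extract of the importTypes output shown above
const importTypes = {
  'application/x-vnd.oasis.opendocument.presentation': ['application/vnd.google-apps.presentation'],
  'text/tab-separated-values': ['application/vnd.google-apps.spreadsheet'],
  'application/vnd.openxmlformats-officedocument.wordprocessingml.template': ['application/vnd.google-apps.document'],
  'application/vnd.ms-word.template.macroenabled.12': ['application/vnd.google-apps.document'],
  'application/vnd.openxmlformats-officedocument.wordprocessingml.document': ['application/vnd.google-apps.document']
}

// hypothetical helper: list the source types that can be imported into a given Google type
const importableInto = (table, googleType) => Object.keys(table)
  .filter(from => table[from].includes(googleType))

const intoDocs = importableInto(importTypes, 'application/vnd.google-apps.document')
console.log(intoDocs)
```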

find

Finds a route between 2 mimeTypes. Here are a few examples, showing both single and multiple steps.

console.log(rf.find ({
  from: 'application/vnd.openxmlformats-officedocument.presentationml.slideshow',
  to: 'application/vnd.google-apps.presentation'
}))
/*
[ { from: 'application/vnd.openxmlformats-officedocument.presentationml.slideshow',
    to: 'application/vnd.google-apps.presentation',
    action: 'import' } ]
*/

  console.log(rf.find({
    from: 'image/bmp',
    to: 'image/png',
  }))
/*
[ { from: 'image/bmp', action: 'blobvert', to: 'image/png' } ]
*/

  console.log(rf.find({
    from: 'application/vnd.openxmlformats-officedocument.wordprocessingml.document',
    to: 'application/pdf',
  }))

/*
[ { from: 'application/vnd.openxmlformats-officedocument.wordprocessingml.document',
    to: 'application/vnd.google-apps.document',
    action: 'import' },
  { from: 'application/vnd.google-apps.document',
    action: 'export',
    to: 'application/pdf' } ]
*/

find
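When logging, a one-line description of a route can be easier to scan than the raw array. The sketch below formats a route of the shape shown above. describeRoute is a hypothetical helper, and since I have not shown what find returns when no route exists, the handling of an empty or missing route is an assumption.

```javascript
// the word -> pdf route from the example above
const route = [
  {
    from: 'application/vnd.openxmlformats-officedocument.wordprocessingml.document',
    to: 'application/vnd.google-apps.document',
    action: 'import'
  },
  { from: 'application/vnd.google-apps.document', action: 'export', to: 'application/pdf' }
]

// hypothetical helper: render a route as "from -> (action) to -> (action) to"
const describeRoute = (steps) => {
  // assumption: treat a missing or empty route as "no route"
  if (!steps || !steps.length) return 'no route'
  return [steps[0].from, ...steps.map(step => `(${step.action}) ${step.to}`)].join(' -> ')
}

console.log(describeRoute(route))
console.log(`steps: ${route.length}`)
```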

More about drv

There’s not too much more to say about the converter, so let’s look a little closer at the drv class.

Parameters

Most of the classes in the bmApiCentral library (used by bmWeConvertAnyFile) share some standard boolean parameters:

noCache – whether to disable caching
noisy – whether to log the URLs being fetched and whether the result came from cache
throwOnError – whether to throw an error if one is detected
throwOnWarning – whether to throw an error if a minor one is detected
loggy – whether to log interesting events

I usually set these in a single object – very noisy while testing, but quieter when deployed.

  const apiParams = {
    noisy: true,
    noCache: true,
    throwOnError: true,
    throwOnWarning: false,
    loggy: true
  }

apiParams

Then make calls to all api methods like this. You can of course specify any of these parameters individually if you prefer.

Exports.AppStore.convert({ 
  ...apiParams, 
  file, 
  to: target, 
  outputFolder 
})

using apiParams
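One way to switch between the noisy testing settings and the quieter deployed ones is a small factory. This is only a sketch: makeApiParams is a hypothetical helper, and the deployed values it picks (caching on, logging off) are my assumption rather than library defaults.

```javascript
// hypothetical factory: noisy settings while testing, quieter ones when deployed
const makeApiParams = ({ testing = false } = {}) => ({
  noisy: testing,
  noCache: testing,
  throwOnError: true,
  throwOnWarning: false,
  loggy: testing
})

const deployedParams = makeApiParams()
const testingParams = makeApiParams({ testing: true })

// individual parameters can still be overridden with a later spread property
const quietTestParams = { ...testingParams, noisy: false }

console.log(deployedParams.noisy, testingParams.noisy, quietTestParams.noisy)
```

The resulting object is spread into a call in the same way as apiParams above.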

Using paths

Drv accepts paths and ids, so to get files of a given name in a folder selected by its path, you can do this

    const wordsToPdf = drv.getFiles({
      path: '/routefinder', 
      name: 'file-sample_100kB.docx' 
    })
      .data
      .files
      .map(file => Exports.AppStore.convert({ 
        outputFolder, 
        to: 'application/pdf', 
        file 
       }))

folders by path

Links

Example test files – shared drive folder

bmWeConvertAnyFile – 1dZjyNnLCMS_2oEcFRjOf9oYu0qNHkfkfQovycSytn1FhsABTXy0Wnp4z (github)

bmApiCentral – 1L4pGblikbjQLQp8nQCdmCfyCxmF3MIShzsK8yy_mJ9_2YMdanXQA75vI (github)

apiCentral is a lite (Google APIs only) version of SuperFetch (see “SuperFetch caching: How does it work?”), which contains API wrappers for a mix of APIs as well as some more esoteric capabilities. apiCentral uses all the same caching techniques on behalf of the conversion library should you need them. Caching can be turned off by passing the noCache: true parameter to the convert method.

Desktop Liberation

The definitive resource for Google Apps Script and Microsoft Office automation
