In Apps Script V8: Arraybuffers and Typed arrays we had a first look at how, in V8, you can process and map binary data to JavaScript types for easy access. In Apps Script V8: Multiple script files, classes and namespaces, we also looked at V8 classes. This article digs a little deeper: it looks at how to extend classes, and how to apply that to handling data structures presented as an array of bytes, as you’d have to do when dealing with binary data from a file.
Motivation
Image files are a good example of complicated structures you might need to dig into, so let’s create a standard image class, with extensions for JPG, GIF, PNG and BMP variants. From that, we’ll be able to extract the width and height. (A much simpler way of doing this would be to look at the imageMediaMetadata property of the response from a Drive API query, but there’s really no fun in that.)
Why you might need this
If you are dealing with binary files or streaming, ArrayBuffers and Typed arrays complement the Apps Script Blob utilities.
The base class
Each of our image types follows the same principle, but the details vary. We’ll use the idea of a base class (AImg), which will be common across all image types, and a class extension tweaking the specifics for each image type. Here’s the constructor.
class AImg {
  constructor (bytes) {
    // confirm endianness
    this._littleEndian = new Uint8Array(new Uint16Array([0xbbee]).buffer)[0] === 0xee
    if (!this._littleEndian) {
      throw 'woah - apps script should be little endian!!'
    }
    // create a buffer and populate it with the initial bytes
    this._buffer = new Uint8Array(bytes).buffer
    this._type = 'unknown'
    // use this view to extract numbers
    this._view = new DataView(this._buffer)
    // which are generally 16 bit ones
    this._bits = 16
  }
  // the getters and methods are shown with the complete class later on
}
There are a few things to expand upon here.
Endianness
Machine architectures, especially in the early days, varied in the way they stored numbers. Those of you who grew up having to juggle between mainframes, minis and then Intel will probably be familiar with the complication of the byte order of numbers, but nowadays it’s pretty standard. However, some of these image file formats were created a long time ago, and the order in which they store bytes reflects the machines their creators were using at the time. This is a little problem you have to be aware of when dealing with binary data.
See if you can figure out what this is doing.
this._littleEndian = new Uint8Array(new Uint16Array([0xbbee]).buffer)[0] === 0xee
if (!this._littleEndian) {
  throw 'woah - apps script should be little endian!!'
}
The two main types of ‘endianness’ (there used to be others) are ‘big endian’ and ‘little endian’. The terms refer to whether the most significant byte comes first (big endian) or last (little endian).
Here’s how to test.
- Create an array with a single 2-byte (16-bit) number and get that as a buffer
new Uint16Array([0xbbee]).buffer
- Convert that buffer to an array of single bytes, and take the first element
new Uint8Array(new Uint16Array([0xbbee]).buffer)[0]
- If the value of that first byte matches the least significant part of the 16-bit number we first thought of, then the machine architecture is little endian.
this._littleEndian = new Uint8Array(new Uint16Array([0xbbee]).buffer)[0] === 0xee
if (!this._littleEndian) {
  throw 'woah - apps script should be little endian!!'
}
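To see the byte order for yourself, here is a minimal standalone sketch. The helper name isLittleEndian and the sample value 0x0a0b0c0d are just illustrations and are not part of the AImg class. It applies the same test, then prints the bytes of a 32-bit number in the order the host actually stores them.

```javascript
// Hypothetical helper: true when the host stores the least significant byte first
const isLittleEndian = () =>
  new Uint8Array(new Uint16Array([0xbbee]).buffer)[0] === 0xee

// Take one 32-bit number and look at its 4 underlying bytes in storage order
const hostBytes = Array.from(new Uint8Array(new Uint32Array([0x0a0b0c0d]).buffer))
  .map(b => b.toString(16).padStart(2, '0'))
  .join(' ')

// On a little endian host the least significant byte (0d) is listed first
console.log(isLittleEndian(), hostBytes)
```

On a little endian host the bytes come out in the reverse of the order you would write them in hex.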
Dataviews
In Apps Script V8: Arraybuffers and Typed arrays I used regular typed array syntax like
this._id = new Int32Array(this._buffer, 0 ,1)
to map buffer offsets to types of data. But given that we now have to deal with the endianness of the binary data potentially being different from that of the machine architecture processing it, we need another mechanism. That’s where data views come in. A data view is defined like this, and provides a ‘window’ onto the buffer from which data can be extracted as various types.
this._view = new DataView(this._buffer)
Extracting a 32-bit number from a buffer, taking account of endianness
getValue (offset) { return this.view.getUint32(offset, this._littleEndian ) }
Since the size of numbers will vary between file types, we can generalize this a bit with
getValue (offset) { return this.view[`getUint${this._bits}`](offset, this._littleEndian ) }
Where this._bits will be set for each image type.
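As a quick illustration of why the endianness argument matters, here is a small sketch outside the class. The function name readUint and the sample bytes are assumptions made up for this example. It reads the same 4 bytes in both byte orders, using the same computed method name trick as getValue.

```javascript
// 4 hand-written bytes holding the value 640 (0x00000280) in big endian order
const buffer = new Uint8Array([0x00, 0x00, 0x02, 0x80]).buffer
const view = new DataView(buffer)

// Hypothetical standalone version of getValue: bits selects getUint16 or getUint32
const readUint = (dataView, offset, bits, littleEndian) =>
  dataView[`getUint${bits}`](offset, littleEndian)

const big32 = readUint(view, 0, 32, false)    // bytes read most significant first
const little32 = readUint(view, 0, 32, true)  // same bytes, least significant first
const big16 = readUint(view, 2, 16, false)    // just the last 2 bytes

console.log(big32, little32.toString(16), big16)
// 640 '80020000' 640
```

Reading with the wrong byte order doesn’t fail; it quietly returns a different number, which is why each image class has to state its own endianness.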
Getters and methods
Here’s the complete base class with its getters and methods.
class AImg {
  constructor (bytes) {
    // confirm endianness
    this._littleEndian = new Uint8Array(new Uint16Array([0xbbee]).buffer)[0] === 0xee
    if (!this._littleEndian) {
      throw 'woah - apps script should be little endian!!'
    }
    // create a buffer and populate it with the initial bytes
    this._buffer = new Uint8Array(bytes).buffer
    this._type = 'unknown'
    // use this view to extract numbers
    this._view = new DataView(this._buffer)
    // which are generally 16 bit ones
    this._bits = 16
  }

  // convert bytes to a string
  bufferToString (buf) {
    const b = new Uint8Array(buf)
    // terminated by \0 if it's shorter than the length of the allocated buffer
    const len = b.indexOf(0)
    return String.fromCharCode.apply(null, b.slice(0, len < 0 ? b.length : len))
  }

  // convert to hex and leading '0' fill
  bufferToHex (buf) {
    return Array.from(buf).map(f => f.toString(16).padStart(2, '0')).join('')
  }

  // return the current buffer as an array of bytes
  get bytes () {
    return Array.from(new Uint8Array(this.buffer))
  }

  get buffer () {
    return this._buffer
  }

  // the version is the file identifier
  get version () {
    return this.bufferToString(this._version)
  }

  // the view can be reused to overlay the buffer for different types
  get view () {
    return this._view
  }

  // use this to extract a number
  getValue (offset) {
    return this.view[`getUint${this._bits}`](offset, this._littleEndian)
  }

  get width () {
    return this.getValue(this._widthOffset)
  }

  get height () {
    return this.getValue(this._heightOffset)
  }

  get type () {
    return this._type
  }

  // validate that file ident is indeed what is expected for that type of file
  checkType ({ident, type, value = null, hex = true}) {
    // convenience to allow ident to be supplied in either hex or string
    // normally we're checking against the version, but this is to enable ad hoc checks
    if (hex) {
      value = this.bufferToHex(value || this._version)
    } else {
      // this.version is already a string so no need to convert it
      value = value || this.version
    }
    if (ident !== value) {
      throw `${value} is not ${ident} (${type})`
    } else {
      this._type = type
    }
  }
}
checkType
These image files are generally identified by a signature of some kind. The checkType method will validate that the signature (usually mapped to this._version) is indeed the expected one.
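Stripped of the class plumbing, the signature check boils down to something like the sketch below. The names toHex and hasSignature, and the hand-written sample bytes, are hypothetical and only here for illustration.

```javascript
// Convert bytes to a hex string, two digits per byte
const toHex = (bytes) =>
  Array.from(bytes).map(b => b.toString(16).padStart(2, '0')).join('')

// Hypothetical standalone check: does the data start with the expected signature?
const hasSignature = (bytes, identHex) => {
  // two hex digits describe one byte
  const length = identHex.length / 2
  return toHex(new Uint8Array(bytes).slice(0, length)) === identHex
}

// Hand-written starts of a PNG and a BMP file
const pngStart = [0x89, 0x50, 0x4e, 0x47, 0x0d, 0x0a, 0x1a, 0x0a, 0x00]
const bmpStart = [0x42, 0x4d, 0x00, 0x00]

console.log(hasSignature(pngStart, '89504e470d0a1a0a')) // true
console.log(hasSignature(bmpStart, '89504e470d0a1a0a')) // false
console.log(hasSignature(bmpStart, '424d'))             // true
```

The class version does the same comparison, but throws when it fails and records the type when it succeeds.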
Class extensions
Extending a class means taking one already defined class, and making a new class, adding new stuff and/or overriding existing properties or methods in the original. We’ll create 4 new classes: ABmp, AGif, APng and AJpg.
Mainly, we just need a constructor which sets up the parameters that differ between image types. These are generally the offsets at which to find the width and height.
Note the syntax for defining an extension, and also that the constructor of an extension first needs to call super (args). This executes the constructor of the base class it’s based on before continuing on with its own constructor tasks. It’s important that you always do this.
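Here is a cut-down sketch of the pattern, using made-up class names (BaseReader, WideReader and BrokenReader are not part of the image classes). It shows an extension overriding a setting from the base constructor, and what happens if the call to super is left out.

```javascript
class BaseReader {
  constructor (bytes) {
    this._view = new DataView(new Uint8Array(bytes).buffer)
    // the default number size
    this._bits = 16
  }
  read (offset) {
    return this._view[`getUint${this._bits}`](offset, true)
  }
}

class WideReader extends BaseReader {
  constructor (bytes) {
    // run the base constructor first, then tweak what differs
    super(bytes)
    this._bits = 32
  }
}

class BrokenReader extends BaseReader {
  constructor (bytes) {
    // no super() call, so 'this' is not yet usable here
    this._bits = 32
  }
}

const sampleBytes = [0x01, 0x00, 0x01, 0x00]
const narrow = new BaseReader(sampleBytes).read(0) // 1
const wide = new WideReader(sampleBytes).read(0)   // 65537

let failure = 'none'
try {
  new BrokenReader(sampleBytes)
} catch (err) {
  failure = err.name
}
console.log(narrow, wide, failure) // 1 65537 ReferenceError
```

The read method is inherited unchanged; only the constructor differs, which is the same approach the image classes take.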
ABmp
class ABmp extends AImg {
  constructor (bytes) {
    super (bytes)
    this._version = new Uint8Array(this.buffer, 0, 2)
    this._widthOffset = 18
    this._heightOffset = 22
    this.checkType({
      ident: '424d',
      type: 'bmp'
    })
  }
}
AGif
class AGif extends AImg {
  constructor (bytes) {
    super (bytes)
    // specific to a GIF
    this._version = new Uint8Array(this.buffer, 0, 6)
    this._widthOffset = 6
    this._heightOffset = 8
    this.checkType({
      ident: 'GIF89a',
      type: 'gif',
      hex: false
    })
  }
}
APng
The PNG file has a couple of specifics:
- it’s big endian
- the numbers are 32 bit rather than 16 bit
class APng extends AImg {
  constructor (bytes) {
    super (bytes)
    // starts with a PNG signature
    this._version = new Uint8Array(this.buffer, 0, 8)
    // PNG is big endian, so we need to use a view to swap the bytes
    this._littleEndian = false
    this._widthOffset = 16
    this._heightOffset = 20
    this._bits = 32
    // this is the signature of a png file
    this.checkType({
      ident: '89504e470d0a1a0a',
      type: 'png'
    })
  }
}
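To show where offsets 16 and 20 come from, here is a sketch that reads the dimensions straight from the first 24 bytes of a PNG. The bytes are hand-built for illustration (they describe a 280 x 280 image) and are not taken from a real file.

```javascript
// Hand-built start of a PNG: signature, then the IHDR chunk header and dimensions
const pngHeader = new Uint8Array([
  0x89, 0x50, 0x4e, 0x47, 0x0d, 0x0a, 0x1a, 0x0a, // 8 byte signature
  0x00, 0x00, 0x00, 0x0d,                         // IHDR data length (13)
  0x49, 0x48, 0x44, 0x52,                         // chunk type 'IHDR'
  0x00, 0x00, 0x01, 0x18,                         // width at offset 16
  0x00, 0x00, 0x01, 0x18                          // height at offset 20
])

const view = new DataView(pngHeader.buffer)

// PNG numbers are 32 bit and big endian, so littleEndian is false
const width = view.getUint32(16, false)
const height = view.getUint32(20, false)

// reading with the host's little endian order gives nonsense
const wrongWidth = view.getUint32(16, true)

console.log(width, height, wrongWidth.toString(16)) // 280 280 '18010000'
```

The 8 byte signature, the 4 byte chunk length and the 4 byte chunk type account for the first 16 bytes, which is why the width starts at offset 16.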
AJpg
The JPG is the most complex of the group, as it’s divided into blocks, with each block holding information about what it is and how big it is. This means that we have to skip through the file looking for the block (known as SOF0) that contains the width and height. It’s also big endian. Like the other classes, the objective is to find the offset of the width and height so we can use a data view to extract those values.
class AJpg extends AImg {
  constructor (bytes) {
    super (bytes)
    this._littleEndian = false
    const type = 'jpg'
    const data = new Uint8Array(this.buffer)
    // first find the position of the SOI marker
    let blockIndex = data.findIndex((f, i, a) => f === 0xff && a[i + 1] === 0xd8)
    if (blockIndex === -1) throw new Error('couldnt find SOI in jpg file')
    // check the type
    const jfifIndex = blockIndex + 6
    const jfif = this.bufferToString(data.slice(jfifIndex, jfifIndex + 4))
    this.checkType({
      ident: 'JFIF',
      type,
      hex: false,
      value: jfif
    })
    // skip the SOI as it's a different format than the rest
    blockIndex += (this.view.getUint16(blockIndex + 4, this._littleEndian) + 4)
    // sure we have a jpeg now
    // we need to keep skipping frames till we find the SOF0 frame, identified by
    const sof0Marker = 0xc0
    while (blockIndex < data.length) {
      const blockSize = this.view.getUint16(blockIndex + 2, this._littleEndian) + 2
      const startMarker = this.view.getUint8(blockIndex)
      const marker = this.view.getUint8(blockIndex + 1)
      if (startMarker !== 0xff) {
        throw new Error(`expected a block marker ff but found ${startMarker.toString(16)}`)
      }
      if (marker === sof0Marker) {
        // SOF0 layout: marker (2), length (2), precision (1), height (2), width (2)
        this._heightOffset = blockIndex + 5
        this._widthOffset = blockIndex + 7
        // force an exit
        blockIndex = data.length + 1
      } else {
        // skip this block
        blockIndex += blockSize
      }
    }
  }
}
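The block skipping is easier to follow on a tiny example. The sketch below is a simplified standalone walk, not the class itself: the function name findSof0 is hypothetical, the sample bytes are hand-built rather than a real image, and it skips the JFIF check. Note that in a SOF0 block the height field comes before the width field.

```javascript
// Hypothetical helper: walk the JPEG blocks and return the SOF0 dimensions
const findSof0 = (bytes) => {
  const data = new Uint8Array(bytes)
  const view = new DataView(data.buffer)
  if (data[0] !== 0xff || data[1] !== 0xd8) throw new Error('missing SOI marker')
  // the first block starts right after the 2 byte SOI
  let index = 2
  while (index + 4 <= data.length) {
    if (data[index] !== 0xff) throw new Error(`expected ff at offset ${index}`)
    const marker = data[index + 1]
    // block length is big endian; it counts its own 2 bytes but not the marker
    const length = view.getUint16(index + 2, false)
    if (marker === 0xc0) {
      // after marker and length: precision (1), height (2), width (2)
      return {
        height: view.getUint16(index + 5, false),
        width: view.getUint16(index + 7, false)
      }
    }
    // skip the marker (2) plus the block
    index += length + 2
  }
  throw new Error('no SOF0 block found')
}

// Hand-built bytes: SOI, a stub APP0 block, then SOF0 for a 640 wide, 407 high image
const sample = [
  0xff, 0xd8,
  0xff, 0xe0, 0x00, 0x04, 0x00, 0x00,
  0xff, 0xc0, 0x00, 0x0b, 0x08, 0x01, 0x97, 0x02, 0x80, 0x01, 0x00
]

const size = findSof0(sample)
console.log(size.width, size.height) // 640 407
```

Returning as soon as SOF0 is found avoids having to force the loop to exit, and throwing when nothing is found makes a missing block obvious straight away.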
Using the classes
That’s the hard bit done. Now it’s very simple to use them to get the width and height from image files. Here’s a selector that will pick the correct class depending on the MIME type of a blob.
const getImg = (blob) => {
  const type = blob.getContentType()
  const bytes = blob.getBytes()
  switch (type) {
    case 'image/gif':
      return new AGif(bytes)
    case 'image/bmp':
      return new ABmp(bytes)
    case 'image/jpeg':
    case 'image/jpg':
      return new AJpg(bytes)
    case 'image/png':
      return new APng(bytes)
    default:
      throw new Error(`unknown type ${type}`)
  }
}
Get a file from Drive and pass its blob to the selector
const getFileInfo = (id) => {
  const file = DriveApp.getFileById(id)
  const blob = file.getBlob()
  return {
    file,
    img: getImg(blob)
  }
}
Finally, an example for each
const getPng = () => {
  const {img} = getFileInfo('0B92ExLh4POiZSEl6d3lDc2xWSnc')
  console.log(`${img.width} ${img.height} ${img.type}`)
}
// 280 280 png

const getJpg = () => {
  const {img} = getFileInfo('0B92ExLh4POiZV2QzQVEzUTRIOWc')
  console.log(`${img.width} ${img.height} ${img.type}`)
}
// 407 640 jpg

const getGif = () => {
  const {img} = getFileInfo('1qOvw2VKX7CjQy7PmerbPC55sZrCgv3s9')
  console.log(`${img.width} ${img.height} ${img.type}`)
}
// 168 160 gif

const getBmp = () => {
  const {img} = getFileInfo('1-gfFSwZgcl7xowKjih9jfERquYS_yIiL')
  console.log(`${img.width} ${img.height} ${img.type}`)
}
// 300 150 bmp
Summary
In an ideal world, a couple of simple setters would be all that’s needed to update the width and height to rescale the file, but sadly it’s not as simple as that with image files. There are already ways of doing that using Apps Script APIs and libraries, so that’s for another day and another article.