In Apps Script V8: Arraybuffers and Typed arrays we had a first look at how, in V8, you can process and map binary data to JavaScript types for easy access. In Apps Script V8: Multiple script files, classes and namespaces, we also looked at V8 classes. This article digs a little deeper and looks at how to extend classes, and how to apply that to handling data structures presented as an array of bytes, as you’d have to do when dealing with binary data from a file.

Motivation

Image files are a good example of the complicated structures you might need to dig into, so let’s create a standard image class, with extensions for the jpg, gif, png and bmp variants. From that we’ll be able to extract the width and height. (A much simpler way of doing this would be to look at the imageMediaMetadata property of the response from a Drive API query, but there’s really no fun in that.)

Why you might need this

If you are dealing with binary files or streaming, ArrayBuffers and typed arrays complement the Apps Script Blob utilities.

The base class

Each of our image types follows the same principle, but the details vary. We’ll use the idea of a base class (AImg), which will be common across all image types, and a class extension tweaking the specifics for each image type. Here’s the constructor.
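
As a rough sketch of what such a constructor could look like: the class name AImg comes from this article, but the property names (_bytes, _view, _littleEndian, _widthOffset, _heightOffset, _signatures) are assumptions made for illustration, as is the choice to pass in a blob. The fakeBlob object simply stands in for an Apps Script Blob so the snippet can run anywhere.

```javascript
class AImg {
  constructor(blob) {
    // Blob.getBytes() returns signed bytes (-128..127);
    // copying them into a Uint8Array stores them as 0..255.
    this._bytes = Uint8Array.from(blob.getBytes());
    // A DataView reads numbers at any offset with an explicit byte order.
    this._view = new DataView(this._bytes.buffer);
    // Defaults (assumed): each image type overrides what it needs.
    this._littleEndian = true;
    this._bits = 16;
    this._widthOffset = 0;
    this._heightOffset = 0;
    this._signatures = [];
    this._version = '';
  }
}

// Stand-in for an Apps Script Blob (hypothetical test data).
const fakeBlob = { getBytes: () => [71, 73, 70, -1] };
const img = new AImg(fakeBlob);
```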

There are a few things to expand upon here.

Endianness

Machine architectures (especially in the early days) varied in the way that numbers are stored. Those of you who grew up having to juggle between mainframes, minis and then Intel will probably be familiar with the complication of the byte order of numbers, but nowadays it’s pretty standard. However, some of these image file formats were created a long time ago, and the order in which they store bytes internally reflects the machines their creators were using at the time. This is a little problem you have to be aware of when dealing with binary data.

See if you can figure out what this is doing.
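
A minimal sketch of such a check, assuming a helper called isLittleEndian (a name chosen here for illustration), could be:

```javascript
function isLittleEndian() {
  // One 16-bit number whose two bytes are easy to tell apart.
  const probe = new Uint16Array([0x0102]);
  // Look at the same memory one byte at a time.
  const firstByte = new Uint8Array(probe.buffer)[0];
  // True when the low-order byte is the one stored first.
  return firstByte === 0x02;
}

const machineIsLittleEndian = isLittleEndian();
```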

The two main types of ‘endianness’ (there used to be others) are ‘big endian’ and ‘little endian’. The term refers to whether the most significant byte comes first (big endian) or last (little endian).

Here’s how to test.

  1. Create an array with a single 2-byte (16-bit) number and get that as a buffer.
  2. Convert that buffer to a 1-byte array, and take the first element.
  3. If the value of that first byte matches the least significant part of the 16-bit number we first thought of, then the machine architecture is little endian.
     

Dataviews

In Apps Script V8: Arraybuffers and Typed arrays I used regular ArrayBuffer syntax to map buffer offsets to types of data. Given that we now have to deal with the endianness of the binary data potentially being different from that of the machine architecture processing it, we need another mechanism. That’s where DataViews come in. A DataView provides a ‘window’ onto the buffer from which data can be extracted as various types.
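
To make the difference concrete, here is a small sketch contrasting the two approaches. The sample bytes are made up for illustration (the first 8 bytes of a hypothetical GIF header). A typed array always reads in the machine’s own byte order, whereas a DataView takes the byte order as an argument on each read.

```javascript
// 'GIF89a' followed by the 16-bit number 300 stored low byte first.
const buffer = Uint8Array.from([0x47, 0x49, 0x46, 0x38, 0x39, 0x61, 0x2c, 0x01]).buffer;

// Typed array: one 16-bit element starting at byte offset 6.
// The value read here depends on the machine's own byte order.
const typed = new Uint16Array(buffer, 6, 1);

// DataView over the whole buffer.
const view = new DataView(buffer);

// DataView as a narrower window: 2 bytes starting at byte offset 6.
const windowed = new DataView(buffer, 6, 2);

// Offsets are relative to the start of the window; true means little endian.
const screenWidth = windowed.getUint16(0, true);
```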

Extracting a 32 bit number from a buffer, taking account of endianness
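
A short sketch with made-up sample bytes: the second argument to getUint32 is true for little endian and false (or omitted) for big endian, so the same four bytes give very different numbers depending on which you ask for.

```javascript
// The number 300 stored as 4 bytes, most significant byte first.
const view = new DataView(Uint8Array.from([0x00, 0x00, 0x01, 0x2c]).buffer);

// Read with the byte order the data was written in.
const bigEndianValue = view.getUint32(0, false);

// Read the same bytes with the wrong byte order.
const littleEndianValue = view.getUint32(0, true);
```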

Since the size of numbers will vary between file types, we can generalize this a bit with

Where this._bits will be set for each image type.
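
One way this generalization might be written is to build the DataView method name from the number of bits. This is a sketch rather than the original listing: getNumber is a hypothetical standalone helper, and inside the class the bits argument would be this._bits.

```javascript
function getNumber(view, offset, bits, littleEndian) {
  // DataView has getUint8, getUint16 and getUint32.
  if (![8, 16, 32].includes(bits)) {
    throw new Error(`Unsupported number size: ${bits} bits`);
  }
  // Pick the getter by name, for example getUint16 when bits is 16.
  return view[`getUint${bits}`](offset, littleEndian);
}

// The number 300 stored low byte first, padded out to 4 bytes.
const view = new DataView(Uint8Array.from([0x2c, 0x01, 0x00, 0x00]).buffer);
const as16 = getNumber(view, 0, 16, true);
const as32 = getNumber(view, 0, 32, true);
```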

Getters and methods

Here’s the complete base class with its getters and methods.
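
The following is a sketch of how such a base class might be put together, not a reproduction of the original. Only AImg, checkType, this._bits and the idea of a version signature come from the article. Everything else (getNumber, getString, size, and the underscore-prefixed property names) is assumed for illustration.

```javascript
class AImg {
  constructor(blob) {
    // Blob.getBytes() returns signed bytes; Uint8Array stores them as 0..255.
    this._bytes = Uint8Array.from(blob.getBytes());
    this._view = new DataView(this._bytes.buffer);
    // Defaults that each image type can override.
    this._littleEndian = true;
    this._bits = 16;
    this._widthOffset = 0;
    this._heightOffset = 0;
    this._signatures = []; // signatures this image type accepts
    this._version = '';    // signature actually found in the file
  }

  // Read an unsigned number of this._bits bits at the given byte offset.
  getNumber(offset) {
    return this._view[`getUint${this._bits}`](offset, this._littleEndian);
  }

  // Read `length` bytes as a string of single-byte characters.
  getString(offset, length) {
    return String.fromCharCode(...this._bytes.subarray(offset, offset + length));
  }

  get version() {
    return this._version;
  }

  get width() {
    return this.getNumber(this._widthOffset);
  }

  get height() {
    return this.getNumber(this._heightOffset);
  }

  get size() {
    return { width: this.width, height: this.height };
  }

  // Throw if the file's signature is not one this image type expects.
  checkType() {
    if (!this._signatures.includes(this._version)) {
      throw new Error(`Unexpected signature: ${JSON.stringify(this._version)}`);
    }
    return this;
  }
}
```

Returning this from checkType is a design choice in this sketch so that construction and validation can be chained.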

Checktype

These image files are generally identified by a signature of some kind. The checkType method will validate that the signature (usually mapped to this._version) is indeed the expected one.

Class extensions

Extending a class means taking one already defined class, and making a new class, adding new stuff and/or overriding existing properties or methods in the original. We’ll create 4 new classes

ABmp,  AGif, APng and AJpg

Mainly, we just need a constructor which sets up the parameters that differ between image types. These will generally refer to the offset to find the height and width.

Note the syntax for defining an extension, and also that the constructor of an extension first needs to call super(args). This executes the constructor of the base class before continuing with its own constructor tasks. It’s important that you always do this, since this cannot be used in the extension’s constructor until super has been called.

ABmp
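
Here is a sketch of a BMP extension that shows the extends and super syntax. It assumes the common BITMAPINFOHEADER layout, where width and height are 32-bit little-endian values at byte offsets 18 and 22, and where a negative height marks a top-down bitmap. The AImg class below is a cut-down stand-in so the snippet runs on its own, and its property names are assumptions.

```javascript
// Cut-down stand-in for the base class (assumed shape).
class AImg {
  constructor(blob) {
    this._bytes = Uint8Array.from(blob.getBytes());
    this._view = new DataView(this._bytes.buffer);
    this._littleEndian = true;
    this._bits = 16;
  }
  getNumber(offset) {
    return this._view[`getUint${this._bits}`](offset, this._littleEndian);
  }
  getString(offset, length) {
    return String.fromCharCode(...this._bytes.subarray(offset, offset + length));
  }
  get width() { return this.getNumber(this._widthOffset); }
  get height() { return this.getNumber(this._heightOffset); }
  checkType() {
    if (!this._signatures.includes(this._version)) throw new Error('Unexpected signature');
    return this;
  }
}

class ABmp extends AImg {
  constructor(blob) {
    super(blob); // run the base constructor before touching `this`
    this._signatures = ['BM'];
    this._version = this.getString(0, 2);
    this._bits = 32;
    this._widthOffset = 18;
    this._heightOffset = 22;
  }

  // Override the base getter: BMP height is signed, so read it as
  // a signed number and drop the sign.
  get height() {
    return Math.abs(this._view.getInt32(this._heightOffset, this._littleEndian));
  }
}
```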

AGif
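
A GIF extension could be sketched like this. A GIF file starts with the signature GIF87a or GIF89a, followed by the logical screen width and height as 16-bit little-endian numbers at byte offsets 6 and 8, so the assumed base class defaults need no changes. As before, the AImg class here is a cut-down stand-in with assumed property names.

```javascript
// Cut-down stand-in for the base class (assumed shape).
class AImg {
  constructor(blob) {
    this._bytes = Uint8Array.from(blob.getBytes());
    this._view = new DataView(this._bytes.buffer);
    this._littleEndian = true;
    this._bits = 16;
  }
  getNumber(offset) {
    return this._view[`getUint${this._bits}`](offset, this._littleEndian);
  }
  getString(offset, length) {
    return String.fromCharCode(...this._bytes.subarray(offset, offset + length));
  }
  get width() { return this.getNumber(this._widthOffset); }
  get height() { return this.getNumber(this._heightOffset); }
  checkType() {
    if (!this._signatures.includes(this._version)) throw new Error('Unexpected signature');
    return this;
  }
}

class AGif extends AImg {
  constructor(blob) {
    super(blob);
    // Both GIF versions share the same header layout.
    this._signatures = ['GIF87a', 'GIF89a'];
    this._version = this.getString(0, 6);
    this._widthOffset = 6;
    this._heightOffset = 8;
  }
}
```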

APng

The png file has a couple of specifics

  • it’s big endian
  • the numbers are 32 bit rather than 16 bit
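
Those two specifics might translate into an extension like the sketch below. A PNG file starts with an 8-byte signature, and the IHDR chunk that follows puts the width at byte offset 16 and the height at byte offset 20. The AImg class is again a cut-down stand-in with assumed property names.

```javascript
// Cut-down stand-in for the base class (assumed shape).
class AImg {
  constructor(blob) {
    this._bytes = Uint8Array.from(blob.getBytes());
    this._view = new DataView(this._bytes.buffer);
    this._littleEndian = true;
    this._bits = 16;
  }
  getNumber(offset) {
    return this._view[`getUint${this._bits}`](offset, this._littleEndian);
  }
  getString(offset, length) {
    return String.fromCharCode(...this._bytes.subarray(offset, offset + length));
  }
  get width() { return this.getNumber(this._widthOffset); }
  get height() { return this.getNumber(this._heightOffset); }
  checkType() {
    if (!this._signatures.includes(this._version)) throw new Error('Unexpected signature');
    return this;
  }
}

class APng extends AImg {
  constructor(blob) {
    super(blob);
    // The 8-byte PNG signature: 0x89 'PNG' CR LF 0x1a LF.
    this._signatures = ['\x89PNG\r\n\x1a\n'];
    this._version = this.getString(0, 8);
    // The two PNG specifics: big endian and 32-bit numbers.
    this._littleEndian = false;
    this._bits = 32;
    this._widthOffset = 16;
    this._heightOffset = 20;
  }
}
```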

AJpg

The JPG is the most complex of the group, as it’s divided into blocks, with each block holding information about what it is and how big it is. This means that we have to skip through the file looking for the block (known as SOF0) that contains the width and height. It’s also big endian. Like the other classes, the objective is to find the offset of the width and height so we can use a DataView to extract those values.
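
The block-skipping search can be sketched as a standalone function. findSof0 is a hypothetical name, and this version only looks for the SOF0 marker (0xFFC0) named above, so files that store their dimensions in a different frame block would not be found. Within SOF0 the marker is followed by a 2-byte block length, a 1-byte precision, then the height and the width as 16-bit big-endian numbers.

```javascript
function findSof0(bytes) {
  const view = new DataView(bytes.buffer, bytes.byteOffset, bytes.byteLength);
  // Every JPEG starts with the SOI marker 0xFFD8.
  if (view.getUint16(0, false) !== 0xffd8) {
    throw new Error('Not a JPEG: missing SOI marker');
  }
  let offset = 2;
  while (offset + 4 <= view.byteLength) {
    if (view.getUint8(offset) !== 0xff) {
      throw new Error(`Expected a marker at offset ${offset}`);
    }
    const marker = view.getUint8(offset + 1);
    if (marker === 0xff) {
      offset += 1; // fill byte before the real marker
      continue;
    }
    // Start of scan or end of image: stop searching.
    if (marker === 0xda || marker === 0xd9) break;
    if (marker === 0xc0) {
      // Make sure the height and width bytes are actually present.
      if (offset + 9 > view.byteLength) break;
      return { heightOffset: offset + 5, widthOffset: offset + 7 };
    }
    // The block length counts its own 2 bytes but not the 2 marker bytes.
    offset += 2 + view.getUint16(offset + 2, false);
  }
  throw new Error('SOF0 block not found');
}
```

In an AJpg constructor, these offsets would be assigned after the call to super, with the byte order set to big endian and the number size left at 16 bits.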

Using the classes

That’s the hard bit done. Now it’s very simple to use them to get the width/height info from image files. Here’s a selector that will pick the correct class depending on the MIME type of a blob.
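
A selector along these lines might look like the sketch below. The function name getImage and the lookup table are assumptions, and the classes are empty placeholders so the snippet runs on its own. Blob.getContentType() is the Apps Script method that returns a blob’s MIME type.

```javascript
// Empty placeholders standing in for the real classes.
class AImg {
  constructor(blob) {
    this._blob = blob;
  }
}
class ABmp extends AImg {}
class AGif extends AImg {}
class AJpg extends AImg {}
class APng extends AImg {}

// MIME type to class lookup.
const IMAGE_CLASSES = new Map([
  ['image/bmp', ABmp],
  ['image/gif', AGif],
  ['image/jpeg', AJpg],
  ['image/png', APng],
]);

function getImage(blob) {
  const mimeType = blob.getContentType();
  const ImageClass = IMAGE_CLASSES.get(mimeType);
  if (!ImageClass) {
    throw new Error(`Unsupported image type: ${mimeType}`);
  }
  return new ImageClass(blob);
}
```

In Apps Script, the blob passed to the selector could come from DriveApp.getFileById(fileId).getBlob(), where fileId is a placeholder for the ID of one of your own image files.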

Get a file and call that

Finally, an example for each

Summary

In an ideal world a couple of simple setters would be all that’s needed to update the width and height to rescale the file, but sadly it’s not as simple as that with image files. There are already ways of doing that using Apps Script APIs and libraries, so that’s for another day and another article.