Google Cloud Storage as a cache platform

In Apps script library with plugins for multiple backend cache platforms I covered a way to get more data into the cache and property services, and the bmCrusher library came with built-in plugins for Drive, CacheService and PropertyService. Because these are designed as plugins, we can add any number of back ends as long as they present the same interface methods. Now bmCrusher also has a built-in plugin for Cloud Storage.

Benefits of using cloud storage

Aside from potentially being permanent, cloud storage lets you store much bigger volumes of data than the other platforms, and because compression and spreading the data across multiple physical items are inherited from the Crusher service, space used is minimized and upload size limitations are avoided. Most importantly, using cloud storage as the back end means you can share data across multiple projects.

How to use

Just as in the other examples, it starts with a store being passed to the crusher library to be initialized. The gcsStore is actually managed by another library, as described here, which presents the same methods as the CacheService, so we can simply reuse the built-in CacheService plugin directly. In any case, the entire plugin is implemented in bmCrusher, so you simply have to initialize it. This example uses a service account to access the storage bucket, but you can also use the ScriptApp access token if it's properly scoped. I'll cover both options later.


const goa = cGoa.make('gcs_cache', PropertiesService.getUserProperties())

// we can just re-use the cache plugin service as the cloud storage library has the same methods
const crusher = new bmCrusher.CrusherPluginGcsService().init({
  bucketName: 'bmcrusher-test-bucket-store',
  prefix: '/crusher/store/',
  tokenService: () => goa.getToken()
})
Initialize the crusher

From now on, the exact same methods introduced in Apps script library with plugins for multiple backend cache platforms will operate on Cloud Storage instead of the other platforms. Here's a refresher.

Writing

All writing is done with a put method.

crusher.put(key, data[,expiry])
write some data

put takes 3 arguments (there's a short sketch after this list)

  • key – a string with some key to store this data against
  • data – the data to store. It's automatically converted to and from objects, so there's no need to stringify anything.
  • expiry – optionally provide a number of seconds after which the data should expire.
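
For example, here's a short sketch. The key names and data are invented for illustration, and it assumes a crusher initialized as above.

// store an object for an hour - no need to stringify
crusher.put('user-prefs', { theme: 'dark', locale: 'en' }, 60 * 60)

// omitting the expiry stores the item permanently
crusher.put('user-theme', { theme: 'dark' })
writing with put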

Reading

const data = crusher.get(key)
retrieving data

get takes 1 argument

  • key – the string you put the data against. get will restore the data to its original state.

Removing

crusher.remove(key)
removing an item
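
Putting these together, here's a sketch of a round trip. Again the key is invented, and it assumes, as with the CacheService, that get returns null for a missing item.

// write an object
crusher.put('user-prefs', { theme: 'dark' })

// read it back - crusher restores the original object
const prefs = crusher.get('user-prefs')
Logger.log(prefs.theme) // dark

// remove it - a subsequent get should return null
crusher.remove('user-prefs')
Logger.log(crusher.get('user-prefs')) // null
a round trip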

Expiry

The expiry mechanism works exactly the same as on the other platforms. If an item has expired, then whether or not it still exists in storage, it behaves as if it doesn't exist, and accessing an expired item will delete it. Cloud storage also has lifecycle management: any item that has an expiry (the default is for items to be permanent) will be marked for lifecycle management by cloud storage and will eventually be deleted, up to a day after it expires.
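
Here's a quick sketch of that behavior; the 5 second expiry is just for demonstration.

// expire after 5 seconds
crusher.put('short-lived', { a: 1 }, 5)
Logger.log(crusher.get('short-lived')) // {a: 1}

// once expired, the item behaves as if it never existed
Utilities.sleep(6000)
Logger.log(crusher.get('short-lived')) // null
demonstrating expiry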

Here’s what some store entries look like on cloud storage

Fingerprint optimization

Since it’s possible that an item will spread across multiple physical records, we want a way of avoiding rewriting (or decompressing) them if nothing has changed. Crusher keeps a fingerprint of the contents of the compressed item. When you write something and it detects that the data you want to write has the same fingerprint as what’s already stored, it doesn’t bother to rewrite the item.

However, if you’ve specified an expiry time, the item will be rewritten so as to update its expiry. There’s a catch though: if your chosen store supports its own automatic expiration (as the CacheService does), then the new expiration won’t be applied. Sometimes this behavior is what you want, but it does mean a subtle difference between stores.

You can disable this behavior altogether when you initialize the crusher.


const goa = cGoa.make('gcs_cache', PropertiesService.getUserProperties())
const crusher = new bmCrusher.CrusherPluginGcsService().init({
  bucketName: 'bmcrusher-test-bucket-store',
  prefix: '/crusher/store/',
  tokenService: () => goa.getToken(),
  respectDigest: false
})
Always rewrite the store even if the data has not changed

Formats

Crusher writes all data zipped and base64 encoded, so the mime type will be text, and it will need to be read by bmCrusher to make sense of it. But watch this space – I’ll be adding an option to write vanilla data without the crusher metadata.

Setting up your storage bucket

Actually there’s not much to this. The simplest way is just to make sure you set up your cloud project in the same organization as your Apps Script project, go to Storage, come up with a unique name for your storage bucket, and pass it over when initializing the store. I didn’t even have to reassign my Apps Script cloud project or enable any APIs.

I did have to add these scopes to my apps script manifest though.

 "oauthScopes": ["https://www.googleapis.com/auth/script.external_request", "https://www.googleapis.com/auth/devstorage.full_control"],

This means that the script OAuth token (passed over when initializing the store) is well enough scoped to work with cloud storage. Here’s how to initialize the crusher service with ScriptApp providing the token.


const crusher = new bmCrusher.CrusherPluginGcsService().init({
  bucketName: 'bmcrusher-test-bucket-store',
  prefix: '/crusher/store/',
  tokenService: () => ScriptApp.getOAuthToken()
})
using ScriptApp for the access token

But see below for a more flexible alternative, using a service account.

Using a service account rather than relying on ScriptApp

Although you can manually set up the required scopes in your appsscript.json manifest, it can be a little flaky: adding and removing libraries, and otherwise doing things that modify your project, kills the OAuth scopes you’ve added manually. So I recommend that you use a service account to access the storage bucket. With goa it’s a piece of cake.

Create your service account

Go to the cloud console of the project that holds the bucket you want to use, and create a new service account.

google cloud console service account

Create a service account key

Next, create a key, and give the service account the Storage Admin role (it needs full control to be able to turn on lifecycle management on the bucket).

service account key

Then download the JSON file containing the service account credentials to Drive, and take note of its file ID on Drive.

Initialise goa

This is a one-off operation: create, run, then delete this function (you won’t need it again), substituting in the fileId of the service account file.

function oneOffgcs() {

  // used by everything using this script
  const propertyStore = PropertiesService.getUserProperties();

  // set up a goa package using the service account credentials for cloud storage
  cGoa.GoaApp.setPackage(propertyStore,
    cGoa.GoaApp.createServiceAccount(DriveApp, {
      packageName: 'gcs_cache',
      fileId: 'your drive file id',
      scopes: cGoa.GoaApp.scopesGoogleExpand(['devstorage.full_control']),
      service: 'google_service'
    }));

}
initialize goa service account

Initialize the crusher

This time instead of getting the access token from ScriptApp, we’ll get it from goa.

const goa = cGoa.make('gcs_cache', PropertiesService.getUserProperties())

// we can just re-use the cache plugin service as the cloud storage library has the same methods
const crusher = new bmCrusher.CrusherPluginGcsService().init({
  bucketName: 'bmcrusher-test-bucket-store',
  prefix: '/crusher/store/',
  tokenService: () => goa.getToken()
})
get access token from goa

Plugin code

The plugin is implemented in bmCrusher, but here it is for anyone interested in writing their own plugin. Since the library it uses has the same methods as the CacheService, we can reuse the CacheServicePlugin for most of the work.

function CrusherPluginGcsService() {

  // writing a plugin for the Squeeze service is pretty straightforward.
  // you need to provide an init function which sets up how to init/write/read/remove objects from the store
  // this one is for Google Cloud Storage
  const self = this;

  // these will be specific to your plugin
  let _settings = null;

  // standard function to check the store is present and of the correct type
  function checkStore() {
    if (!_settings.store) throw "You must provide a cache service to use";
    if (!_settings.chunkSize) throw "You must provide the maximum chunksize supported";
    if (!_settings.bucketName) throw "You must provide the bucket to use";
    if (!_settings.tokenService || typeof _settings.tokenService !== 'function') throw 'There must be a tokenservice function that returns an access token';
    return self;
  }

  // start the plugin by passing the settings you'll need for operations
  /**
   * @param {object} settings these will vary according to the type of store
   */
  self.init = function (settings) {

    _settings = settings || {};
    // set the default chunk size for gcs
    _settings.chunkSize = _settings.chunkSize || 5000000;

    // respecting the digest can reduce the number of chunks read, but may return stale data
    _settings.respectDigest = Utils.isUndefined(_settings.respectDigest) ? false : _settings.respectDigest;

    // set up a store that uses google cloud storage - we can reuse the regular cache service crusher for this
    _settings.store = new cGcsStore.GcsStore()
      // make sure that goa is using an account with enough privilege to write to the bucket
      .setAccessToken(_settings.tokenService())
      // set this to the bucket you are using as a property store
      .setBucket(_settings.bucketName)
      // gcsstore maintains expiry time data so as not to return objects that have expired
      // turning its expiry log off avoids complaints about objects in the store that don't have such data
      .setExpiryLog(false)
      // you can use this to segregate data for different projects/users/scopes etc as you want
      // clean up the prefix too, to normalize in case folder definitions are being used
      .setFolderKey(_settings.prefix.replace(/^\//, ''))
      // no need to compress as crusher will take care of that - no point in zipping again
      .setDefaultCompress(false);

    checkStore();

    // you can set a default expiry time in seconds, but since we're allowing crusher to manage expiry, we don't want the store to do it too
    // however we do want it to self clean, so we can use lifecycle management
    // (lifetime is actually a number of days) - so just set it to the day it expires
    if (_settings.expiry) _settings.store.setLifetime(Math.ceil(_settings.expiry / 24 / 60 / 60));

    // we can just re-use the cache plugin service as the cloud storage library has the same methods
    return new CrusherPluginCacheService().init({
      store: _settings.store,
      chunkSize: _settings.chunkSize,
      respectDigest: _settings.respectDigest,
      uselz: _settings.uselz
    })
  };
}
gcs plugin

Links

bmCrusher

library id: 1nbx8f-kt1rw53qbwn4SO2nKaw9hLYl5OI3xeBgkBC7bpEdWKIPBDkVG0

Github: https://github.com/brucemcpherson/bmCrusher

Scrviz: https://scrviz.web.app/?repo=brucemcpherson/bmCrusher

cGcsStore

library id: 1w0dgijlIMA_o5p63ajzcaa_LJeUMYnrrSgfOzLKHesKZJqDCzw36qorl

Github: https://github.com/brucemcpherson/cGcsStore

Scrviz: https://scrviz.web.app/?repo=brucemcpherson/cGcsStore

cGoa

library id: 1v_l4xN3ICa0lAW315NQEzAHPSoNiFdWHsMEwj2qA5t9cgZ5VWci2Qxv2

Github: https://github.com/brucemcpherson/cGoa

Scrviz: https://scrviz.web.app/?repo=brucemcpherson/cGoa

How fast can you get OAuth2 set up in Apps Script