Dealing with objects that are too large for the property or cache store

The Property store has a limit of 9k on individual property items, and the cache store has a limit of 100k. You may also be using other stores with similar limitations for this kind of data. Let's take a look at how to minimize that problem.

Compression

One way is to stringify the object, compress the result using zip, then convert the zip blob to base64 so you can save it as a string. This will typically get you savings of about 60-70%, which is a good thing anyway. But eventually you'll come across a case where the compressed object is still bigger than the limit.

Writing a master property and dependent chunks

Once you've created your compressed stringified object and it still doesn't fit, then in place of the data being stored against your key, you can instead store an array of keys pointing to other store entries, each of which contains a chunk of the compressed data. When you access the master, it will go off and find the contributing entries to reconstitute the original data.
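The master/chunk idea can be sketched in a few lines of plain JavaScript. Here a simple object stands in for the store, and the key naming and record shapes are made up for illustration only; the real Chunking namespace (described below) also handles compression, digests and key generation for you.

```javascript
// Minimal sketch of the master/chunk idea, using a plain object as the store.
// Key names and record shapes are illustrative, not the real Chunking format.
var store = {};
var CHUNK_SIZE = 8;

function setBig(key, str) {
  if (str.length <= CHUNK_SIZE) {
    // fits in one record - store the data directly
    store[key] = JSON.stringify({ chunk: str });
    return;
  }
  var chunkKeys = [];
  for (var i = 0; i < str.length; i += CHUNK_SIZE) {
    var childKey = key + '_' + chunkKeys.length;   // derived key
    store[childKey] = JSON.stringify({ chunk: str.slice(i, i + CHUNK_SIZE) });
    chunkKeys.push(childKey);
  }
  // the master holds pointers to the chunks, not the data itself
  store[key] = JSON.stringify({ chunks: chunkKeys });
}

function getBig(key) {
  var master = JSON.parse(store[key]);
  if (master.chunk !== undefined) return master.chunk;
  // reconstitute by fetching and joining every dependent chunk
  return master.chunks.map(function (k) {
    return JSON.parse(store[k]).chunk;
  }).join('');
}
```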

Chunking

The cUseful library has a function that manages all of this for you. Here's the key, and it's also on github.

1EbLSESpiGkI3PYmJqWh3-rmLkYKAtCNPi1L2YCtMgo2Ut8xMThfJ41Ex


Here's how to use it, first with the PropertiesService. The chunk size is the maximum amount of data that can be written to a single store entry. For testing I've just set it to 500, although the Apps Script Properties Service maximum is 9k in real life.
  var propertyCrusher = new cUseful.Squeeze.Chunking ()
  .setStore (PropertiesService.getScriptProperties())
  .setChunkSize(500);  // actually this would be 9k in real life

Writing a property value

It compresses the object and takes care of creating additional property entries if it finds it needs them. Even though you access entries using 'yourKey', that's not actually the key that's written to the store; derivatives of it are. Accessing its records directly won't do you any good. Data written with Chunking needs to be read back with Chunking, so that it takes care of uncompressing and joining together any additional chunk entries it's had to create. The size returned is the total size of all the compressed object chunks written.
var size = propertyCrusher.setBigProperty ("yourKey", objectToWrite);

Reading a property value

You'll get back your reconstituted and parsed object like this.

var yourObject = propertyCrusher.getBigProperty ("yourKey");


Removing a property value

This gets rid of an entry, plus any dependent entries.

propertyCrusher.removeBigProperty ("yourKey");

Keeping the store clean

If you already have a value against a key, it may be that it is stored across several entries. When you replace it with a new value, the number of entries needed for the new value may be different from the number required for the previous one. When you use setBigProperty, Chunking automatically removes all the previous entries for that key after inserting your new value. This avoids a build-up of redundant dependent entries.
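The clean-up step can be sketched like this, again with a plain object standing in for the store and made-up key names; the real implementation differs in detail, but the ordering (write the new entries, then delete the old ones) is the point.

```javascript
// Illustrative sketch of the clean-up step: when a key is overwritten,
// chunk entries belonging to the previous value are removed afterwards.
// A plain object stands in for the store; key names are made up.
var store = {
  myKey: JSON.stringify({ chunks: ['myKey_0', 'myKey_1', 'myKey_2'] }),
  myKey_0: JSON.stringify({ chunk: 'old-' }),
  myKey_1: JSON.stringify({ chunk: 'data-' }),
  myKey_2: JSON.stringify({ chunk: 'here' })
};

function replaceBig(key, newChunks) {
  // remember the dependent entries of the previous value, if any
  var previous = store[key] ? (JSON.parse(store[key]).chunks || []) : [];

  // write the new chunk entries first, then the new master
  var chunkKeys = newChunks.map(function (chunk, i) {
    var childKey = key + '_new' + i;   // fresh keys, never reused from the old value
    store[childKey] = JSON.stringify({ chunk: chunk });
    return childKey;
  });
  store[key] = JSON.stringify({ chunks: chunkKeys });

  // finally delete the now-redundant entries from the old value
  previous.forEach(function (k) { delete store[k]; });
}

replaceBig('myKey', ['fresh-', 'data']);
```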

Multi user access

When the compressed value fits on a single record, this behaves no differently from the standard uncompressed version. But since the property value can potentially be spread across multiple store entries, there is a chance that an update operation is in progress when that same property is accessed by someone else. During an update, for example, the old entries are deleted and new ones created. I'm not using the LockService (nor the approach from Using named locks with Google Apps Scripts), since both bring significant overhead to operations like these. To minimize the chance of multi-user issues when the property value needs to spread across multiple entries, these mitigations are in place:

  • dependent entries each have a one-time unique key, stored in the master record, so there is no chance of mixing up entries from one update with another.
  • dependent entries are deleted after the master entry, so if the master entry was findable, each of its dependents should be too.
  • dependent entries are created before the master entry, so if the master entry exists, then so will its dependents.
Whether single or multiple entries are used should be invisible to you, and you can apply your regular mitigations against clashes in your app, such as the Lock Service, if you need to.

The digest

To avoid rewriting values that haven't changed, Chunking maintains a digest of the current property value's contents in the master entry. When you use setBigProperty, this digest is used to check whether the value you are setting is the same as the value already written. If it is, there's no need to remove and rewrite any of the previous dependent properties. You can tell this has happened when setBigProperty returns 0 for the size of data it has written.

If you always want to write a new value, even when it hasn't changed, this behavior can be turned off using the setRespectDigest method, as shown below. You might want to do this if, for example, you were using the CacheService and wanted an update to refresh the expiration time.

.setRespectDigest(false)

What's in the store

Normally, of course, you wouldn't need to care what's in the store, just as you don't care what the inside of a zip file looks like, but here's how it works.

If the compressed version of a value fits within the chunk size, then there's only one record. It looks something like this.

{"chunk":"UEsDBBQACAgIALEkTEkAAAAAAAAAAAAAAAAMAAAAVW50aXRsZWQudHh0dcqxDkAwFAXQf7lzB60W6a+IgRZBgmifRfrvEttL9MznQRxDhG0f9LCFwACLhSryRA4CDlbqulK1UsY0WifxRcnjnZ+Kz93na8mrn9b81fxebj7z2fBMwxaOv92lF1BLBwiu1JPhXgAAAA8BAABQSwECFAAUAAgICACxJExJrtST4V4AAAAPAQAADAAAAAAAAAAAAAAAAAAAAAAAVW50aXRsZWQudHh0UEsFBgAAAAABAAEAOgAAAJgAAAAAAA==","digest":"+tK559w4FADiEzR0ir5sXbxLrn0="}

If it doesn't fit, then the master entry contains information about where to find all the chunks, and looks like this. The chunks object contains links to other property values which, when combined, will reconstitute the compressed value. It also contains the digest for the entire value, which is used to avoid unnecessary setting of values that haven't actually changed.

{"chunks":["chunking_key9_iu6twdtsqma","chunking_key9_iu6twduscpd"],"digest":"J+Se4udvDpRc6oNjbefoIfXPKcY="}

and the last of the chunk entries looks like this.

{"chunk":"DAAAAAAAAAAAAAAAAAAAAAAAVW50aXRsZWQudHh0UEsFBgAAAAABAAEAOgAAAFIBAAAAAA=="}

Using a different store

By default, Chunking expects to use a properties store, but it has no dependencies built in, and you can plug in your own functions to take care of reading from and writing to a different store.

Here's how to set it up to use the CacheService instead of the PropertiesService.

  var cacheCrusher = new cUseful.Squeeze.Chunking ()
  .setStore (CacheService.getUserCache())
  .setChunkSize(1000)   // actually this can be 100k in real life
  .funcWriteToStore(function (store, key , str) {
    return cUseful.Utils.expBackoff(function () { 
        return store.put (key , str); 
    });
  })
  .funcReadFromStore(function (store, key) {
    return cUseful.Utils.expBackoff(function () { 
        return store.get (key); 
    });
  })
  .funcRemoveObject(function (store, key) {
    return cUseful.Utils.expBackoff(function () { 
        return store.remove (key); 
    });
  });

The methods starting with .func allow you to redefine how to write, read and remove objects from a store. Since the CacheService uses different methods from the PropertiesService, they need to be redefined as above. The default PropertiesService versions of the functions that can be modified to handle different kinds of stores are shown below. You can reset any of these with the .func... pattern.
    // how to get an object
    var getObject_ = function (store , key) {
      var result = readFromStore_ (store, key );
      return result ? JSON.parse (result) : null;
    };
    
    // how to set an object
    var setObject_ = function (store,  key , ob) {
      var s = JSON.stringify(ob || {});
      writeToStore_ ( store , key, s  );
      return s.length;
    };
    
    // how to write a string
    var writeToStore_ = function ( store, key, str) {
      return Utils.expBackoff(function () { 
        return store.setProperty (key , str); 
      });
    };
    
    // how to read a string
    var readFromStore_ = function (store, key) {
      return Utils.expBackoff(function () { 
        return store.getProperty (key); 
      });
    };
    
    // how to remove an object
    var removeObject_ = function (store, key) {
      return Utils.expBackoff(function () { 
        return store.deleteProperty (key);
      });
    };

Testing

Here's a test writing various random data, reading it back and making sure it's the same. The compression gives a 60-70% saving using these test parameters, and the properties service and cache service appear to give similar performance. Here are the results.

[16-10-12 12:50:11:538 BST] 0- original: 5125 : chunked: 1724: compression:66.36%
[16-10-12 12:50:12:010 BST] 1- original: 5303 : chunked: 1764: compression:66.74%
[16-10-12 12:50:12:420 BST] 2- original: 4035 : chunked: 1400: compression:65.3%
[16-10-12 12:50:12:794 BST] 3- original: 3762 : chunked: 1340: compression:64.38%
[16-10-12 12:50:13:290 BST] 4- original: 8297 : chunked: 2548: compression:69.29%
[16-10-12 12:50:13:931 BST] 5- original: 8297 : chunked: 2536: compression:69.43%
[16-10-12 12:50:14:427 BST] 6- original: 6253 : chunked: 2000: compression:68.02%
[16-10-12 12:50:14:981 BST] 7- original: 6682 : chunked: 2088: compression:68.75%
[16-10-12 12:50:15:372 BST] 8- original: 4875 : chunked: 1616: compression:66.85%
[16-10-12 12:50:15:748 BST] 9- original: 3154 : chunked: 1176: compression:62.71%
[16-10-12 12:50:15:835 BST] properties took 4661
[16-10-12 12:50:16:177 BST] 0- original: 3849 : chunked: 1332: compression:65.39%
[16-10-12 12:50:16:611 BST] 1- original: 3849 : chunked: 1320: compression:65.71%
[16-10-12 12:50:17:039 BST] 2- original: 6162 : chunked: 1896: compression:69.23%
[16-10-12 12:50:17:460 BST] 3- original: 4270 : chunked: 1428: compression:66.56%
[16-10-12 12:50:17:943 BST] 4- original: 9348 : chunked: 2716: compression:70.95%
[16-10-12 12:50:18:427 BST] 5- original: 5022 : chunked: 1600: compression:68.14%
[16-10-12 12:50:18:855 BST] 6- original: 4405 : chunked: 1456: compression:66.95%
[16-10-12 12:50:19:287 BST] 7- original: 5125 : chunked: 1632: compression:68.16%
[16-10-12 12:50:19:726 BST] 8- original: 4693 : chunked: 1544: compression:67.1%
[16-10-12 12:50:20:174 BST] 9- original: 3154 : chunked: 1144: compression:63.73%
[16-10-12 12:50:20:326 BST] cache took 4490

and the code.
function tryChunking () {

  // using property service 
  var propertyCrusher = new cUseful.Squeeze.Chunking ()
  .setStore (PropertiesService.getScriptProperties())
  .setChunkSize(500);  // actually this can be 9k in real life
  
  var start = new Date().getTime();
  doTest (propertyCrusher);
  Logger.log('properties took ' + (new Date().getTime() - start));

  
  // using cache service
  var cacheCrusher = new cUseful.Squeeze.Chunking ()
  .setStore (CacheService.getUserCache())
  .setChunkSize(1000)   // actually this can be 100k in real life
  .funcWriteToStore(function (store, key , str) {
    return cUseful.Utils.expBackoff(function () { 
        return store.put (key , str); 
    });
  })
  .funcReadFromStore(function (store, key) {
    return cUseful.Utils.expBackoff(function () { 
        return store.get (key); 
    });
  })
  .funcRemoveObject(function (store, key) {
    return cUseful.Utils.expBackoff(function () { 
        return store.remove (key); 
    });
  });
  
  var start = new Date().getTime();
  doTest (cacheCrusher);
  Logger.log('cache took ' + (new Date().getTime() - start));

  
  function doTest (crusher) {
    for (var j=0;j<10;j++) {
      // generate some random stuff
      for (var i=0,yourOb={test:[]},max=cUseful.Utils.randBetween(50,200);i<max;i++){
        yourOb.test.push ({a:i,b:cUseful.Utils.generateUniqueString(i % 10),c:new Date().getTime()});
      }
      
      // save to prop store
      var size = crusher.setBigProperty ('key'+j,yourOb);
      var originalSize = JSON.stringify(yourOb).length;
      
      Logger.log(j + '- original: ' +
                 originalSize + ' : chunked: ' +
                 size + ': compression:' +
                 Math.round(10000*(1-(size/originalSize)))/100 + "%" );
      
      // get all the data
      var y = crusher.getBigProperty ('key'+j);
      
      // make sure it worked
      if(JSON.stringify(yourOb) !== JSON.stringify(y)) throw 'failed on size ' + size + ' loop ' + j;
    }
  }
  
}

The cUseful library contains this and many other things posted around this site.


For more like this, see Google Apps Scripts snippets. Why not join our forum, follow the blog or follow me on twitter to ensure you get updates when they are available.
