Large document performance

One of the concerns I have about add-ons is that the there is no context maintained between invocations. It's the same thing with custom functions. What that means is that the document or spreadsheet structure needs to be rebuilt each time, using something like.

var doc = DocumentApp.openById(id);

In a normal Apps Script function, the document object persists throughout the function so there no need to keep opening it. With an add-on, you have to open it each time. With a several hundred page document this can take minutes - performance like this is simply not viable. 

I decided to look into it to see if I could figure out what was going on. 

The test

Performing repeated actions on a growing document and timing how long each of them take to see if there is an inflection point at which the performance suddenly degrades and if so then what is it. Running the tests from an HtmlService and comparing it against a baseline running the same thing with a persisted document should provide some insight. 

The results

At first I was a little disappointed, because I noticed that the time to open a document was increasing proportionately to the growing size of the document (and in any case was under a second). It wasn't increasing exponentially like the overall run time - so it wasn't going to be that easy to track down what was causing this.




Delay

I noticed that there was an exponential delay between request and the open action.


But the delay was between the end of the previous cycle and before the document open start. In other words, the request to open the document was taking longer to 'be received', but it wasn't clear what was happening during that delay - which could be close to a minute for a document with 1mb content.

Saving and closing.

My first thought was that saving and closing the document at the end of each cycle might help to reduce the delay in responding to the next cycle (although since the request didn't know it was going to process a document, it's not clear how).  However that process of saving is in itself exponential. As the document content reaches 1mb, it takes over a minute to close.


However the good news is that the delay between request and action has gone  (implying that there is some kind of implicit document save happening at each google.script.run request).

What's going on?

I'm now starting to suspect this is not completely related to the size of the file, but more about how many times its called. This time I'll clear out the document between each cycle, so each cycle's document should be roughly the same size - and in theory should suffer the same delay - but that is more or less steady, so it is related to the size of the document.

Comparing the request time

A timestamp in the request can be compared against the time  the request gets received to confirm it is indeed processing something in between the time the call is sent and received. .
google.script.run.performanceTest (packet, new Date().getTime());

this is received and logged in the stats package. 
function performanceTest (package,initiated) {
  return PerformanceTest.createChunk(
    '1MR7QCbasRsAWRCeEc2baeNfyZBr2ZpMPsXxvNAHFfB4',
    package,
    initiated
  ); 
}

The lag (the time between the execution request and it starting) is not exponential, whereas the delay (the time between the previous cycle finishing and the next one starting) is. This pretty much points to there being a non sequential implicit save and close happening outside the flow of the script during which time nothing else can happen. 



Conclusion

Assuming this is what's supposed to happen, it makes add-ons involving large files pretty much impractical. One approach could be to always do a save and close at the end of some add-on action in the hope that another one will not be required before it's finished. 

The code

The process is controlled from a webapp, which repeatedly provokes a performance test, dumping the results at the end
  <script>
    window.onload = function () {

      var MAX_CYCLE = 20 ;
      var packet, cycleCount = 0;
      addChunks();
    
      function addChunks () {
        //create new chunks in the document and see how long it takes as the doc gets bigger
        if (cycleCount < MAX_CYCLE) {
          google.script.run
        
          .withFailureHandler( function (err) {
             throw err;
          })
        
          .withSuccessHandler ( function (data) {
             cycleCount ++;
             packet = data;
             addChunks();
          })
        
          .performanceTest (packet);
        }
        else {
          // dump result
          google.script.run.dumpResult (packet);

        }
      }
    };
  </script>  

The performance test

This adds some random data to a table and appends it to the document, in each case measuring how long operations like opening, reading text and so on take. The data is random to minimize any skew that might be to do with some kind of compression or caching. This example 
has the save and close enabled.

function performanceTest (package) {
  return PerformanceTest.createChunk('1MR7QCbasRsAWRCeEc2baeNfyZBr2ZpMPsXxvNAHFfB4',package);
}

function dumpResult (package) {
  // write the data to the sheet
  new cSheetExec.SheetExec()
    .sheetOpen ('1Higg488LQnUzuvfXphD0fTaOqqpdAPDbwlWfokZYPyk','opentest')
    .clearContent()
    .setData (package);
}

var PerformanceTest = (function(pt) {
  'use strict';
  
  var MAX = 200, MIN = 50, ROWS = 40, COLUMNS = 10;
  
  function action_ ( packet , action , argOb , measure, func ) {
    
    var start = new Date().getTime();
    var result = func (argOb);

    packet.push ( {
      action:action,
      start:start,
      elapsed:new Date().getTime() - start,
      measure:measure
    });
    return result;
  }
  
    
  /**
   * create a bunch of pages
   */
  pt.createChunk = function (docId,packet) {
    
    if (!packet) {
      // first in, clear the document
      DocumentApp.openById (docId).getBody().clear();
    }
    var packet = packet || []; 
    
    action_ (packet , 'cycle', null, 0 , function () {
      // open a document
      var doc = action_ (packet , 'open' , docId , 0, function (id) {
        return DocumentApp.openById (id);
      }); 
      var openPacket = packet[packet.length-1];
      
      // get the body
      var body = action_ (packet , 'body' , null , 0 ,function () {
        return doc.getBody(); 
      }); 
      
      // get the body as text before adding some more
      var text = action_ (packet , 'pretext' , null , 0 ,function () {
        return body.getText(); 
      });
      openPacket.measure = text.length;
      
      var para = 'Writing another table';
      var elem = action_ (packet , 'paragraph' , para, para.length  , function (stuff) {
        return body.appendParagraph(stuff); 
      });
      
      // add some stuff
      var stuff = cUseful.getRandomSheetStrings(ROWS,COLUMNS,MAX,MIN);
      var elem = action_ (packet , 'chunk' , stuff, 0  , function () {
        return body.appendTable(stuff); 
      });
      
      // get the body as text
      var postText = action_ (packet , 'posttext' , null , 0 ,function () {
        return body.getText(); 
      });
      // add the size of the text now
      packet[packet.length-1].measure = postText.length;
      
      // and on the previous
      packet[packet.length-2].measure = postText.length - text.length;
      
      // close
      action_ (packet , 'saveandclose' , null, postText.length  , function () {
        doc.saveAndClose();
      });
      
    
    });
    return packet;
  };


  return pt;
  
})(PerformanceTest || {});

For more like this, see Google Apps Scripts snippets. Why not join our forumfollow the blog or follow me on twitter to ensure you get updates when they are available. 

You want to learn Google Apps Script?

Learning Apps Script, (and transitioning from VBA) are covered comprehensively in my my book, Going Gas - from VBA to Apps script, available All formats are available now from O'Reilly,Amazon and all good bookshops. You can also read a preview on O'Reilly

If you prefer Video style learning I also have two courses available. also published by O'Reilly.
Google Apps Script for Developers and Google Apps Script for Beginners.



Comments