Running GmailApp in parallel

Running things in parallel using HTML service was a brief intro on how to run a number of things at once, orchestrating execution using the Google Apps Script HTML service. In Some hints on setting up parallel running profiles I showed how to set up some complex profiles. We'll use them as the basis for this example, so you should read that first.

In Summarizing emails to a sheet I showed how to search your mail and write a summary of the results to a sheet using Database abstraction with google apps script. The problem is that Gmail is slow and rate limited, so you can end up unable to get enough message details within the 6 minute execution limit. We can use Running things in parallel using HTML service to split the work up and run it all in parallel. Here's how. (I recommend putting all these kinds of things in the same script, since they are all very similar and use the same structure for their functions.)

This example is to demonstrate the principle of how to run things in parallel. With a rate limited service such as this, running more than a couple of threads can be counterproductive. For this particular service, you could consider calling search (query, start, max) a few times with different parameters. However, in Dealing with rate limited services you'll find a way of building upon the section below to sequence chunks of parallel workloads that are subject to rate limiting.
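As a rough sketch of the search (query, start, max) idea, the helper below works out the start and max values for successive calls. The name searchPages, the total of 250 and the page size of 100 are assumptions for illustration and are not part of the original scripts.

```javascript
// Hypothetical helper: split a total number of threads into (start, max) pages.
function searchPages(total, pageSize) {
  var pages = [];
  for (var start = 0; start < total; start += pageSize) {
    // the last page may be shorter than pageSize
    pages.push({ start: start, max: Math.min(pageSize, total - start) });
  }
  return pages;
}

// In Apps Script, each page could then be fetched with something like
//   GmailApp.search(options.searchText, page.start, page.max)
var pages = searchPages(250, 100);
```

Each page is small enough to be handled in its own execution, so no single run has to deal with the whole result set.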

The profiles

The first task is to get all the threads in your email that match the search text. We'll write the getTheThreads() function later.
function logEmailsProfile() {
  var profile = [];

  // get the matching threads
  var profileThreads = [
      {
        "name": "GET THREADS",
        "functionName": "getTheThreads",
        "skip": false,
        "options": {
          "searchText": "The Excel Liberation forum has moved to a Google+ community"
        }
      }
    ];

Now we need to split the threads into some number of chunks to be worked on in parallel. We'll write the getMessages() function later.
  // get and process all the messages
  var CHUNKS = 3;
  var profileMessages = [];
  for (var i = 0; i < CHUNKS; i++) {
    profileMessages.push({
      "name": "MESSAGES-" + i,
      "functionName": "getMessages",
      "skip": false,
      "options": {
        "index": i,
        "threads": CHUNKS
      }
    });
  }

Next, a reduction to bring all the results together. As usual, we'll use the common reduceTheResults() function that we've used in all the other examples.
  // next reduce the messages to one
  var profileReduction = [];
  profileReduction.push({
    "name": "reduction",
    "functionName":"reduceTheResults",
    "options":{
    }
  });

Finally, we'll log all the results in a spreadsheet. We can use a common logTheResults() function for that too.
  // finally log the results
  var profileLog = [{
    "name": "LOG",
    "functionName": "logTheResults",
    "skip": false,
    "options": {
      "driver": "cDriverSheet",
      "clear": true,
      "parameters": {
        "siloid": "emails",
        "dbid": "1yTQFdN_O2nFb9obm7AHCTmPKpf5cwAd78uNQJiCcjPk",
        "peanut": "bruce"
      }
    }
  }];

Now we create a single profile to sequence all of those components
  // put it all together
  profile.push (
    profileThreads,
    profileMessages,
    profileReduction,
    profileLog
  );
  return profile;
}
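To make the shape of the profile easier to follow, here is a small local runner that walks a profile one stage at a time using stub executors. It is a hypothetical stand-in, not the real HTML service orchestration (which runs the tasks of a stage in parallel from the sidebar). The names runProfileLocally, stubExecutors and demoProfile are made up for this sketch, and the way results are handed on is an assumption inferred from how getMessages reads reduceResults[0].results.

```javascript
// Hypothetical local runner, NOT the real HTML service orchestration.
// Assumption: each stage receives the previous stage's output as an array of
// {results: ...} objects, one per task in that previous stage.
function runProfileLocally(profile, executors) {
  return profile.reduce(function (previous, stage) {
    return stage
      .filter(function (task) { return !task.skip; })
      .map(function (task) {
        return {
          name: task.name,
          results: executors[task.functionName](task.options, previous)
        };
      });
  }, []);
}

// Stub executors standing in for the Gmail-backed ones.
var stubExecutors = {
  getTheThreads: function () { return ["t1", "t2", "t3", "t4"]; },
  getMessages: function (options, reduceResults) {
    var data = reduceResults[0].results;
    var start = Math.round(options.index / options.threads * data.length);
    var finish = Math.round((options.index + 1) / options.threads * data.length);
    return data.slice(start, finish).map(function (id) { return { thread: id }; });
  },
  // Assumed behaviour: concatenate the results of every task in the previous stage.
  reduceTheResults: function (options, reduceResults) {
    return reduceResults.reduce(function (all, r) { return all.concat(r.results); }, []);
  }
};

// A cut-down profile: one stage per inner array, run in order.
var demoProfile = [
  [{ name: "GET THREADS", functionName: "getTheThreads", options: {} }],
  [0, 1].map(function (i) {
    return { name: "MESSAGES-" + i, functionName: "getMessages", options: { index: i, threads: 2 } };
  }),
  [{ name: "reduction", functionName: "reduceTheResults", options: {} }]
];

var demoOutput = runProfileLocally(demoProfile, stubExecutors);
```

The outer array is the sequence of stages and each inner array is the set of tasks that can run side by side, which is the same layout logEmailsProfile() returns.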

The only change we need to make now is to call this function to set up the run profile
function showSidebar() {
   
   // kicking off the sidebar executes the orchestration
   libSidebar('asyncService',ADDONNAME, logEmailsProfile () );
 
}


The executors

Now we need to write the couple of new functions mentioned in these profiles that will be called by the HTML service.

This one will search the email for threads that match the search term and return their ids

function getTheThreads(options) {
  return cUseful.rateLimitExpBackoff( function() {
    return GmailApp.search(options.searchText).map(function(d) {
      return d.getId();
    });
  });
}

And this one will be run in parallel, with each instance processing one chunk of the thread ids
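/**
 * do a chunk of message processing
 * @param {object} options describes what to do: index is this chunk's number, threads is the total number of chunks
 * @param {object[]} reduceResults results from the previous stage; reduceResults[0].results holds the thread ids
 * @return {object[]} one row per message recipient, to pass on to the next stage
 */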
function getMessages (options,reduceResults) {
  
  // we'll only do a section of data in this thread
  var data = reduceResults[0].results;
  var start = Math.round(options.index/options.threads * data.length);
  var finish = Math.round((options.index+1)/options.threads * data.length) ;
  
  // work with that slice of thread ids
  return data.slice (start, finish).reduce ( function (p,c) {
    // build this thread's rows inside the backoff and only append them once it succeeds,
    // so a retry after a rate limit error can't add the same rows twice
    var rows = cUseful.rateLimitExpBackoff(function () {
      var q = [];
      GmailApp.getThreadById(c).getMessages().forEach(function(d) {
        // one row per recipient of each message
        cUseful.arrayAppend(q, d.getTo().split(",").map(function(e) {
          return {to:e,subject:d.getSubject(),dateSent:d.getDate().toString(),from:d.getFrom()};
        }));
      });
      return q;
    },1000);
    cUseful.arrayAppend(p, rows);
    return p;
  },[]);
  
}
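The start and finish arithmetic in getMessages is easier to trust once you see the boundaries it produces. This sketch pulls the same two expressions out into a helper (chunkBounds is a name made up for this illustration) and applies them to 10 thread ids split over 3 chunks.

```javascript
// Same arithmetic as getMessages, pulled out so the boundaries can be inspected.
function chunkBounds(index, threads, length) {
  return {
    start: Math.round(index / threads * length),
    finish: Math.round((index + 1) / threads * length)
  };
}

// 10 thread ids split over 3 chunks
var bounds = [0, 1, 2].map(function (i) { return chunkBounds(i, 3, 10); });
// bounds is [{start:0,finish:3},{start:3,finish:7},{start:7,finish:10}]
```

The finish of one chunk is computed with exactly the same expression as the start of the next, so the chunks cover every id once with no gaps or overlaps, even when the list doesn't divide evenly.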

In this run we got 1026 seconds of processing done in 371 seconds of elapsed time (about 2.8 seconds of work per elapsed second) and, more importantly, managed to process something we wouldn't have been able to fit inside the 6 minute limit.


Less is sometimes more

Normally, throwing more parallel threads at a job gets it done quicker, but in this case more threads mean more backoffs due to rate limiting, since with GmailApp the limiting is per user. In fact, when we reduce the number of threads to two, it still gets the job done in roughly the same elapsed time, even though there are fewer threads working on it. This is because the extra threads spend much of their time backing off. If you can get away with it (this one was too large), run it in one thread, and use this technique to sequence tasks that can run one after the other as a way of avoiding execution time quota problems.
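To get a feel for what backing off costs, here is a generic sketch of an exponential backoff schedule. It is not the cUseful.rateLimitExpBackoff implementation; the function name backoffDelays, the 1000ms base and the plain doubling rule are assumptions for illustration (real implementations usually add some random jitter too).

```javascript
// Generic exponential backoff schedule (illustrative; not the cUseful code).
// Returns the wait in milliseconds before each retry, doubling every time.
function backoffDelays(baseMs, retries) {
  var delays = [];
  for (var attempt = 0; attempt < retries; attempt++) {
    delays.push(baseMs * Math.pow(2, attempt));
  }
  return delays;
}

var delays = backoffDelays(1000, 4); // [1000, 2000, 4000, 8000]
var totalWait = delays.reduce(function (sum, d) { return sum + d; }, 0); // 15000
```

In this sketch, four consecutive rate limit hits on a single call cost 15 seconds of waiting, and that is time a thread spends idle rather than working. The more threads compete for the same per-user quota, the more often each one ends up in that position.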


Authorization

GmailApp needs authorization, so it's worth running a small test first to force an authorization dialog, as well as to test your executor functions. That can easily be done by emulating a couple of execution steps in a simple function like this.


function smallEmailTest() {
  // getTheThreads returns thread ids, which getMessages expects in reduceResults[0].results
  var threadIds = getTheThreads({searchText:'something bizarre'});
  Logger.log(getMessages ({index:0,threads:2},[{results:threadIds}]));
}


For more on this topic, see Running things in parallel using HTML service. For more snippets like this, see Google Apps Scripts snippets.
