Orchestration of Apps Scripts - parallel threads and defeating quotas

Over the years I’ve had a few goes at this – for example Running things in parallel using HTML service and Parallel processing in Apps Script, so I’ve decided to do a bigger and better version of those.

Code is on github

Page Content hide

5 Share with your network

Quotas

We’ve all come across the problem of the run time limit, trigger timing accuracy, cache size limitations and so on, all of which makes running significantly sized jobs difficult. Splitting the job up into chunks is possible of course, but then bringing all the data back together and storing it somewhere in the meantime is a problem. A large job will also use up so much memory that the Server side JavaScript will run very slowly or not at all.

GasThreader

is my latest solution to these conundrums and has these features

Run stages in sequence
Automatically split stages of jobs into manageable chunks that can be run in parallel.
Watch progress on a webapp dashboard.
Persist all the results Server side, and have access to any previous stage.
Automatically reduce the results of parallel chunks.
Be restartable (not yet, but soon)

This post consists of multiple parts, so if you want to get right to it –

Dashboard

The job is to retrieve almost 1m rows from a Fusion table, analyze and manipulate the data, and optimize sql scripts so it can be imported into a Google Spanner or CockroachDb SQL database. This amount of data is not even loadable into Apps Script in the regular run time limit – in other words I can’t even get to stage 3.

Here’s a couple of screenshots from a large run, in the middle of running, with some notes.

The Getting data stage has been split into 24 “Threads”, all running the same script “Orchestration.getData”, in parallel.
The light yellow blobs are threads currently in progress, and the dark yellow are threads that have completed so far.
“Items in” shows the number of items that need to be processed by each stage, and “Items out” are how many items were created.
There are implicit split, map and reduce operation associated with each stage
The data is never passed over to the client (in this case it would be too large anyway), and is persisted Server side using compression and cache record linking (Zipping to make stuff fit in cache or properties service.) to overcome cache quota limits.

Here we are a little further along, the blue shows a reduce operation in progress

And finally, the job is complete

The final output is a zip file containing a set of sql files to optimized to bring this data in to cockroachDB, which I’ll use over on my Getting cockroachdb running on google cloud platform page.

Just over 4 minutes to run 9 stages on the data from 1m rows from a Fusion table. 4 of the stages had parallel threads, so in fact there were 84 script instances running here. 2 hours of processing in 4 minutes!

Why not join our forum, follow the blog or follow me on Twitter to ensure you get updates when they are available.

Going GAS made it to the Best Google Books of All Time

BookAuthority collects and ranks the best books in the world, and it is a great honor to get this kind of recognition. Thank you for all your support! [...]
Going GAS Video

An accelerated video course over about 8 hours and 70 lessons taking you through JavaScript and Apps Script from start to finish. [...]
Video Course for Beginners

A video course over about 8 hours and 70 lessons taking you through the basics of Apps Script and JavaScript [...]
Going GAS Book

For those who prefer book and eBook formats, a 450+ page deep dive into Apps Script. Especially useful for those transitioning from another platform. [...]
Live Project

Create, Implement and test a AI Powered Google Workspace Add-Ons. add functionality from a collection of Google Cloud APIs such as Cloud Storage, CardService, Drive and others [...]

Get analytics profiles in a sheet
If you followed Do something useful with GAS in 5 minutes, you’ll already know how to use a spreadsheet as a database. Here’s how to use …
Google Docs
Most of the material concerning Google Apps Script on this site is about Google Spreadsheets. You can also use GAS to extend Google Documents. There …
Named Ranges with Column Headings
Let’s take the concept of named ranges a little further. Its true that the TABLE capabilities in later versions of excel give you much more …
Updating parse.com
We’ve looked at Query Limits on parse.com and Google Visualization API data to prepare for loading the color table from Playing around with GAS color into parse.com. Now to load the …
VizMap Spot Settings
This relates to Data Driven Mapping applications [...]

bruce mcpherson is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License. Based on a work at http://www.mcpher.com. Permissions beyond the scope of this license may be available at code use guidelines