Here’s where it gets really interesting. Let’s say that we need to get some large number of spreadsheets from Google Docs. We want to do it asynchronously, and we also know that google docs will fail if it receives to many requests, or it’s too busy. So we have to orchestrate and deal with all of the following.
- Getting the schema to know which sheets exists
- Getting some unknown number of sheets and populating the respective Excel sheets when they return
- Throttle failures on Google Docs. We’ll use exponential backoff and SetTimer and VBA to deal with that
- Ensuring that things don’t slip out of memory when the calling procedure exits.
- JSON and Google Wire deciphering to interpret the returned data
All of this will be promise based, which should make the whole thing digestible. As input we’ll use the university tables that were used in the javacript app, University Rankings visualized All code mentioned is in the promises.xlsm workbook, which you can download here.
The calling app
I’ve generalized this so it can be used to retrieve any Google Spreadsheet. Here’s the shell. You can substitute in the key for any Google Spreadsheet. The only thing of consequence here is that the import process is kicked off asynchronously, and control will immediately be returned to the caller, which will then exit. NOTE: with the new Google Sheets, this method has been largely deprecated. There is a complete revamp of this here Get Data From Google Docs. You can also check out the Sheets apps from Apps built with data abstraction, which can both read and write from Excel to Sheets, but this post may be useful since it is about asynchronicity more than the utility of the app itself. Public Sub asynchDocs()
Dim url As String, key As String
key = "0At2ExLh4POiZdFd0YUhpZVRPUGxFcW85X2xkMm1vY2c"
url = "https://spreadsheets.google.com/feeds/worksheets/" + key + "/public/basic?alt=json"
' kick this off in the background
doDocsWithBackoff url
'
' we can go and do something else now
Debug.Print "Im doing something else - have a nice day"
End Sub
The promise orchestrator
Here’s the code that kicks everything off. Let’s walk through Public Sub doDocsWithBackoff(url As String)
Dim b As cBackoff, p As cPromise, prs As cPromise, _
process As cDocsImporter, doneSchema As cDeferred, doneSheets As cDeferred
' always to this for a fresh start - it will clear up any memory/timers from previously
clearRegister
' this class knows how to get google spreadsheet
Set process = New cDocsImporter
Set doneSchema = New cDeferred
' first phase, we get the schema, and release up memory
Set b = New cBackoff
b.execute(url) _
.done(process, "getSchema", doneSchema) _
.fail process, "show"
' although this will be done synchronously, using a promise means we can know its done
Set prs = deleteAllTheSheets
' we'll have resolved doneSchema when the previous .done is executed
' we'll need another deferred to tell us when all this is over
Set doneSheets = New cDeferred
when(Array(prs, doneSchema.promise)) _
.done(process, "getSheets", doneSheets, url) _
.fail process, "show"
' now we can clear everything up
doneSheets.promise _
.done(register, "teardown") _
.done(process, "teardown") _
.fail process, "show"
End Sub
clearRegister
When executing this asynchronously, one of the issues is to stop things disappearing from memory when the module that initiates them goes out of scope. One way is create a tangle of public variables – but that will be hard to manage, even harder to clear out, and impossible when the task is data driven as it is in this case. The promise framework maintains a register of objects that are going to be needed later, and keeps them in memory. This simply clears out any hanging around from previous asynch sessions. It should be the first thing called. ' always to this for a fresh start - it will clear up any memory/timers from previously
clearRegister
callBack class
One of the limitations of VBA is that you can only callback class members. I recommend that you centralize all specific methods needed for a particular project in a class like this – in my case I’ve created a class called cDocsImporter that will know all about how to handle google Docs Data. ' this class knows how to get google spreadsheet
Set process = New cDocsImporter
getting the schema
Here’s the first deferred class – doneSchema. It ‘s promise will be used as a signal that we’ve managed to get (or failed to get) schema data from Google Docs. This is the definition of all the sheets in the Spreadsheet. Set doneSchema = New cDeferred
getting schema data using cBackoff
The cBackoff class knows how to get data asynchronously – it uses Promises in VBA too – and also how to deal with throttling using an exponential backoff algorithm. For that it needs to use SetTimer and VBA. The cBackoff will kick off retrieval data from the given url, and immediately give control back. It returns a promise.
When that gets resolved (within the cBackoff class), it will execute b.done() by calling process.getSchema(). The data retrieved by the cBackoff class will get passed to process.getSchema() also. If the cBackoff class is rejected (meaning it failed to get data), it will execute b.fail() by calling process.show() to report whatever the cause of the failure was.
Finally it will resolve (or reject) doneSchema. We can use doneSchema.promise() as a signal shortly to know that we’ve finished getting the schema
' first phase, we get the schema, and release up memory
set b = New cBackoff
b.execute(url) _ .done(process, "getSchema", doneSchema) _
.fail process, "show"
using promises for synchronous activities
The use of promises is not limited to asynchronous activities. You can use it to detect whether something has happened successfully or failed. You can also start synchronously, but later convert to asynchronous without having to modify the code structure. In this case we need to execute something that Excel doesnt know how to do asynchronously – delete all the sheets in the workbook. Since we have a lull in activities whilst waiting for the schema to come back, this is the perfect time. ' although this will be done synchronously, using a promise means we can know its done
Set prs = deleteAllTheSheets
Here is the deleteallthesheets function. Note that it creates and resolves a promise in line, since the activity is synchronous, and returns a promise. Private Function deleteAllTheSheets() As cPromise
' its possible to use promises for synchronous tasks too
Dim d As cDeferred
Set d = New cDeferred
' dont want complaints
Application.DisplayAlerts = False
' delete all but one sheet
While sheets.Count > 1
sheets(1).Delete
Wend
Application.DisplayAlerts = True
d.resolve (Array("sheets deleted"))
Set deleteAllTheSheets = d.promise
End Function
The reason we use a promise here is that next we’ll want to retrieve each of the spreadsheets from withing the Google Workbook. However we don’t want to start doing that until both the schema has finished being retrieved and the old sheets have been deleted. It is almost certain that the sheet deletion would be finished being executed before the Schema was returned anyway, but we can use both promises now before continuing.
Using When
You can test that multiple things have completed using when(). In our case we want to check that both the sheets are finished being deleted and the schema has been retrieved before starting to get data back.
' we'll have resolved doneSchema when the previous .done is executed
' we'll need another deferred to tell us when all this is over
Set doneSheets = New cDeferred
when(Array(prs, doneSchema.promise)) _
.done(process, "getSheets", doneSheets, url) _
.fail process, "show"
When actually returns a promise. This means that you can use .done() and .fail(). All the promises need to be successful for when().done() to execute. If any one fails, then when().fail() will be executed instead. The promises to be evaluated are presented as an array(promise1,promise2,…promisen). When() will receive a signal when each of those promises is either resolved or rejected. In our case we want both of array(prs, doneSchema.promise()) to be successfully resolved before continuing. We’ll also need another promise. This time to signal that we’ve got all the data. process.getSheets() will resolve (or reject) doneSheets when all the data has been retrieved and the sheets populated. Aside from taking multiple promises, When() will pass on the data from the last promise in the array – in this case the data that was signalled by doneScheme.resolve(data).
Clearing up
Since the procedure will have long exited by now, its up to you to clean up the objects that will have been forced to remain in memory beyond their regular scope. We can detect when everything is completed either by using When() . done() or fail(). In our case, we know that doneSheets.promise() could only have been resolved if everything else had been done, so we can simply test for the doneSheets.promise().done() signal ' now we can clear everything up
doneSheets.promise _
.done(register, "teardown") _
.done(process, "teardown") _
.fail process, "show"
It is good practice to create a .tearDown method for classes in order to properly remove it from memory. In many cases, setting its reference to Nothing will not work (where classes have dual links to each other), leading to a memory leak. In our case, we’ve managed to retain them in memory by creating a link to them in a public register. Register.teardown() will not only remove the link, but also execute each objects teardown method (if it has one). Doing this at the end of everything will recover the memory used during all the asynchronous activity.
The cDocsImporter class
All the hard work interpreting google docs formats is done by calling back various methods in this class. There’s nothing much new in here, json and google docs importing having been covered elsewhere in this site, but here’s how it all fits together, and how the promises are rejected or resolved. Option Explicit
Public Sub getSchema(a As Variant, Optional defer As cDeferred, Optional args As Variant)
' this will be called when I get the schema.
' the schema will be the 2nd argument of a(), 1st will be the url
Dim c As cJobject
Set c = getJsonDataFromArray(a)
' register it so it gets cleaned up later
keepInRegister c, "schema", "cJobject"
' we'll pass back the parsed schema
defer.resolve (Array(c))
End Sub
Public Sub storeData(a As Variant, Optional defer As cDeferred, Optional args As Variant)
' this will be called each time I have some data from the vizAPI
' the schema will be the 2nd argument of a(), 1st will be the url
Dim c As cJobject, job As cJobject, jr As cJobject, jc As cJobject, _
s As String, joc As cJobject, r As Range, w As Worksheet
Set c = getJsonDataFromArray(a, eDeserializeGoogleWire)
Application.ScreenUpdating = False
Application.Calculation = xlCalculationManual
' add a worksheet
Set w = Worksheets.add()
w.name = CStr(args(LBound(args)))
Set r = w.Cells(1, 1)
'
'put the data
'
Set jr = c.find("rows")
Set jc = c.find("cols")
' here's the column headings
'
If Not jc Is Nothing Then
For Each job In jc.children
s = job.child("label").value
If s = vbNullString Then
s = job.child("id").value
End If
r.Offset(, job.childIndex - 1).value = s
Next job
End If
'
' and these are the rows
'
If Not jr Is Nothing Then
For Each job In jr.children
For Each joc In job.child("c").children
r.Offset(job.childIndex, joc.childIndex - 1).value = joc.child("v").value
Next joc
Next job
End If
Application.Calculation = xlCalculationAutomatic
Application.ScreenUpdating = True
' release the cJobject memory
c.tearDown
' signal that all is good
defer.resolve (a)
End Sub
Public Sub getSheets(a As Variant, Optional defer As cDeferred, Optional args As Variant)
' this will be called when I want to get all the data
' the schema will be the 1st argument of a()
'
Dim c As cJobject, cj As cJobject, job As cJobject, _
joc As cJobject, s As String, u() As String, url As String, i As Long, _
d() As cPromise, b As cBackoff, def As cDeferred, testing As Boolean, maxU As Long
' limit processing for debugging by setting this true
testing = False
Set c = a(LBound(a))
Set cj = c.find("feed.entry")
If cj Is Nothing Then
defer.reject (Array("could not find feed urls in schema entries"))
Exit Sub
End If
' these will be the urls for each sheet to get
maxU = cj.children.Count
If (testing) Then maxU = 1
ReDim u(0 To maxU - 1)
ReDim d(0 To maxU - 1)
ReDim t(0 To maxU - 1)
If (arrayLength(u) = 0) Then
defer.reject (Array("there were no sheets in the schema"))
Exit Sub
End If
' extract the vizapi links
For Each job In cj.children
If job.childIndex > maxU Then Exit For
url = vbNullString
For Each joc In job.child("link").children
s = joc.toString("rel")
' we're going to use the link for viz api
If InStr(1, s, "visualizationApi") Then
url = joc.toString("href")
End If
Next joc
If (url = vbNullString) Then
defer.reject (Array("couldnt find the vizapi link in the schema"))
Exit Sub
End If
' store it
u(job.childIndex - 1) = url
If (job.child("title.$t") Is Nothing) Then
defer.reject (Array("couldnt find the tiles node in the schema"))
Exit Sub
End If
t(job.childIndex - 1) = job.toString("title.$t")
Next job
' when we get here, we have to set up a promise for each of the urls and get the data
For i = LBound(u) To UBound(u)
Set b = New cBackoff
Set def = New cDeferred
Set d(i) = def.promise
b.execute(u(i)) _
.done(Me, "storeData", def, Array(t(i))) _
.fail Me, "show"
Next i
'signal the urls have been processed
when(d) _
.done(Me, "resolve", defer) _
.fail Me, "reject", defer
End Sub
Public Sub resolve(a As Variant, Optional defer As cDeferred, Optional args As Variant)
defer.resolve (Array("urls done"))
End Sub
Public Sub reject(a As Variant, Optional defer As cDeferred, Optional args As Variant)
defer.resolve (Array("urls done"))
End Sub
Public Sub tearDown(a As Variant, Optional defer As cDeferred, Optional args As Variant)
End Sub
Private Sub class_initialize()
keepInRegister Me, , "cProcessData"
End Sub
Private Function getJsonDataFromArray(a As Variant, _
Optional t As eDeserializeType = eDeserializeNormal) As cJobject
' this will organize the data returned from a standard httpdeferred
Dim c As cJobject, s As String
Set c = New cJobject
If t = eDeserializeGoogleWire Then
s = cleanGoogleWire(CStr(a(LBound(a) + 1)))
Else
s = CStr(a(LBound(a) + 1))
End If
With c.init(Nothing)
.add "url", CStr(a(LBound(a)))
.add("data").append JSONParse(s, t)
End With
Set getJsonDataFromArray = c
End Function
Public Sub show(a As Variant, Optional defer As cDeferred, Optional args As Variant)
Dim i As Long
Debug.Print "show:";
For i = LBound(a) To UBound(a)
Debug.Print a(i)
Next i
Debug.Print ""
End Sub
getSchema method
This will be called when it’s time to extract the schema from the data returned by the cBackoff class. Every method that is likely to be called by a .done() or a .fail() should have the same argument list. a As Variant – this is the data that will have been signalled in .resolve(array(data)) and will be an array of whatever Optional defer As cDeferred – this can be used to signal completion of this task through a promise Optional args As Variant – this is another array that this time can be passed via .done()
Public Sub getSchema(a As Variant, Optional defer As cDeferred, Optional args As Variant)
' this will be called when I get the schema.
' the schema will be the 2nd argument of a(), 1st will be the url
Dim c As cJobject
Set c = getJsonDataFromArray(a)
' register it so it gets cleaned up later
keepInRegister c, "schema", "cJobject"
' we'll pass back the parsed schema
defer.resolve (Array(c))
End Sub
The json of the schema get parsed and creates a cJobject . We register it so that it doesnt slip out of memory, but also so that it can be easily cleaned up later (cjobject has a teardown class that will be executed by register.teardown()). Finally we resolve the work and pass on the cJobject containing the schema data.
getSheets method
Once getSchema (and deleteallthesheets) is resolved, this will be called to go and get each of the sheets referenced in the schema. Much f the code here is simply about converting google wire protocol. The interesting part is this section…. ' when we get here, we have to set up a promise for each of the urls and get the data
For i = LBound(u) To UBound(u)
Set b = New cBackoff
Set def = New cDeferred
Set d(i) = def.promise
b.execute(u(i)) _
.done(Me, "storeData", def, Array(t(i))) _
.fail Me, "show"
Next i
Each sheet is retrieved asynchronously using a cBackoff, and its promise is stored in an array of promises. On each resolution, the storeData() method will be called to move the data into the Excel sheet. So there are a whole stream of data fetches and data stores all happening at the same time.
Public Sub getSheets(a As Variant, Optional defer As cDeferred, Optional args As Variant)
' this will be called when I want to get all the data
' the schema will be the 1st argument of a()
'
Dim c As cJobject, cj As cJobject, job As cJobject, _
joc As cJobject, s As String, u() As String, url As String, i As Long, _
d() As cPromise, b As cBackoff, def As cDeferred, testing As Boolean, maxU As Long
' limit processing for debugging by setting this true
testing = False
Set c = a(LBound(a))
Set cj = c.find("feed.entry")
If cj Is Nothing Then
defer.reject (Array("could not find feed urls in schema entries"))
Exit Sub
End If
' these will be the urls for each sheet to get
maxU = cj.children.Count
If (testing) Then maxU = 1
ReDim u(0 To maxU - 1)
ReDim d(0 To maxU - 1)
ReDim t(0 To maxU - 1)
If (arrayLength(u) = 0) Then
defer.reject (Array("there were no sheets in the schema"))
Exit Sub
End If
' extract the vizapi links
For Each job In cj.children
If job.childIndex > maxU Then Exit For
url = vbNullString
For Each joc In job.child("link").children
s = joc.toString("rel")
' we're going to use the link for viz api
If InStr(1, s, "visualizationApi") Then
url = joc.toString("href")
End If
Next joc
If (url = vbNullString) Then
defer.reject (Array("couldnt find the vizapi link in the schema"))
Exit Sub
End If
' store it
u(job.childIndex - 1) = url
If (job.child("title.$t") Is Nothing) Then
defer.reject (Array("couldnt find the tiles node in the schema"))
Exit Sub
End If
t(job.childIndex - 1) = job.toString("title.$t")
Next job
' when we get here, we have to set up a promise for each of the urls and get the data
For i = LBound(u) To UBound(u)
Set b = New cBackoff
Set def = New cDeferred
Set d(i) = def.promise
b.execute(u(i)) _
.done(Me, "storeData", def, Array(t(i))) _
.fail Me, "show"
Next i
'signal the urls have been processed
when(d) _
.done(Me, "resolve", defer) _
.fail Me, "reject", defer
End Sub
Finally, we can use when() to resolve that we have completely finished all the retrievals, since we have an array of the all the promises that have been issued, each of which will have been resolved when the data from them has been transferred to excel by storeData.
storeData method
This is the final step. Data from each sheet is passed here and converting to a cJobject. When done its promise will be resolved. These promises make up the list that getSheets() is waiting for before finally declaring victory. Public Sub storeData(a As Variant, Optional defer As cDeferred, Optional args As Variant)
' this will be called each time I have some data from the vizAPI
' the schema will be the 2nd argument of a(), 1st will be the url
Dim c As cJobject, job As cJobject, jr As cJobject, jc As cJobject, _
s As String, joc As cJobject, r As Range, w As Worksheet
Set c = getJsonDataFromArray(a, eDeserializeGoogleWire)
Application.ScreenUpdating = False
Application.Calculation = xlCalculationManual
' add a worksheet
Set w = Worksheets.add()
w.name = CStr(args(LBound(args)))
Set r = w.Cells(1, 1)
'
'put the data
'
Set jr = c.find("rows")
Set jc = c.find("cols")
' here's the column headings
'
If Not jc Is Nothing Then
For Each job In jc.children
s = job.child("label").value
If s = vbNullString Then
s = job.child("id").value
End If
r.Offset(, job.childIndex - 1).value = s
Next job
End If
'
' and these are the rows
'
If Not jr Is Nothing Then
For Each job In jr.children
For Each joc In job.child("c").children
r.Offset(job.childIndex, joc.childIndex - 1).value = joc.child("v").value
Next joc
Next job
End If
Application.Calculation = xlCalculationAutomatic
Application.ScreenUpdating = True
' release the cJobject memory
c.tearDown
' signal that all is good
defer.resolve (a)
End Sub
Summary
Although conceptually this is a little harder to deal with than if this had been done synchronously, the use of promises helps control the complex asynchronous orchestration. In fact I can’t imagine doing it without it now. See more about Promises in VBA