Here’s where it gets really interesting. Let’s say that we need to get some large number of spreadsheets from Google Docs. We want to do it asynchronously, and we also know that google docs will fail if it receives to many requests, or it’s too busy. So we have to orchestrate and deal with all of the following.
- Getting the schema to know which sheets exists
- Getting some unknown number of sheets and populating the respective Excel sheets when they return
- Throttle failures on Google Docs. We’ll use exponential backoff and SetTimer and VBA to deal with that
- Ensuring that things don’t slip out of memory when the calling procedure exits.
- JSON and Google Wire deciphering to interpret the returned data
The calling app
Public Sub asynchDocs()
Dim url As String, key As String
key = "0At2ExLh4POiZdFd0YUhpZVRPUGxFcW85X2xkMm1vY2c"
url = "https://spreadsheets.google.com/feeds/worksheets/" + key + "/public/basic?alt=json"
' kick this off in the background
doDocsWithBackoff url
'
' we can go and do something else now
Debug.Print "Im doing something else - have a nice day"
End Sub
The promise orchestrator
Public Sub doDocsWithBackoff(url As String)
Dim b As cBackoff, p As cPromise, prs As cPromise, _
process As cDocsImporter, doneSchema As cDeferred, doneSheets As cDeferred
' always to this for a fresh start - it will clear up any memory/timers from previously
clearRegister
' this class knows how to get google spreadsheet
Set process = New cDocsImporter
Set doneSchema = New cDeferred
' first phase, we get the schema, and release up memory
Set b = New cBackoff
b.execute(url) _
.done(process, "getSchema", doneSchema) _
.fail process, "show"
' although this will be done synchronously, using a promise means we can know its done
Set prs = deleteAllTheSheets
' we'll have resolved doneSchema when the previous .done is executed
' we'll need another deferred to tell us when all this is over
Set doneSheets = New cDeferred
when(Array(prs, doneSchema.promise)) _
.done(process, "getSheets", doneSheets, url) _
.fail process, "show"
' now we can clear everything up
doneSheets.promise _
.done(register, "teardown") _
.done(process, "teardown") _
.fail process, "show"
End Sub
clearRegister
' always to this for a fresh start - it will clear up any memory/timers from previously
clearRegister
callBack class
One of the limitations of VBA is that you can only callback class members. I recommend that you centralize all specific methods needed for a particular project in a class like this – in my case I’ve created a class called cDocsImporter that will know all about how to handle google Docs Data. ' this class knows how to get google spreadsheet
Set process = New cDocsImporter
getting the schema
Here’s the first deferred class – doneSchema. It ‘s promise will be used as a signal that we’ve managed to get (or failed to get) schema data from Google Docs. This is the definition of all the sheets in the Spreadsheet. Set doneSchema = New cDeferred
getting schema data using cBackoff
The cBackoff class knows how to get data asynchronously – it uses Promises in VBA too – and also how to deal with throttling using an exponential backoff algorithm. For that it needs to use SetTimer and VBA. The cBackoff will kick off retrieval data from the given url, and immediately give control back. It returns a promise.
When that gets resolved (within the cBackoff class), it will execute b.done() by calling process.getSchema(). The data retrieved by the cBackoff class will get passed to process.getSchema() also. If the cBackoff class is rejected (meaning it failed to get data), it will execute b.fail() by calling process.show() to report whatever the cause of the failure was.
Finally it will resolve (or reject) doneSchema. We can use doneSchema.promise() as a signal shortly to know that we’ve finished getting the schema
' first phase, we get the schema, and release up memory
set b = New cBackoff
b.execute(url) _
.done(process, "getSchema", doneSchema) _
.fail process, "show"
using promises for synchronous activities
The use of promises is not limited to asynchronous activities. You can use it to detect whether something has happened successfully or failed. You can also start synchronously, but later convert to asynchronous without having to modify the code structure. In this case we need to execute something that Excel doesnt know how to do asynchronously – delete all the sheets in the workbook. Since we have a lull in activities whilst waiting for the schema to come back, this is the perfect time. ' although this will be done synchronously, using a promise means we can know its done
Set prs = deleteAllTheSheets
Private Function deleteAllTheSheets() As cPromise
' its possible to use promises for synchronous tasks too
Dim d As cDeferred
Set d = New cDeferred
' dont want complaints
Application.DisplayAlerts = False
' delete all but one sheet
While sheets.Count > 1
sheets(1).Delete
Wend
Application.DisplayAlerts = True
d.resolve (Array("sheets deleted"))
Set deleteAllTheSheets = d.promise
End Function
The reason we use a promise here is that next we’ll want to retrieve each of the spreadsheets from withing the Google Workbook. However we don’t want to start doing that until both the schema has finished being retrieved and the old sheets have been deleted. It is almost certain that the sheet deletion would be finished being executed before the Schema was returned anyway, but we can use both promises now before continuing.
Using When
You can test that multiple things have completed using when(). In our case we want to check that both the sheets are finished being deleted and the schema has been retrieved before starting to get data back.
' we'll have resolved doneSchema when the previous .done is executed
' we'll need another deferred to tell us when all this is over
Set doneSheets = New cDeferred
when(Array(prs, doneSchema.promise)) _
.done(process, "getSheets", doneSheets, url) _
.fail process, "show"
When actually returns a promise. This means that you can use .done() and .fail(). All the promises need to be successful for when().done() to execute. If any one fails, then when().fail() will be executed instead. The promises to be evaluated are presented as an array(promise1,promise2,…promisen). When() will receive a signal when each of those promises is either resolved or rejected. In our case we want both of array(prs, doneSchema.promise()) to be successfully resolved before continuing. We’ll also need another promise. This time to signal that we’ve got all the data. process.getSheets() will resolve (or reject) doneSheets when all the data has been retrieved and the sheets populated. Aside from taking multiple promises, When() will pass on the data from the last promise in the array – in this case the data that was signalled by doneScheme.resolve(data).
Clearing up
' now we can clear everything up
doneSheets.promise _
.done(register, "teardown") _
.done(process, "teardown") _
.fail process, "show"
It is good practice to create a .tearDown method for classes in order to properly remove it from memory. In many cases, setting its reference to Nothing will not work (where classes have dual links to each other), leading to a memory leak. In our case, we’ve managed to retain them in memory by creating a link to them in a public register. Register.teardown() will not only remove the link, but also execute each objects teardown method (if it has one). Doing this at the end of everything will recover the memory used during all the asynchronous activity.
The cDocsImporter class
Option Explicit
Public Sub getSchema(a As Variant, Optional defer As cDeferred, Optional args As Variant)
' this will be called when I get the schema.
' the schema will be the 2nd argument of a(), 1st will be the url
Dim c As cJobject
Set c = getJsonDataFromArray(a)
' register it so it gets cleaned up later
keepInRegister c, "schema", "cJobject"
' we'll pass back the parsed schema
defer.resolve (Array(c))
End Sub
Public Sub storeData(a As Variant, Optional defer As cDeferred, Optional args As Variant)
' this will be called each time I have some data from the vizAPI
' the schema will be the 2nd argument of a(), 1st will be the url
Dim c As cJobject, job As cJobject, jr As cJobject, jc As cJobject, _
s As String, joc As cJobject, r As Range, w As Worksheet
Set c = getJsonDataFromArray(a, eDeserializeGoogleWire)
Application.ScreenUpdating = False
Application.Calculation = xlCalculationManual
' add a worksheet
Set w = Worksheets.add()
w.name = CStr(args(LBound(args)))
Set r = w.Cells(1, 1)
'
'put the data
'
Set jr = c.find("rows")
Set jc = c.find("cols")
' here's the column headings
'
If Not jc Is Nothing Then
For Each job In jc.children
s = job.child("label").value
If s = vbNullString Then
s = job.child("id").value
End If
r.Offset(, job.childIndex - 1).value = s
Next job
End If
'
' and these are the rows
'
If Not jr Is Nothing Then
For Each job In jr.children
For Each joc In job.child("c").children
r.Offset(job.childIndex, joc.childIndex - 1).value = joc.child("v").value
Next joc
Next job
End If
Application.Calculation = xlCalculationAutomatic
Application.ScreenUpdating = True
' release the cJobject memory
c.tearDown
' signal that all is good
defer.resolve (a)
End Sub
Public Sub getSheets(a As Variant, Optional defer As cDeferred, Optional args As Variant)
' this will be called when I want to get all the data
' the schema will be the 1st argument of a()
'
Dim c As cJobject, cj As cJobject, job As cJobject, _
joc As cJobject, s As String, u() As String, url As String, i As Long, _
d() As cPromise, b As cBackoff, def As cDeferred, testing As Boolean, maxU As Long
' limit processing for debugging by setting this true
testing = False
Set c = a(LBound(a))
Set cj = c.find("feed.entry")
If cj Is Nothing Then
defer.reject (Array("could not find feed urls in schema entries"))
Exit Sub
End If
' these will be the urls for each sheet to get
maxU = cj.children.Count
If (testing) Then maxU = 1
ReDim u(0 To maxU - 1)
ReDim d(0 To maxU - 1)
ReDim t(0 To maxU - 1)
If (arrayLength(u) = 0) Then
defer.reject (Array("there were no sheets in the schema"))
Exit Sub
End If
' extract the vizapi links
For Each job In cj.children
If job.childIndex > maxU Then Exit For
url = vbNullString
For Each joc In job.child("link").children
s = joc.toString("rel")
' we're going to use the link for viz api
If InStr(1, s, "visualizationApi") Then
url = joc.toString("href")
End If
Next joc
If (url = vbNullString) Then
defer.reject (Array("couldnt find the vizapi link in the schema"))
Exit Sub
End If
' store it
u(job.childIndex - 1) = url
If (job.child("title.$t") Is Nothing) Then
defer.reject (Array("couldnt find the tiles node in the schema"))
Exit Sub
End If
t(job.childIndex - 1) = job.toString("title.$t")
Next job
' when we get here, we have to set up a promise for each of the urls and get the data
For i = LBound(u) To UBound(u)
Set b = New cBackoff
Set def = New cDeferred
Set d(i) = def.promise
b.execute(u(i)) _
.done(Me, "storeData", def, Array(t(i))) _
.fail Me, "show"
Next i
'signal the urls have been processed
when(d) _
.done(Me, "resolve", defer) _
.fail Me, "reject", defer
End Sub
Public Sub resolve(a As Variant, Optional defer As cDeferred, Optional args As Variant)
defer.resolve (Array("urls done"))
End Sub
Public Sub reject(a As Variant, Optional defer As cDeferred, Optional args As Variant)
defer.resolve (Array("urls done"))
End Sub
Public Sub tearDown(a As Variant, Optional defer As cDeferred, Optional args As Variant)
End Sub
Private Sub class_initialize()
keepInRegister Me, , "cProcessData"
End Sub
Private Function getJsonDataFromArray(a As Variant, _
Optional t As eDeserializeType = eDeserializeNormal) As cJobject
' this will organize the data returned from a standard httpdeferred
Dim c As cJobject, s As String
Set c = New cJobject
If t = eDeserializeGoogleWire Then
s = cleanGoogleWire(CStr(a(LBound(a) + 1)))
Else
s = CStr(a(LBound(a) + 1))
End If
With c.init(Nothing)
.add "url", CStr(a(LBound(a)))
.add("data").append JSONParse(s, t)
End With
Set getJsonDataFromArray = c
End Function
Public Sub show(a As Variant, Optional defer As cDeferred, Optional args As Variant)
Dim i As Long
Debug.Print "show:";
For i = LBound(a) To UBound(a)
Debug.Print a(i)
Next i
Debug.Print ""
End Sub
getSchema method
Public Sub getSchema(a As Variant, Optional defer As cDeferred, Optional args As Variant)
' this will be called when I get the schema.
' the schema will be the 2nd argument of a(), 1st will be the url
Dim c As cJobject
Set c = getJsonDataFromArray(a)
' register it so it gets cleaned up later
keepInRegister c, "schema", "cJobject"
' we'll pass back the parsed schema
defer.resolve (Array(c))
End Sub
getSheets method
' when we get here, we have to set up a promise for each of the urls and get the data
For i = LBound(u) To UBound(u)
Set b = New cBackoff
Set def = New cDeferred
Set d(i) = def.promise
b.execute(u(i)) _
.done(Me, "storeData", def, Array(t(i))) _
.fail Me, "show"
Next i
Each sheet is retrieved asynchronously using a cBackoff, and its promise is stored in an array of promises. On each resolution, the storeData() method will be called to move the data into the Excel sheet. So there are a whole stream of data fetches and data stores all happening at the same time.
Public Sub getSheets(a As Variant, Optional defer As cDeferred, Optional args As Variant)
' this will be called when I want to get all the data
' the schema will be the 1st argument of a()
'
Dim c As cJobject, cj As cJobject, job As cJobject, _
joc As cJobject, s As String, u() As String, url As String, i As Long, _
d() As cPromise, b As cBackoff, def As cDeferred, testing As Boolean, maxU As Long
' limit processing for debugging by setting this true
testing = False
Set c = a(LBound(a))
Set cj = c.find("feed.entry")
If cj Is Nothing Then
defer.reject (Array("could not find feed urls in schema entries"))
Exit Sub
End If
' these will be the urls for each sheet to get
maxU = cj.children.Count
If (testing) Then maxU = 1
ReDim u(0 To maxU - 1)
ReDim d(0 To maxU - 1)
ReDim t(0 To maxU - 1)
If (arrayLength(u) = 0) Then
defer.reject (Array("there were no sheets in the schema"))
Exit Sub
End If
' extract the vizapi links
For Each job In cj.children
If job.childIndex > maxU Then Exit For
url = vbNullString
For Each joc In job.child("link").children
s = joc.toString("rel")
' we're going to use the link for viz api
If InStr(1, s, "visualizationApi") Then
url = joc.toString("href")
End If
Next joc
If (url = vbNullString) Then
defer.reject (Array("couldnt find the vizapi link in the schema"))
Exit Sub
End If
' store it
u(job.childIndex - 1) = url
If (job.child("title.$t") Is Nothing) Then
defer.reject (Array("couldnt find the tiles node in the schema"))
Exit Sub
End If
t(job.childIndex - 1) = job.toString("title.$t")
Next job
' when we get here, we have to set up a promise for each of the urls and get the data
For i = LBound(u) To UBound(u)
Set b = New cBackoff
Set def = New cDeferred
Set d(i) = def.promise
b.execute(u(i)) _
.done(Me, "storeData", def, Array(t(i))) _
.fail Me, "show"
Next i
'signal the urls have been processed
when(d) _
.done(Me, "resolve", defer) _
.fail Me, "reject", defer
End Sub
storeData method
Public Sub storeData(a As Variant, Optional defer As cDeferred, Optional args As Variant)
' this will be called each time I have some data from the vizAPI
' the schema will be the 2nd argument of a(), 1st will be the url
Dim c As cJobject, job As cJobject, jr As cJobject, jc As cJobject, _
s As String, joc As cJobject, r As Range, w As Worksheet
Set c = getJsonDataFromArray(a, eDeserializeGoogleWire)
Application.ScreenUpdating = False
Application.Calculation = xlCalculationManual
' add a worksheet
Set w = Worksheets.add()
w.name = CStr(args(LBound(args)))
Set r = w.Cells(1, 1)
'
'put the data
'
Set jr = c.find("rows")
Set jc = c.find("cols")
' here's the column headings
'
If Not jc Is Nothing Then
For Each job In jc.children
s = job.child("label").value
If s = vbNullString Then
s = job.child("id").value
End If
r.Offset(, job.childIndex - 1).value = s
Next job
End If
'
' and these are the rows
'
If Not jr Is Nothing Then
For Each job In jr.children
For Each joc In job.child("c").children
r.Offset(job.childIndex, joc.childIndex - 1).value = joc.child("v").value
Next joc
Next job
End If
Application.Calculation = xlCalculationAutomatic
Application.ScreenUpdating = True
' release the cJobject memory
c.tearDown
' signal that all is good
defer.resolve (a)
End Sub