Get data from the old sheets


NOTE:  with the new Google Sheets, this method has been largely deprecated. There is a complete revamp of this here Get Data From Google Docs. You can also check out the Sheets apps from Apps built with data abstraction, which can both read and write from Excel to Sheets. 


With the growing use of Google Docs and other cloud based systems, we face the problem of how to easily get data from, for example, a Google Docs spreadsheet into Excel. PC based spreadsheets and document processing is not going away time soon. 

This article pulls together a whole bunch of other work on the Excel, VBA and Google Ramblings site such as Excel JSON conversionData Abstraction ExampleGeneral Google visualization toolEmbedding Google Motion Charts in Excel and Regular Expressions.  You can to download a fully working example, as the ability to do this has been incorporated as  method of the cDataSet class and can be found in the cDataSet.xlsm downloable workbook, and also in googleImport.xlsm

Note: this method (for public workbooks only), has largely been superseded by Google docs import with oauth2 which can import entire workbooks, and can import both public private workbooks.
 

Google Wire Protocol

Google has implemented something they call the 'wire protocol' to provide data between Google Docs, Gadgets and so on. It's  jSon like in structure, but is used to describe data tables, and would be provided as an http response to an request to an appropriately structured http request. For this article we will use some data in this GoogleDocs Spreadsheet, which I have been using to demonstrate Roadmap Generation both in VBA and javascript versions.

The data source URL

In Google Docs, if you create a table gadget for example, examining the tables properties will give you the data source URL that will provoke a wire protocol response. In the case of the example sheet, you will see that the URL given on the table gadget is

The response to that request looks like this.

google.visualization.Query.setResponse ({version:'0.6',status:'ok',sig:'810325856', table:{cols:[{id:'A',label:'Activate',type:'date',pattern:'M/d/yyyy'} ,{id:'B',label:'Deactivate',type:'date',pattern:'M/d/yyyy'} ,{id:'C',label:'Description',type:'string',pattern:''} ,{id:'D',label:'ID',type:'string',pattern:''} ,{id:'E',label:'Target',type:'string',pattern:''} ,{id:'F',label:'Custom',type:'string',pattern:''} ,{id:'G',label:'Cost',type:'number',pattern:'#0.###############'},{id:'H',label:'Callout' ,type:'string',pattern:''} ,{id:'I',label:'sequence',type:'string',pattern:''}] ,rows:[{c:[,{v:new  Date(2007,5,20),f:'6/20/2007'},{v:'Cellphone'},{v:'cell'},{v:'smart'},{v:''},{v:200.0,f:'200'},{v:''},{v:''}]}, {c:[{v:new Date(2006,0,6),f:'1/6/2006'},{v:new Date(2011,2,21), f:'3/21/2011'},{v:'Mac'},{v:'mac'},{v:'iPad'},{v:''},{v:300.0,f:'300'},{v:''},{v:''}]}, {c:[,{v:new Date(2012,2,20),f:'3/20/2012'} ,{v:'PC'},{v:'pc'},{v:'android'},{v:''},{v:250.0,f:'250'},{v:''}, {v:''}]},{c:[{v:new Date(2009,0,1),f:'1/1/2009'},{v:new Date(2011,10,6),f:'11/6/2011'},{v:'Netbook'},{v:'net'},{v:'pc'},{v:'dislike'},{v:150.0,f:'150'},{v:'this was a waste of money'},{v:''}]}, {c:[{v:new Date(2006,0,1),f:'1/1/2006'},{v:new Date(2010,1,1),f:'2/1/2010'},{v:'Windows Phone'},{v:'smart'},{v:'iphone'},{v:''},{v:240.0,f:'240'},{v:'Windows Mobile was so bad'},{v:''}]} ,{c:[{v:new Date(2010,8,2),f:'9/2/2010'}, ,{v:'Tablet'},{v:'tablet'},{v:''},{v:'family'},,{v:''},{v:''}]},{c:[{v:new Date(2003,9,28),f:'10/28/2003'},{v:new Date(2008,5,12),f:'6/12/2008'},{v:'MP3 player'},{v:'mp3'},{v:'ipod'},{v:''}, {v:20.0,f:'20'},{v:''},{v:''}]}, {c:[{v:new Date(2004,9,29),f:'10/29/2004'},{v:new Date(2009,0,1),f:'1/1/2009'},{v:'Personal Video'},{v:'pv'},{v:'ipod'},{v:''},{v:25.0,f:'25'},{v:''},{v:''}]}, {c:[{v:new Date(2008,3,11),f:'4/11/2008'},{v:new Date(2011,5,1), f:'6/1/2011'},{v:'iPod'},{v:'ipod'},{v:'ipad'},{v:'like'},{v:25.0,f:'25'},{v:''},{v:''}]},{c:[{v:new Date(2009,0,4),f:'1/4/2009'},, {v:'iPhone'},{v:'iphone'},{v:'android'},{v:'like'},{v:360.0,f:'360'}, {v:''},{v:''}]},{c:[{v:new Date(2010,10,1),f:'11/1/2010'}, ,{v:'iPad'},{v:'ipad'},{v:'tablet'},{v:'like'},{v:480.0,f:'480'},{v:''},{v:''}]},{c:[{v:new Date(2009,11,12),f:'12/12/2009'}, ,{v:'Android/Chrome'},{v:'android'}, {v:''},{v:''},{v:400.0,f:'400'}, {v:''},{v:''}]} ,{c:[{v:new Date(2005,11,2),f:'12/2/2005'}, {v:new Date(2007,11,11),f:'12/11/2007'},{v:'windows tablet'},{v:'wtab'},{v:'pc'},{v:'dislike'},,{v:'this really sucked'},{v:''}]}]}});

Our task then, is to decode this, and populate an excel sheet on demand with this kind of data. 

Data Conversion and Spreadsheet Population

Conversion of jSon to Excel is covered here, however with Google Wire protocol we are not exactly looking at jSon, and neither is the structure a simple two dimensional table that the capability we have already developed is expecting to find. In order to fully utilize all the components that have already been developed, the data needs some preparation.
  • Leverage the cBrowser object to deal with the http request/response.
  • Extract out the {table:[]} part of the google wire response
  • Convert it to a more jSon friendly ( change the new Date() javascript construct to a proper date, 'quote' the key names etc). We'll use Regular Expressions for that, then deserialize it using Excel JSON conversion
  • Crush it down into a 2 dimensional table. We'll need to write new something for that.
  • Use the Data Abstraction Example to populate our Excel sheet.
These classes also use a number of Excel additional references which are already selected in the example worksheet. These are the references that are needed.


Implementation example

This is the module that will do the work. Likely a real implementation would provide some kind of dialogue to collect the request URL, and the output range which are the only parameters needed, and then execute the data transfer and conversion. In this example, the data in the example Google Docs Sheet will be copied to the Excel worksheet name 'Clone'. If you just want to use this as is, then you are good to go. Just create a module of a similar structure as below. Read on further to find out how it all works, or if you want to harden up the code against data structure errors. 

Option Explicit
Const Url = _
"https://spreadsheets.google.com/a/mcpher.com/spreadsheet/tq? range= A1:H14&key=0At2ExLh4POiZdHBJYnlwaERpYTRZSWl1VEw1TEpQVkE& gid=0"
Public Sub googleWireExample()
    Dim dSet As cDataSet, dsClone As cDataSet, jo As cJobject, cb As cBrowser
    Dim sWire As String
    ' get the google wire string
    Set cb = New cBrowser
    sWire = cb.httpGET(Url)
    ' load to a dataset
    Set dSet = New cDataSet
    With dSet
        .populateGoogleWire sWire, Range("Clone!$a$1")
        
        If .Where Is Nothing Then
            MsgBox ("No data to process")
        Else
            ' it worked
            
        End If
    End With
    Set dSet = Nothing
    Set cb = Nothing
End Sub

The http request/response

This is rather straightforward at least, using the already available cBrowser object.

    ' get the google wire string
    Set cb = New cBrowser
    sWire = cb.httpGET(Url)

httpGet method

Public Function httpGET(fn As String) As String
    pHtml = fn
    Dim oHttp As Object
    Set oHttp = CreateObject("Microsoft.XMLHTTP")
    Call oHttp.Open("GET", pHtml, False)
    Call oHttp.Send("")
    httpGET = oHttp.ResponseText
    Set oHttp = Nothing
End Function

This gets us the google wire string containing the spreadsheet data

The populateGoogleWire module

This is a method of the cDataSet class, and can be used to abstract data from Google Wire Protocol, and dump it to a sheet. 


Extracting the table[]

As with most of the code in this example, we need to still add a bunch of error handling, so please beef that up if you implement any of these modules.  For now we are assuming that the wire string is valid and contains a data table. 

  Public Function populateGoogleWire(sWire As String, rstart As Range, _
        Optional wClearContents As Boolean = True, _
        Optional stopAtFirstEmptyRow As Boolean = True) As cDataSet
    Dim jo As cJobject, s As String, p As Long, e As Long, joc As cJobject, jc As cJobject, jr As cJobject, cr As cJobject
    Dim jt As cJobject, v As Variant, astring As Variant
    Dim jStart As String
    
    jStart = "table:"
    p = InStr(1, sWire, jStart)
    'there have been multiple versions of wire ...
    If p = 0 Then
        'try the other one
        jStart = q & ("table") & q & ":"
        p = InStr(1, sWire, jStart)
    End If


    ' take a google wire string and apply it to a range
    p = InStr(1, sWire, jStart)
    e = Len(sWire) - 1

    If p <= 0 Or e <= 0 Or p > e Then
        MsgBox " did not find table definition data"
        Exit Function
    End If
    
    If Mid(sWire, e, 2) <> ");" Then
        MsgBox ("incomplete google wire message")
        Exit Function
    End If
    ' encode the 'table:' part to a cjobject
    p = p + Len(jStart)
    s = "{" & jStart & "[" & Mid(sWire, p, e - p - 1) & "]}"
    ' google protocol doesnt have quotes round the key of key value pairs,
    ' and i also need to convert date from javascript syntax new Date()
    s = rxReplace("(new\sDate)(\()(\d+)(,)(\d+)(,)(\d+)(\))", s, "'$3/$5/$7'")
    s = rxReplace("(\w+)(:)", s, "'$1':")
    ' this should return an object as follow
    ' {table:[ cols:[c:[{id:x,label:x,pattern:x,type:x}] , rows:[ c:[(v:x,f:x}] ]}
    Set jo = New cJobject
    Set jo = jo.deSerialize(s, eDeserializeGoogleWire)
    'need to convert that to cdataset:[{label:"x",,,},{},,,]
    'column labels can be extracted then from jo.child("1.cols.n.label")  .. where 'n'= column number

    Set joc = New cJobject
    Set cr = joc.init(Nothing, cJobName).AddArray
    For Each jr In jo.Child("1.rows").Children
        With cr.add
            For Each jc In jo.Child("1.cols").Children
                Set jt = jr.Child("c").Children(jc.ChildIndex)
                ' sometimes there is no "v" if a null value
                If Not jt.ChildExists("v") Is Nothing Then
                    Set jt = jt.Child("v")
                End If
                If jc.Child("type").toString = "date" Then
                    ' month starts at zero in javascript
                    astring = Split(jt.toString, "/")
                    If LBound(astring) <= UBound(astring) Then
                       v = DateSerial(CInt(astring(0)), CInt(astring(1)) + 1, CInt(astring(2)) + 1)
                    Else
                       v = Empty
                    End If
                Else
                    v = jt.Value
                End If
                .add jc.Child("label").toString, v
            Next jc
        End With
    Next jr
    Set populateGoogleWire = populateJSON(joc, rstart, wClearContents, stopAtFirstEmptyRow)
    
End Function

Cleaning up the jSon

The Wire Protocol needs a bit of work to fix quoting of key names and regularization of date format, but it's going to be a piece of cake with the regEx capability. 

    ' google protocol doesnt have quotes round the key of key value pairs,
    ' and i also need to convert date from javascript syntax new Date()
    s = rxReplace("(new\sDate)(\()(\d+)(,)(\d+)(,)(\d+)(\))", s, "'$3/$5/$7'")
    s = rxReplace("(\w+)(:)", s, "'$1':")

Deserializing the cleaned up jSon

Now we can get to a cJobject we can work with.

    ' this should return an object as follow
    ' {table:[ cols:[c:[{id:x,label:x,pattern:x,type:x}] , rows:[ c:[(v:x,f:x}] ]}
    Set jo = New cJobject
    Set jo = jo.deSerialize(s, eDeserializeGoogleWire)

Crush down to 2 dimensional table

As you can see the wire protocol is a bit verbose, and will have been deserialized into a cJobject with too many layers, so our challenge here is now to move to something 2 dimensional that we can represent in Excel. 


'need to convert that to cdataset:[{label:"x",,,},{},,,]
    'column labels can be extracted then from jo.child("1.cols.n.label")  .. where 'n'= column number

    Set joc = New cJobject
    Set cr = joc.init(Nothing, cJobName).AddArray
    For Each jr In jo.Child("1.rows").Children
        With cr.add
            For Each jc In jo.Child("1.cols").Children
                Set jt = jr.Child("c").Children(jc.ChildIndex)
                ' sometimes there is no "v" if a null value
                If Not jt.ChildExists("v") Is Nothing Then
                    Set jt = jt.Child("v")
                End If
                If jc.Child("type").toString = "date" Then
                    ' month starts at zero in javascript
                    astring = Split(jt.toString, "/")
                    If LBound(astring) <= UBound(astring) Then
                       v = DateSerial(CInt(astring(0)), CInt(astring(1)) + 1, CInt(astring(2)) + 1)
                    Else
                       v = Empty
                    End If
                Else
                    v = jt.Value
                End If
                .add jc.Child("label").toString, v
            Next jc
        End With
    Next jr

Push abstracted data to the target worksheet

Now that we have a cJobject in the correct structure, we can use the cDataSet.populateJSON (already covered here) method to push the data to the target worksheet

Set populateGoogleWire = populateJSON(joc, rstart, wClearContents)


When serialized this simplified jSon structure now would be

{"cDataSet":[{"Activate":"","Deactivate":"6/21/2007", "Description":"Cellphone","ID":"cell","Target":"smart", "Custom":"","Cost":"200","Callout":""},{"Activate":"1/7/2006","Deactivate":"3/22/2011", "Description":"Mac","ID":"mac","Target":"iPad","Custom":"", "Cost":"300","Callout":""},{"Activate":"", "Deactivate":"3/21/2012", "Description":"PC","ID":"pc","Target":"android","Custom":"", "Cost":"250","Callout":""},{"Activate":"1/2/2009","Deactivate":"11/7/2011", "Description":"Netbook","ID":"net","Target":"pc", "Custom":"dislike","Cost":"150","Callout":"this was a waste of money"}, {"Activate":"1/2/2006","Deactivate":"2/2/2010", "Description":"Windows Phone","ID":"smart","Target":"iphone", "Custom":"","Cost":"240","Callout":"Windows Mobile was so bad"},{"Activate":"9/3/2010","Deactivate":"", "Description":"Tablet","ID":"tablet","Target":"", "Custom":"family","Cost":"","Callout":""},{"Activate":"10/29/2003","Deactivate":"6/13/2008", "Description":"MP3 player","ID":"mp3", "Target":"ipod", "Custom":"","Cost":"20","Callout":""},{"A
ctivate":"10/30/2004", "Deactivate":"1/2/2009", "Description":"Personal Video","ID":"pv","Target":"ipod", "Custom":"","Cost":"25","Callout":""},{"Activate":"4/12/2008","Deactivate":"6/2/2011", "Description":"iPod","ID":"ipod", "Target":"ipad","Custom":"like","Cost":"25","Callout":""},{"Activate":"1/5/2009","Deactivate":"", "Description":"iPhone","ID":"iphone", "Target":"android","Custom":"like","Cost":"360", "Callout":""},{"Activate":"11/2/2010","Deactivate":"", "Description":"iPad","ID":"ipad", "Target":"tablet","Custom":"like","Cost":"480","Callout":""},{"Activate":"12/13/2009","Deactivate":"", "Description":"Android/Chrome","ID":"android", "Target":"","Custom":"","Cost":"400","Callout":""},{"Activate":"12/3/2005", "Deactivate":"12/12/2007", "Description":"windows tablet","ID":"wtab", "Target":"pc","Custom":"dislike", "Cost":"","Callout":"this really sucked"} ]
}


Comments