I just saw a great post by Tony Hirst on his oUseful blog where he nicely laid out how to use Open refine plus various APIs to research which candidates were standing for election in the wards in which they live. This kind of mashup of various tools is something I cannot resist playing around with, so I ‘borrowed’ his scraper wiki data to show how to do the same thing in one shot in Excel. I gave myself a time limit of 2 hours to see how much I could get done, including writing it up.
This is pretty straightforward since I already have all the raw material I need from previous posts concerning the rest/Excel library. I’ve pushed it all in one module so the process is automated from start to finish,
First – the scraperwiki data.
Tony has already created a database which i’m going to pick up using a scraperwiki query – the same one I posted about on Excel Liberation some months ago. Here’s how to do that.
worksheetName = "questionElection" scraperName = "iw_poll_notices_scrape" ' get data from Tony's scraperwiki and populate sheet With scraperWikiStuff(scraperName, worksheetName) Set ds = New cDataSet ds.load worksheetName End With
Now we have an excel sheet with all that scraper wiki data loaded – it starts like this
Extract the postcodes and add a couple of extra columns
' add extra columns With lastCell(ds.headingRow.where) .Offset(, 1).value = "postcode" .Offset(, 2).value = "in ward" End With ds.tearDown ' repopulate with new columns Set ds = New cDataSet With ds.load(worksheetName) ' extract post code For Each dr In ds.rows a = Split(dr.toString("address"), ",") If arrayLength(a) > 0 Then dr.cell("postcode").value = Trim(CStr(a(UBound(a)))) End If Next dr .bigCommit .tearDown End With
See if the address is in the ward
mySociety API can take a postcode and return all sorts of government classifications, so we need to to create a restlibrary entry for that API. It looks like this
With .add("my society") .add "restType", erQueryPerRow .add "url", "http://mapit.mysociety.org/postcode/" .add "results", "areas" .add "treeSearch", True .add "ignore", vbNullString .add "alwaysEncode", True End With
Lookup mySociety and extract ward name for given postcode
Now we’ve defined the API, we can just reference it and let it know which column contains the post code to analyze. The data returned contains the name in an area object of type “UTE”, so it’s just a question of looking at the UTE entry returned by the API for the postcode and comparing it to the ward
' use mysociety api to get ward info ' these options will not bother trying to populate With restQuery(worksheetName, "my society", , "postcode", _ , , , False, False) ' check for jobjects of type UTE n = 0 For Each job In .jObjects n = n + 1 inWard = "out" If Not job.childExists("areas") Is Nothing Then For Each joc In job.child("areas").children If Not joc.childExists("type") Is Nothing Then If joc.child("type").value = "UTE" Then ' we have the right type, check name matches If makeKey(joc.child("name").value) = _ makeKey(.dSet.value(n, "ward")) Then inWard = "in" Exit For End If End If End If Next joc ' mark whether its in our out .dSet.cell(n, "in ward").value = inWard End If Next job .dSet.bigCommit .tearDown End With
Results
Next steps
Complete code
Public Sub ousefulMashup() ' thanks to tony hirst for the data and method. ' http://blog.ouseful.info/2013/05/05/questioning-election-data-to-see-if-it-has-a-story-to-tell/ Dim ds As cDataSet, dr As cDataRow, a As Variant, _ worksheetName As String, scraperName As String, _ job As cJobject, joc As cJobject, inWard As String, _ n As Long worksheetName = "questionElection" scraperName = "iw_poll_notices_scrape" ' get data from Tony's scraperwiki and populate sheet With scraperWikiStuff(scraperName, worksheetName) Set ds = New cDataSet ds.load worksheetName End With ' add extra columns With lastCell(ds.headingRow.where) .Offset(, 1).value = "postcode" .Offset(, 2).value = "in ward" End With ds.tearDown ' repopulate with new columns Set ds = New cDataSet With ds.load(worksheetName) ' extract post code For Each dr In ds.rows a = Split(dr.toString("address"), ",") If arrayLength(a) > 0 Then dr.cell("postcode").value = Trim(CStr(a(UBound(a)))) End If Next dr .bigCommit .tearDown End With ' use mysociety api to get ward info ' these options will not bother trying to populate With restQuery(worksheetName, "my society", , "postcode", _ , , , False, False) ' check for jobjects of type UTE n = 0 For Each job In .jObjects n = n + 1 inWard = "out" If Not job.childExists("areas") Is Nothing Then For Each joc In job.child("areas").children If Not joc.childExists("type") Is Nothing Then If joc.child("type").value = "UTE" Then ' we have the right type, check name matches If makeKey(joc.child("name").value) = _ makeKey(.dSet.value(n, "ward")) Then inWard = "in" Exit For End If End If End If Next joc ' mark whether its in our out .dSet.cell(n, "in ward").value = inWard End If Next job .dSet.bigCommit .tearDown End With End Sub