ScraperWiki API
I’ve been taking a look at ScraperWiki lately. In case you haven’t come across it, it’s a framework that lets you scrape structured data from web sites using various data manipulation tools and code. One of the great things about it is that many of the scraped data sets have been made public by their authors. Another is that ScraperWiki has an API, which means that all this data can be accessed by the REST library and loaded directly into either Google Apps Script or Excel. I’ll implement a general connector for ScraperWiki shortly that will get the data associated with a chosen ScraperWiki short_name, but first, here’s an API entry to get a table of which ScraperWiki definitions are already available.
In addition to the datasets that others have published, you can of course use ScraperWiki to create your own. This means that with ScraperWiki you can get data from any web site, even one that doesn’t actually have an API.
The ScraperWiki REST API is a single-query API, populating multiple rows in a spreadsheet from one query. You just name the columns to match whatever data you want to retrieve and go. Here are the results of a query for the first 1000 ScraperWiki entries. This example can be found in cDataSet.xlsm and downloaded from here
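To make the single-query idea concrete, here is a minimal plain-JavaScript sketch of what happens to a jsondict-style response: the API returns an array of objects, and each named spreadsheet column is matched against a key in every object to produce one row per entry. The sample data and field names (short_name, title, language) are illustrative assumptions, not the guaranteed ScraperWiki response schema.

```javascript
// Sketch: turn a jsondict-style response (an array of objects) into
// spreadsheet rows for a chosen set of column names.
// Unknown keys simply come back as blank cells.
function rowsFromJsondict(entries, columns) {
  // Header row first, then one row per entry
  var rows = [columns.slice()];
  entries.forEach(function (entry) {
    rows.push(columns.map(function (col) {
      return entry.hasOwnProperty(col) ? entry[col] : "";
    }));
  });
  return rows;
}

// Hypothetical response fragment, for demonstration only
var sample = [
  { short_name: "example_scraper", title: "Example", language: "python" }
];
var table = rowsFromJsondict(sample, ["short_name", "title", "language"]);
// table[0] is the header row; table[1] is the first data row
```

This is the essence of “name the columns and go”: the column names drive the lookup, so no per-dataset code is needed.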
Library entry
With .add("scraperWiki")
    .add "restType", erRestType.erSingleQuery
    .add "url", "https://api.scraperwiki.com/api/1.0/scraper/search?format=jsondict&maxrows="
    .add "results", ""
    .add "treeSearch", False
    .add "ignore", ""
End With
Public Sub testScraperWiki()
    generalQuery "scraperWiki", "scraperWiki", "1000"
End Sub
and for Google Apps Script
function testScraperWiki() {
  mcpher.generalQuery("scraperWiki", "scraperWiki", "1000");
}
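In both versions the third argument ("1000") completes the stored url, which deliberately ends at maxrows=. A small sketch of that step, assuming the library simply appends the query argument to the url entry:

```javascript
// Sketch of how the query parameter completes the stored URL entry.
// Assumes plain string concatenation, which is why the url in the
// library entry above ends at "maxrows=".
function buildQueryUrl(baseUrl, param) {
  return baseUrl + encodeURIComponent(param);
}

var base =
  "https://api.scraperwiki.com/api/1.0/scraper/search?format=jsondict&maxrows=";
var full = buildQueryUrl(base, "1000");
// full now ends with "maxrows=1000"
```

Keeping the fixed part of the URL in the library entry means the same entry can serve queries of any size; only the trailing argument changes per call.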
Next Step
In a future post, I’ll show how to get the data associated with a ScraperWiki scraper into Excel and Google Apps Script.
For more stuff like this, visit the ramblings site or the associated blog. If you have a suggestion for a particular API, contact me on our forum