Each night I used to run a scheduled task that took the Google Analytics data for this site (back when it was hosted on Google Sites) and matched it up to the site pages. At the time of writing I had 1000 different pages in Analytics to match against over 500 pages on my site.

You can take a copy of the code here

Avoiding the 6-minute limit

As usual with this much content, I had to worry about the 6-minute execution time limit. There were a number of approaches I could have taken to cut the work into chunks, such as Parallel processing in Apps Script, but this process is a recursive one and it is complicated to split up. The thing that took most time was going off to get G+ counts for each page, and for the domain name variants of that page (for example, sites.google.com/a/mcpher.com/abc is actually the same page as ramblings.mcpher.com/abc and www.mcpher.com/abc, but G+ sometimes, though strangely not always, treats them as different pages).
So that's 2000 URL fetches right there. Allow 0.3 seconds for each and that's 600 seconds of fetching alone, so I'd already blown the limit, and that's before accounting for having to back off whenever I hit the UrlFetch per-second quota.
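When the quota does bite, the standard remedy is exponential backoff: wait a little, retry, and double the wait each time. A minimal sketch of that pattern (the retry count and wait times are illustrative, not the ones the real process used) might look like this.

// Retry a fetch with exponentially increasing waits between attempts.
// The retry count and base delay are illustrative assumptions.
function fetchWithBackoff(url) {
  var maxTries = 5;
  for (var attempt = 0; attempt < maxTries; attempt++) {
    try {
      return UrlFetchApp.fetch(url);
    } catch (err) {
      // wait 1s, 2s, 4s, 8s ... before trying again
      Utilities.sleep(Math.pow(2, attempt) * 1000);
    }
  }
  throw new Error('Still failing after ' + maxTries + ' attempts: ' + url);
}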
Luckily, caching came to the rescue. I used the method from Extract the plus one counts from a page to get the G+ data, but I first ran one section of the overall process a few times, knowing that it would probably run out of execution time. Each of those runs filled up the cache with results (about 10 times as fast), so that when I came to run the real process, all the G+ results were already in cache and everything could finish comfortably within the quotas.
Here’s the snippet that I ran a few times
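A minimal sketch of that cache-priming idea, assuming a hypothetical getPageUrls() for the list of pages and their domain variants and a hypothetical getPlusOneCount() standing in for the plus-one extraction described above, might look like this.

// Run repeatedly until the cache holds a G+ count for every URL.
// getPageUrls() and getPlusOneCount() are hypothetical stand-ins.
function primeGplusCache() {
  var cache = CacheService.getScriptCache();
  getPageUrls().forEach(function (url) {
    var key = 'gplus_' + url;
    // only fetch counts we don't already have, so each run
    // fills in a little more until the whole list is cached
    if (!cache.get(key)) {
      var count = getPlusOneCount(url);
      cache.put(key, String(count), 21600); // keep for 6 hours (the maximum)
    }
  });
}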
So my scheduled triggers looked like this
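As a rough illustration, a schedule along those lines could also be set up programmatically with ScriptApp; the function names and timings below are assumptions, not the real settings.

// Illustrative trigger setup - names and timings are assumptions.
function createTriggers() {
  // prime the cache regularly so the G+ counts are already there
  ScriptApp.newTrigger('primeGplusCache')
    .timeBased()
    .everyHours(1)
    .create();
  // the full statistics run happens once a day, after the cache is warm
  ScriptApp.newTrigger('populateStatistics')
    .timeBased()
    .everyDays(1)
    .atHour(4)
    .create();
}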

Here is what ran less regularly and populated the database with all the statistics
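A sketch of the shape of that run, with hypothetical getAnalyticsRows(), getSitePages() and saveStatistics() helpers standing in for the real analytics query, page list and database layer, might look something like this.

// Match analytics rows to site pages, pick up cached G+ counts,
// and hand the results to the database layer.
// getAnalyticsRows(), getSitePages() and saveStatistics() are hypothetical.
function populateStatistics() {
  var cache = CacheService.getScriptCache();
  var analyticsRows = getAnalyticsRows(); // e.g. {pagePath: '/abc', pageviews: 42}
  var stats = getSitePages().map(function (page) {
    // total the pageviews of every analytics row that maps to this page
    var pageViews = analyticsRows
      .filter(function (row) { return row.pagePath === page.path; })
      .reduce(function (sum, row) { return sum + row.pageviews; }, 0);
    // the G+ counts should already be in cache from the priming runs
    var plusOnes = Number(cache.get('gplus_' + page.url) || 0);
    return { path: page.path, pageViews: pageViews, plusOnes: plusOnes };
  });
  saveStatistics(stats);
}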


See more like this in Displaying analytics data on site pages

For help and more information, follow the blog, join our community, or follow me on Twitter