Some APIs allow batching of requests, but impose a maximum on how many items you can send at once. When the calls are asynchronous, chunking a request into smaller pieces and stitching the results back together afterwards can be a little tricky.

An example of an API with this kind of behavior is the Knowledge Graph Search API. You supply a list of mids (machine-generated identifiers), and it returns Knowledge Graph matches for the ones it knows about. It allows some maximum number of ids (I’m not sure exactly how many) in one batch.

Splitting the list into chunks means we don’t go over the limit, and it also allows multiple chunks to be processed in parallel.
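The general pattern looks something like this (a minimal sketch; callApi and limit are placeholders for whatever your API call and its batch limit happen to be):

// split a list into chunks of at most limit items,
// fire off one request per chunk, then flatten the results back together
const batched = (items, limit, callApi) =>
  Promise.all(
    items
      .map((_, i, a) => (i % limit ? null : a.slice(i, i + limit)))
      .filter(f => f)
      .map(chunk => callApi(chunk))
  ).then(results => [].concat.apply([], results));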

Initializing

To get started, let’s write a small wrapper for the Node kgsearch API, and install the googleapis npm package. You can generate an API key in your cloud project (or use a service account). For simplicity I’m using an API key, so you’ll need to get your own and substitute the relevant value. The mode parameter is just how I manage different keys for production and development, so you can ignore it if you don’t do anything like that.

module.exports = ((ns) => {

  const {google} = require('googleapis');

  const secrets = require('../private/secrets');
  let kgClient = null;

  /**
   * initialize the knowledge graph client
   * supply credentials via the mode
   */
  ns.init = ({ mode }) => {
    // reuse the client if it's already been created
    kgClient = kgClient || new google.kgsearch({
      version: 'v1',
      auth: secrets.storage[mode].apiKey,
    });
    return ns;
  };

  // search the knowledge graph for a list of mids
  ns.search = ({ mids }) => {
    return mids && mids.length
      ? kgClient.entities.search({
          ids: mids,
        })
      : Promise.resolve(null);
  };
  return ns;
})({});
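For reference, all the secrets module needs to expose is an API key per mode. Exactly how you organize it is up to you, but something like this would satisfy the code above (the key values are placeholders):

// ../private/secrets.js - keep this file out of source control
module.exports = {
  storage: {
    dev: { apiKey: 'your-dev-api-key' },
    prod: { apiKey: 'your-prod-api-key' },
  },
};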

Using

In its simplest form, you can now do this:

const kgs = require('./kgs').init({mode: 'dev'});
const test = async () => {
  // make up some data
  const mids = ['/m/01bw9x', '/m/0ch6mp2'];
  const result = await kgs.search({mids});
  // the matches come back in data.itemListElement
  console.log(JSON.stringify(result.data.itemListElement));
};
test();
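For reference, each element of result.data.itemListElement has roughly this shape (abbreviated to just the fields used later in this post; see the Knowledge Graph Search API docs for the full response):

// one entry per matched entity
{
  "@type": "EntitySearchResult",
  "result": {
    "@id": "kg:/m/01bw9x",
    "name": "...",
    "@type": ["..."],
    "url": "...",
    "image": { "url": "...", "contentUrl": "..." },
    "detailedDescription": { "url": "...", "articleBody": "...", "license": "..." }
  }
}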

However, let’s make it handle asynchronous chunking – which is the main purpose of this post.

GraphQL

This is actually a resolver for one of my GraphQL server APIs. You can query on loads of mids, and get back the Knowledge Graph result(s) via the API. The ids come through the GraphQL resolver arguments as an array of strings in rgs.params.ids, so you’ll need to modify the arguments to suit however you send over the ids.

ns.kgraphs = rgs => {
  // have to chunk this up into max
  const { params } = rgs;
  const { ids } = params || {};
  const max = 20;
  // slice into chunks of max
  const slices = (ids || [])
    .map((_, i, a) => i % max ? null : a.slice(i, i + max))
    .filter(f => f);

  // do them all at once
  return Promise.all(slices.map(f => ns.kgGets({ ids: f })))
    .then(results => [].concat.apply([], results));
};
Let’s break that down.

No more than 20 ids in any chunk:

const max = 20;

Make an array of slices of the original list, each of which is at most 20 items long. A simple way to do this is to create a slice only at indices that are multiples of 20, return null for every other index, and then filter the nulls out.

const slices = (ids || [])
  .map((_, i, a) => i % max ? null : a.slice(i, i + max))
  .filter(f => f);
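So, for example, 45 ids come out as slices of 20, 20 and 5 - only indices 0, 20 and 40 produce a slice, and every other index maps to null and gets filtered away:

const ids = Array.from({ length: 45 }, (_, i) => 'id' + i);
const slices = ids
  .map((_, i, a) => i % 20 ? null : a.slice(i, i + 20))
  .filter(f => f);
// slices.map(s => s.length) is [20, 20, 5]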

We’ll look at the kgGets function later, but it’s the one that interacts with the Knowledge Graph API. This sends off each of the requests simultaneously (I’ll cover how to throttle that, if necessary, in another post), then concatenates all the results into a single array, as if none of that chunking ever happened.

return Promise.all(slices.map(f => ns.kgGets({ ids: f })))
  .then(results => [].concat.apply([], results))
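As an aside, on newer Node versions (11 and up) you could use results.flat() instead of the concat.apply idiom to do the same flattening:

return Promise.all(slices.map(f => ns.kgGets({ ids: f })))
  .then(results => results.flat());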

Hitting the API

This is a wrapper for the search method I covered at the beginning of the article. You can use kgs.search directly instead of kgGets if you just want the vanilla results from the API, and stop here.

However, the Knowledge Graph API does return a load of stuff that I don’t need for my API, so I’m going to use the pluck-deep npm package to pull out just what I need. As an added complication, I use Facebook’s DataLoader to optimize GraphQL resolvers. This requires that the results are exactly the same length, and in the same order, as the original request (the Knowledge Graph only returns results for which it has some data), so there’s a little bit of fiddling required here to pad out the arrays again.

ns.kgGets = rgs => {
  const { ids } = rgs;
  // because the API fails on blank ids
  const goodIds = ids.filter(f => f);
  // pluckDeep comes from the pluck-deep npm package, required at the top of the module
  return kgs.search({ mids: goodIds })
    .then(result => {
      const r = pluckDeep(result, 'data.itemListElement.result') || [];
      // for the dataloader we need to return an array the same size
      // and order as the request, with null for any missing entries
      const obs = r.map(f => kgOrganize(f));
      const mapped = ids.map(f => obs.find(g => g && g.id === f) || null);
      return mapped;
    });
};
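For context, here’s roughly how that fits with DataLoader. The batch function given to the loader must resolve to an array the same length and order as its keys, which is exactly what kgGets now guarantees (a sketch - where exactly you create the loader will depend on your server setup):

const DataLoader = require('dataloader');

// one loader per request is the usual pattern;
// its batch function must preserve the length and order of the keys
const kgLoader = new DataLoader(ids => ns.kgGets({ ids }));

// then somewhere in a resolver
// const entity = await kgLoader.load('/m/01bw9x');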

Organizing the data

Again, this won’t be relevant if you just want the vanilla API response, but here’s the data I need for my response, plucked out.

const kgOrganize = (rob) => {
  const ob = rob
    ? {
        // the returned id is prefixed with 'kg:', so strip that off
        id: (pluckDeep(rob, '@id') || '').replace(/^kg:/, ''),
        wiki: pluckDeep(rob, 'detailedDescription.url'),
        articleBody: pluckDeep(rob, 'detailedDescription.articleBody'),
        license: pluckDeep(rob, 'detailedDescription.license'),
        types: pluckDeep(rob, '@type'),
        url: pluckDeep(rob, 'url'),
        imageUrl: pluckDeep(rob, 'image.url'),
        imageContentUrl: pluckDeep(rob, 'image.contentUrl'),
        name: rob.name,
      }
    : null;

  return ob;
};
Since G+ is closed, you can now star and follow post announcements and discussions on GitHub.