A recent social network analysis project meant I had to find a way to convert Google Form data for visualization in NodeXL. Normally I’d use D3 for this kind of thing (you’ll find plenty of d3 stuff on this site), but I needed a way to feed NodeXL instead. GraphML is a lightweight XML-based file format for graphs supported by a wide range of graph visualization and analysis tools, including NodeXL – so that’s what I selected to implement.
Markup languages
GraphML is an XML-based markup language, so it has a lot in common with HTML and other XML dialects. Since I often need to create that kind of markup from Apps Script/JavaScript, I thought it would be handy to generalize the approach so it could be used for any of them and, as usual, share it as an Apps Script library.
Let’s get right into some examples, using GraphML.
Getting started
Add a reference to the latest bmXml library at
1WlC-eOf-d3krlXxVSjn0XSp2tUrRhD2cqf7eqyJzO8l7mNGH2O082yX5
If you’ve used my libraries lately, you’ll know I favor an Exports namespace to organize and control access, so first create a script file called Exports like this.
Examples
Normally you’d be creating objects and/or JSON data to describe the data you want to convert as part of your app, but for the sake of illustration, we’ll just create some fake data in these examples.
Nodes
Let’s start by marking up some nodes. In the context of graph visualization, nodes (sometimes called vertices) are the items whose connections the graph will later show. So for example, in a Twitter visualization, each node might refer to a person.
The result isn’t a valid GraphML file yet, but we’ll keep adding stuff to it as we work through these tests.
<graphml xmlns="http://graphml.graphdrawing.org/xmlns">
  <node id="a"/>
  <node id="b"/>
  <node id="c"/>
</graphml>
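The descriptors behind markup like this can be built from plain data. Here is a minimal sketch, assuming each element is described by the { tag, attrs } shape covered in the Attrs section of this article. The names fakeIds and toNode are purely illustrative and are not part of bmXml, and the call that hands the result to the library is not shown.

```javascript
// Fake data for illustration: one id per node.
const fakeIds = ['a', 'b', 'c'];

// Hypothetical helper (not part of bmXml): describe one node element.
const toNode = (id) => ({ tag: 'node', attrs: { id } });

// One descriptor per id, in the same order as the input.
const nodes = fakeIds.map(toNode);
console.log(JSON.stringify(nodes[0])); // {"tag":"node","attrs":{"id":"a"}}
```

In a real app the ids would come from your own data (for example, form responses) rather than a hard-coded list.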
Attrs
Markup attributes (attrs) are the name/value pairs that appear within the opening tag. An element and its attrs are described with an object like this
{
  tag: 'node',
  attrs: {
    id: "a"
  }
}
renders as
<node id="a"/>
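To make that mapping concrete, here is a small standalone sketch of how a descriptor like this could be turned into a self-closing tag. It is not bmXml’s actual renderer: escapeAttr and renderTag are hypothetical names for illustration only, and the library may well handle escaping and formatting differently.

```javascript
// Hypothetical helper (not part of bmXml): escape characters that are
// not allowed inside a double-quoted XML attribute value.
const escapeAttr = (value) =>
  String(value)
    .replace(/&/g, '&amp;')
    .replace(/</g, '&lt;')
    .replace(/>/g, '&gt;')
    .replace(/"/g, '&quot;');

// Hypothetical helper: render a childless { tag, attrs } descriptor
// as a self-closing element, keeping attrs in insertion order.
const renderTag = ({ tag, attrs = {} }) => {
  const attrText = Object.entries(attrs)
    .map(([name, value]) => ` ${name}="${escapeAttr(value)}"`)
    .join('');
  return `<${tag}${attrText}/>`;
};

const rendered = renderTag({ tag: 'node', attrs: { id: 'a' } });
console.log(rendered); // <node id="a"/>
```

Escaping matters as soon as attr values come from user data, since a stray quote or ampersand would otherwise produce invalid XML.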
Edges
Let’s add some edges to the markup. An edge is a connection between two nodes, so its attrs must at least specify the nodes being connected by that edge.
The result isn’t a complete GraphML document, but here’s how the edges render
<graphml xmlns="http://graphml.graphdrawing.org/xmlns">
  <edge source="a" target="b"/>
  <edge source="a" target="c"/>
  <edge source="b" target="c"/>
</graphml>
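Edge descriptors can be generated the same way as nodes. The sketch below assumes the { tag, attrs } shape used throughout this article and starts from a fake list of [source, target] pairs. The names fakePairs, toEdge and knownIds are illustrative, not part of bmXml. It also includes an optional check that every edge points at a known node id, which is a cheap way to catch typos before rendering.

```javascript
// Fake connections for illustration: each pair is [source, target].
const fakePairs = [['a', 'b'], ['a', 'c'], ['b', 'c']];

// Hypothetical helper (not part of bmXml): describe one edge element.
const toEdge = ([source, target]) => ({ tag: 'edge', attrs: { source, target } });

const edges = fakePairs.map(toEdge);

// Optional sanity check: collect edges whose endpoints are not known node ids.
const knownIds = new Set(['a', 'b', 'c']);
const dangling = edges.filter(
  ({ attrs }) => !knownIds.has(attrs.source) || !knownIds.has(attrs.target)
);
console.log(dangling.length); // 0
```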
Graphs
A graph is defined minimally by the nodes and edges that it should visualize. Let’s create a graph using the nodes and edges already defined, like this.
Now we have a viable GraphML file.
<graphml xmlns="http://graphml.graphdrawing.org/xmlns">
  <graph id="G1" edgedefault="undirected">
    <node id="a"/>
    <node id="b"/>
    <node id="c"/>
    <edge source="a" target="b"/>
    <edge source="a" target="c"/>
    <edge source="b" target="c"/>
  </graph>
</graphml>
Children
An element can contain other elements. We can specify these elements using the children property. Children should be specified as an array of child elements or text. Since we’ve lumped nodes and edges together as children of graph, they’ll all appear at the same indentation level.
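To show how children lead to nesting and indentation, here is a rough recursive renderer. It is only a sketch of the general technique and not bmXml’s implementation: esc, escAttr, isText and renderTree are hypothetical names, and the two-space indent and inline rendering of text-only children are my own assumptions.

```javascript
// Hypothetical helpers (not part of bmXml).
const esc = (value) =>
  String(value).replace(/&/g, '&amp;').replace(/</g, '&lt;').replace(/>/g, '&gt;');
const escAttr = (value) => esc(value).replace(/"/g, '&quot;');
const isText = (child) => typeof child !== 'object' || child === null;

// Render a { tag, attrs, children } descriptor, indenting one level per depth.
const renderTree = (item, depth = 0) => {
  const pad = '  '.repeat(depth);
  const { tag, attrs = {}, children = [] } = item;
  const attrText = Object.entries(attrs)
    .map(([name, value]) => ` ${name}="${escAttr(value)}"`)
    .join('');
  // No children: self-closing tag.
  if (children.length === 0) return `${pad}<${tag}${attrText}/>`;
  // Text-only children stay inline so no whitespace is added to the value.
  if (children.every(isText)) {
    return `${pad}<${tag}${attrText}>${children.map(esc).join('')}</${tag}>`;
  }
  // Otherwise each child goes on its own line, one level deeper.
  const inner = children.map((child) =>
    isText(child) ? `${pad}  ${esc(child)}` : renderTree(child, depth + 1));
  return [`${pad}<${tag}${attrText}>`, ...inner, `${pad}</${tag}>`].join('\n');
};

const xml = renderTree({
  tag: 'graph',
  attrs: { id: 'G1', edgedefault: 'undirected' },
  children: [
    { tag: 'node', attrs: { id: 'a' } },
    { tag: 'node', attrs: { id: 'b' } },
    { tag: 'edge', attrs: { source: 'a', target: 'b' } }
  ]
});
console.log(xml);
```

Because the nodes and the edge sit in the same children array, they all receive the same depth and therefore the same indentation.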
Mandatory attrs
In GraphML, a graph element must always have an edgedefault attr – in this case ‘undirected’. A directed edge shows the direction of the connection between nodes (for example, a ‘follows’ b), whereas an undirected edge has no directional implication. The edgedefault is the value to apply when there is no specific instruction on an edge element.
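Since a missing edgedefault is an easy mistake to make, a small check before rendering can help. The sketch below is my own illustration and not something bmXml provides: checkGraph and isDirected are hypothetical names. It relies on two GraphML rules: edgedefault must be either directed or undirected, and an individual edge can override the default with its own directed attr.

```javascript
// The two values GraphML allows for edgedefault.
const EDGE_DEFAULTS = ['directed', 'undirected'];

// Hypothetical helper: fail early if a graph descriptor lacks a valid edgedefault.
const checkGraph = (graph) => {
  const attrs = graph.attrs || {};
  if (!EDGE_DEFAULTS.includes(attrs.edgedefault)) {
    throw new Error(
      `graph ${attrs.id}: edgedefault must be one of ${EDGE_DEFAULTS.join(', ')}`
    );
  }
  return graph;
};

// Hypothetical helper: an edge's own 'directed' attr wins,
// otherwise the graph's edgedefault applies.
const isDirected = (edge, graph) => {
  const own = (edge.attrs || {}).directed;
  return own === undefined
    ? graph.attrs.edgedefault === 'directed'
    : String(own) === 'true';
};

const graph = checkGraph({ tag: 'graph', attrs: { id: 'G1', edgedefault: 'undirected' } });
const plainEdge = { tag: 'edge', attrs: { source: 'a', target: 'b' } };
const followsEdge = { tag: 'edge', attrs: { source: 'a', target: 'c', directed: 'true' } };
console.log(isDirected(plainEdge, graph), isDirected(followsEdge, graph)); // false true
```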
Keys
Another type of element in GraphML is a key. A key element is a way to associate additional data with nodes (vertices) and edges. Keys are children of the top-level graphml element rather than of a graph, because a document could contain multiple graphs, and a key could apply to nodes or edges from more than one graph. You’d generally use keys to add formatting information to nodes or edges.
Here’s the render now, with those keys added.
<graphml xmlns="http://graphml.graphdrawing.org/xmlns">
  <key id="V-Color" for="node" attr.name="Color" attr.type="string"/>
  <key id="V-Shape" for="node" attr.name="Shape" attr.type="string"/>
  <key id="E-Color" for="edge" attr.name="Color" attr.type="string"/>
  <graph id="G1" edgedefault="undirected">
    <node id="a"/>
    <node id="b"/>
    <node id="c"/>
    <edge source="a" target="b"/>
    <edge source="a" target="c"/>
    <edge source="b" target="c"/>
  </graph>
</graphml>
The ‘for’ attribute indicates whether the key applies to edges or nodes. Keys are referenced by adding children with ‘data’ tags and a key attribute that matches the id of the key.
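Here is one way the key and data descriptors could be put together, again as a sketch using the { tag, attrs, children } shape from this article. The makeKey and makeData helpers are hypothetical names of my own, not bmXml functions. Note that attr names containing a dot, such as attr.name, need to be quoted when used as JavaScript property names.

```javascript
// Hypothetical helper (not part of bmXml): describe a key element.
// 'appliesTo' becomes the 'for' attr, typically 'node' or 'edge'.
const makeKey = (id, appliesTo, name, type = 'string') => ({
  tag: 'key',
  attrs: { id, for: appliesTo, 'attr.name': name, 'attr.type': type }
});

// Hypothetical helper: describe a data child that refers to a key by id.
const makeData = (key, value) => ({
  tag: 'data',
  attrs: { key },
  children: [String(value)]
});

const shapeKey = makeKey('V-Shape', 'node', 'Shape');

// Reusing the key's own id avoids typos between the key and its data tags.
const nodeA = {
  tag: 'node',
  attrs: { id: 'a' },
  children: [makeData(shapeKey.attrs.id, 'Sphere')]
};
console.log(nodeA.children[0].attrs.key); // V-Shape
```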
Using Keys with Data tag
Let’s assign those keys to the nodes and edges.
Now the GraphML spec is complete, and is rendered like this
<graphml xmlns="http://graphml.graphdrawing.org/xmlns">
  <key id="V-Color" for="node" attr.name="Color" attr.type="string"/>
  <key id="V-Shape" for="node" attr.name="Shape" attr.type="string"/>
  <key id="E-Color" for="edge" attr.name="Color" attr.type="string"/>
  <graph id="G1" edgedefault="undirected">
    <node id="a">
      <data key="V-Shape">Sphere</data>
    </node>
    <node id="b">
      <data key="V-Shape">Sphere</data>
    </node>
    <node id="c">
      <data key="V-Shape">Sphere</data>
    </node>
    <edge source="a" target="b">
      <data key="E-Color">red</data>
    </edge>
    <edge source="a" target="c">
      <data key="E-Color">red</data>
    </edge>
    <edge source="b" target="c">
      <data key="E-Color">red</data>
    </edge>
  </graph>
</graphml>
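Because data tags refer to keys only by id, it is easy to point one at a key that is undeclared or that was declared for a different kind of element. The sketch below walks a descriptor tree and reports such mismatches. It is my own illustration, not a bmXml feature: findKeyProblems is a hypothetical name, and treating a key with no for attr as applying to everything is my reading of the GraphML default.

```javascript
// Hypothetical helper (not part of bmXml): list data tags whose key is
// undeclared or declared for a different element type.
const findKeyProblems = (graphml) => {
  // Map each declared key id to the element type it applies to.
  const keys = new Map(
    (graphml.children || [])
      .filter((child) => child && child.tag === 'key')
      .map((key) => [key.attrs.id, key.attrs.for || 'all'])
  );
  const problems = [];
  const visit = (element) => {
    (element.children || []).forEach((child) => {
      if (typeof child !== 'object' || child === null) return; // skip text
      if (child.tag !== 'data') return visit(child);
      const id = child.attrs.key;
      if (!keys.has(id)) {
        problems.push(`undeclared key ${id} on ${element.tag}`);
      } else if (keys.get(id) !== 'all' && keys.get(id) !== element.tag) {
        problems.push(`key ${id} is for ${keys.get(id)} but used on ${element.tag}`);
      }
    });
  };
  visit(graphml);
  return problems;
};

// Fake document for illustration: the edge wrongly uses a node key.
const doc = {
  tag: 'graphml',
  children: [
    { tag: 'key', attrs: { id: 'V-Color', for: 'node' } },
    { tag: 'key', attrs: { id: 'E-Color', for: 'edge' } },
    { tag: 'graph', attrs: { id: 'G1', edgedefault: 'undirected' }, children: [
      { tag: 'edge', attrs: { source: 'a', target: 'b' }, children: [
        { tag: 'data', attrs: { key: 'V-Color' }, children: ['red'] }
      ] }
    ] }
  ]
};
console.log(findKeyProblems(doc)); // [ 'key V-Color is for node but used on edge' ]
```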
Next
In the next posts on this topic I’ll show you how to create your own renderer so you can process any kind of XML-style markup – for example HTML (see Markup HTML from JSON with Apps Script and JavaScript), and also implement a real-life graph example from Sheets data.
Links
bmXml library – 1WlC-eOf-d3krlXxVSjn0XSp2tUrRhD2cqf7eqyJzO8l7mNGH2O082yX5 (or github)
testBmXml – the examples in this article (or github)