Hadley Wickham has added some data management capabilities to R which are pretty useful for getting data ready for tabulation or visualization. One of these is melt(). Melt can do lots of things, but one that caught my eye in particular was the ability to turn a tabular data representation in to a transactional one – a kind of unpivot. In other words to turn this
into this
Tony Hirst pointed this out in one of his posts about R, and it got me thinking that mashing up something like this (a kind of unpivot table) in both VBA and Google Apps Script would be pretty straightforward, since I have all the components already.
Usage
So lets start with how to use it. At a minimum we need to say something like this.
Public Sub testMelt() reshapeMelt "{'outputSheet':'meltOut','id':['id','time']}" End Sub
This looks like a strange combination of javaScript and VBA – and I guess it is. I’m trying to both minimize the work to make a google apps script version of everything so using jSon for specifying options is a good way to average things out. If you need to recap, I cover jSon arguments in Excel in this blog post.
reshapeMelt actually returns a cDataSet of the output data, so it can be used directly to chain to other procedures that know how to process these. I’ll show and example of this in a later post.
Options
All options are specified in one jSon string (or as close to one as I can get within the confines of VBA syntax). The full range of options are as below. I’ll no doubt be adding to this as a template for all default options for this topic.
{'complain':true, 'inputSheet':'meltOut', 'variableColumn' : 'variable', 'valueColumn' : 'value', 'id':['id'] , 'outputSheet': 'rOutputData' , 'clearContents':true}
Walkthough
The arguments work like jQuery $.extend()
in the sense that any arguments you do specify simply override the defaults. In this case “{'outputSheet':'meltOut','id':['id','time']}
” xx outputSheet specifies where to put the result, and the ID area specifies the columns that need to be combined to produce a unique key for each generated transaction – just like in the rshape addon – melt(mydata, id=c("id","time"))
The code
Most of this is just patching together existing capabilities – mainly cJobject and cDataSet. I had to add a few bits and pieces to cJobject but nothing major. I’ll post the Google Apps version shortly.
Public Function reshapeMelt(options As String) As cDataSet ' this is a very basic start at vba implementation of Hadley Wickhams melt(R) 'http://www.statmethods.net/management/reshape.html Dim jArgs As cJobject, ds As cDataSet, cj As cJobject, _ r As Range, ws As Worksheet, dr As cDataRow, dsOut As cDataSet, _ dc As cCell, dsre As cDataSet ' sort out the options Set jArgs = optionsExtend(options, rOptionDefaults) ' check for argument programming syntax error Debug.Assert Not jArgs Is Nothing With jArgs If .toString("inputsheet") = .toString("outputsheet") Then MsgBox ("Reading and writing to the same sheet - not allowed") Exit Function End If End With ' read input sheet Set ds = New cDataSet If ds.populateData _ (wholeSheet(jArgs.toString("inputsheet")), , , , , , True) Is Nothing Then Exit Function End If ' check we have everything we need With jArgs For Each cj In .child("id").children If Not ds.headingRow.validate(.cValue("complain"), cj.toString) Then Exit Function End If Next cj ' check if output sheet exists? Set ws = sheetExists(.toString("outputSheet"), .cValue("complain")) If ws Is Nothing Then Exit Function End If ' good to go Set r = ws.Cells(1, 1) If .cValue("clearContents") Then ws.Cells.ClearContents End If ' make headings For Each cj In .child("id").children r.value = cj.value Set r = r.Offset(, 1) Next cj r.value = .toString("variableColumn") r.Offset(, 1).value = .toString("valueColumn") ' lets get that in a dataset for abstracted column access Set dsOut = New cDataSet dsOut.populateData ws.Cells.Resize(1, r.column + 1) ' now data Set r = dsOut.headingRow.Where.Offset(1).Resize(1, 1) For Each dr In ds.rows For Each dc In dr.columns ' need to generate a new row for each non ID cell If .child("id").valueIndex _ (ds.headings(dc.column).toString) = 0 Then ' the id fields For Each cj In .child("id").children r.Offset(, dsOut.headingRow.exists(cj.toString).column - 1).value = dr.value(cj.toString) Next cj ' this variable value r.Offset(, _ dsOut.headingRow.exists(.toString("valueColumn")).column - 1).value _ = dc.value ' and its name r.Offset(, _ dsOut.headingRow.exists(.toString("variableColumn")).column - 1).value _ = ds.headings(dc.column).value Set r = r.Offset(1) End If Next dc Next dr End With ' send back what we just did Set dsre = New cDataSet Set reshapeMelt = dsre.populateData(dsOut.headingRow.Where.Resize(r.row - 1)) End Function
Setting up the default options for re-use
Public Function rOptionDefaults() As String ' this sets up the defaults for all R related stuff rOptionDefaults = _ "{'complain':true, 'inputSheet':'" & ActiveSheet.name & "'," & _ "'variableColumn' : 'variable', 'valueColumn' : 'value', 'id':['id'] ," & _ "'outputSheet': 'rOutputData' , 'clearContents':true}" End Function
Modify jSon args to go as many levels deep as required for optional arguments
Public Function optionsExtend(givenOptions As String, _ Optional defaultOptions As String = vbNullString) As cJobject Dim jGiven As cJobject, jDefault As cJobject, _ jExtended As cJobject, cj As cJobject ' this works like $.extend in jQuery. ' given and default options arrive as a json string ' example - ' optionsExtend ("{'width':90,'color':'blue'}", "{'width':20,'height':30,'color':'red'}") ' would return a cJobject which serializes to ' "{width:90,height:30,color:blue}" Set jGiven = jSonArgs(givenOptions) Set jDefault = jSonArgs(defaultOptions) ' now we combine them If Not jDefault Is Nothing Then Set jExtended = jDefault Else Set jExtended = New cJobject jExtended.init Nothing End If ' now we merge that with whatever was given If Not jGiven Is Nothing Then jExtended.merge jGiven End If ' and its over Set optionsExtend = jExtended
A few more methods for cJobject to be able to tweak existing cJObjects
Public Function merge(mergeThisIntoMe As cJobject) As cJobject ' merge this cjobject with another ' items in merged with are replaced with items in Me Dim cj As cJobject, p As cJobject Set p = Me.find(mergeThisIntoMe.fullKey(False)) If p Is Nothing Then ' i dont have it yet Set p = Me.append(mergeThisIntoMe) Else ' actually i do have it already If p.isArrayRoot Then ' but its an array - i need to get rid of it Set p = p.remove Set p = p.append(mergeThisIntoMe) Else p.value = mergeThisIntoMe.value End If End If ' now the other childreb tio merge in For Each cj In mergeThisIntoMe.children p.merge cj Next cj Set merge = Me End Function Public Function remove() As cJobject ' removes a branch Dim cj As cJobject, p As cJobject, i As Long Debug.Assert Not parent Is Nothing Debug.Assert parent.hasChildren parent.children.remove childIndex ' fix the childindices i = 0 For Each cj In parent.children i = i + 1 cj.childIndex = i Next cj Set remove = parent End Function Public Function append(appendThisToMe As cJobject) As cJobject ' append another object to me Dim cj As cJobject, p As cJobject If Not appendThisToMe.parent.isArrayRoot Then Set p = Me.add(appendThisToMe.key, appendThisToMe.value) Else Set p = Me.add(, appendThisToMe.value) End If If appendThisToMe.isArrayRoot Then p.addArray For Each cj In appendThisToMe.children p.append cj Next cj Set append = Me End Function
And that’s it – happy melting. You can download all this at Downloads in the cDataSet.xlsm.
For more fooling around with jSon in VBA, take a look at jSon
Continue reading about Rest to Excel Library here