How to create a tag cloud

What can you learn here ?
  • Make a tag cloud
  • cJobject as 'on demand'
  • the cTagCloud class

Making a tag Cloud in Excel get it now

Although most of the discussion about How to use cJobject is focused on being able to make and read jSon in Excel, cJobject can also be used as an 'on demand' object with properties, heirarchy and structure. Here is an example of how to implement a tag cloud class and how cJobject helps this.

What to download

All the examples contain all the classes needed for them to work and all projects can be found here. You can also create A tagCloud in Outlook

What is a tag cloud

This is a way of visualizing the most common terms in a set of data, simply by changing the font size depending on how often a particular term appears. 

Here is an example of a tag cloud based on a sample of 100 tweets on the topic of "Mitt Romney" made by the Rest to Excel library, and created by the cTagCloud class.


cTagcloud also will do a colorful version as below taken at a later time, using the heatmap scale referenced in  Create a heatmap in Excel


It's pretty easy to see from this layout what the main topic is, especially if you remove the noise - which the cTagCloud class will do for you too. Here is the code for creating the above using the cTagCloud class, Data Manipulation Classes and How to use cJobject. Assuming you have the cDataSet.xlsm workbook downloaded, that's all there is to it. 

Public Sub testTag()
    Dim ds As cDataSet, dr As cDataRow, tg As cTagCloud
    Dim r As Range, jo As cJobject
    Set ds = New cDataSet
    Set tg = New cTagCloud
    tg.init , , 3, " "
    With ds
        .populateData wholeSheet("tweetsentimentdetails"), , , , , , True
        For Each dr In .Rows
            tg.collect dr.Cell("text").toString
        Next dr
    End With
    tg.results Sheets("tagout").Range("a1")

End Sub



Walkthrough

Initialize the tagCloud - minimum number of mentions is 3 to be included at all, and terms are separated by a space.
 tg.init , , 3, " "

Here are the defaults for the .init method
Public Function init(Optional sName As String = "tagcloud", _
            Optional bDump As Boolean = True, _
            Optional lMinCountToShow As Long = 1, _
            Optional sep As String = ",", _
            Optional sNoiseString As String = vbNullString, _
            Optional bColorful As Boolean = True, _
            Optional dSmallest As Double = 8, _
            Optional dBiggest As Double = 40) As cTagCloud

Get the input dataset from the tweetsentiment details sheet, and collect all the data in the "text" column for tag cloud processing.
    With ds
        .populateData wholeSheet("tweetsentimentdetails"), , , , , , True
        For Each dr In .Rows
            tg.collect dr.Cell("text").toString
        Next dr
    End With

Output the results to cell A1 on the tagout worksheet and adjust the relative font sizes, taking the default largest and smallest font to allow.
tg.results Sheets("tagout").Range("a1")

Here is the call for  for the .results method
Public Function results(rOut As Range) As cTagCloud

Using cJobject as an 'on demand' class

Inside the tagcloud class, cJobject is used to store properties about each tag, with these properties.
?jo.Serialize (True)
  {
   "mitt":{
      "count":"80",
      "scale":"1",
      "size":"40",
      "show":"True",
      "color":"255"
   }
 }

Aside from not needing to bother creating another class, or use arrays, or multiple collections or custom types, being able to directly serialize the data for each item makes debugging really simple!

The cTagCloud class code

You can find this and the example in the cDataset.xlsm workbook.
Option Explicit
Private pJob As cJobject
Private pDumpNoise As Boolean
Private pMinCountToShow As Long
Private pSep As String
Private pNoise As Collection
Private pNoiseString As String
Private pColorful As Boolean
Private pBiggest As Double
Private pSmallest As Double

Public Property Get tagJob() As cJobject
    Set tagJob = pJob
End Property
Public Function init(Optional sName As String = "tagcloud", _
            Optional bDump As Boolean = True, _
            Optional lMinCountToShow As Long = 1, _
            Optional sep As String = ",", _
            Optional sNoiseString As String = vbNullString, _
            Optional bColorful As Boolean = True, _
            Optional dSmallest As Double = 8, _
            Optional dBiggest As Double = 40) As cTagCloud
    Dim a As Variant, i As Long, k As String
    pBiggest = dBiggest
    pSmallest = dSmallest
    Set pJob = New cJobject
    With pJob
        .init Nothing, sName
    End With
    pColorful = bColorful
    pDumpNoise = bDump
    pMinCountToShow = lMinCountToShow
    pSep = sep
    If pDumpNoise Then
        If sNoiseString = vbNullString Then
            pNoiseString = _
                "and" & "," & "the" & "," & "a" & "," & "of" & "," & _
                "be" & "," & "is" & "," & "for" & "," & "on" & "," & _
                "to" & "," & "in" & "," & "i" & "," & "where" & "," & _
                "when" & "," & "this" & "," & "that" & "," & "can" & _
                "," & "how" & "," & "with" & "," & "so" & "," & "it" & _
                "," & "got" & "," & "get" & "," & _
                "so" & "," & "my" & "," & "me" & "," & "if" & "," & "had" _
                & "," & "no" & "," & "or" & "," & "im" & "," & "do" & "," _
                & "did" & "," & "has" & "," & "have" & "," & _
                "will" & "," & "her" & "," & "him" & "," & "his" & "," _
                & "its" & "," & "now" & "," & "then" & "," & "by" & "," & "at" _
                & "," & "an" & "," & "not" & "," & _
                "but" & "," & "are" & "," & "us" & "," & "on" & "," & _
                "a" & "," & "so" & "," & "was" & "," & "an" & "," & "we" & "," & _
                "you" & "," & "as"
        Else
            pNoiseString = sNoiseString
        End If
        Set pNoise = New Collection
        a = Split(pNoiseString, ",")
        For i = LBound(a) To UBound(a)
            k = LCase(Trim(CStr(a(i))))
            If Not isNoise(k) Then pNoise.add k, k
        Next i
    End If
    Set init = Me
End Function
Public Function increment(sTerm As String, Optional amount As Long = 1) As cJobject
    Dim jo As cJobject
    With tagJob
        Set jo = .ChildExists(sTerm)
        If jo Is Nothing Then
            Set jo = .add(sTerm)
            With jo
                .add "count", 0
                .add "scale", 0.1
                .add "size", 0.1
                .add "show", False
                .add "color", vbBlack
            End With
        End If
        With jo.Child("count")
            .Value = .Value + 1
        End With
    End With
    Set increment = jo
End Function
Public Function collect(sIn As String) As cTagCloud
    Dim a As Variant, i As Long, s As String
    a = Split(sIn, pSep)
    For i = LBound(a) To UBound(a)
        s = cleanNoise(CStr(a(i)))
        If s <> vbNullString Then increment CStr(s)
    Next i
    Set collect = Me
End Function
Private Function cleanNoise(sIn As String) As String
    Dim s As String
    s = LCase(Trim(sIn))
    If pDumpNoise Then
        s = rxReplace("singlespace", s, "$1")
        s = rxReplace("nonprintable", s, vbNullString)
        s = rxReplace("punctuation", s, vbNullString)
        If isNoise(s) Then s = vbNullString
    End If
    cleanNoise = s
End Function
Public Function getSize() As cTagCloud
    Dim jo As cJobject, vSmall As Variant, vBig As Variant, v As Variant

    ' find biggest and smallest
    For Each jo In pJob.Children
        With jo
            v = .Child("count").Value
            If v >= pMinCountToShow Then
                .Child("show").Value = True
                If v < vSmall Or IsEmpty(vSmall) Then vSmall = v
                If v > vBig Or IsEmpty(vBig) Then vBig = v
            End If
        End With
    Next jo
    ' calculate size scale
    For Each jo In pJob.Children
        With jo
            If .Child("count").Value >= pMinCountToShow Then
                If vBig > vSmall Then
                    .Child("scale").Value = (.Child("count").Value - vSmall) / (vBig - vSmall)
                Else
                    .Child("scale").Value = 1
                End If
                .Child("size").Value = .Child("scale").Value * (pBiggest - pSmallest) + pSmallest
                If pColorful Then
                    .Child("color").Value = heatMapColor(vSmall, vBig, .Child("count").Value)
                End If
            End If
        End With
    Next jo
    
    Set getSize = Me
End Function
Public Function results(rOut As Range) As cTagCloud
    Dim k As Long, jo As cJobject, s As String
    
    getSize
    Application.ScreenUpdating = False
    With rOut
        s = vbNullString
        .HorizontalAlignment = xlGeneral
        .VerticalAlignment = xlBottom
        .WrapText = True
        k = 1

        For Each jo In tagJob.Children
            If jo.Child("show").Value Then
                If s <> vbNullString Then s = s & pSep
                s = s & jo.key
            End If
        Next jo
        .FormulaR1C1 = s
       
        For Each jo In tagJob.Children
            If jo.Child("show").Value Then
                With .Characters(Start:=k, Length:=Len(jo.key) + 1).font
                    .Size = jo.Child("size").Value
                    .color = jo.Child("color").Value
                End With
                k = Len(jo.key) + 1 + k
            End If
        Next jo

    End With
    Application.ScreenUpdating = True
    Set results = Me
 End Function
Private Function isNoise(sid As String) As Boolean
    Dim s As String
    On Error GoTo handle
    s = pNoise(sid)
    isNoise = True
    Exit Function
handle:
    isNoise = False
End Function



Comments