VBA string manipulation optimization with custom classes

Here’s a little bit more on the subject of VBA string manipulation. I showed in a post the other day how things take longer and longer when concatenating large strings in VBA. One reason I was digging into this was that for very large datasets, JSON serialization  – 500k strings and above, was taking an inexplicable amount of time. It turned out that string concatenation and the interaction with the garbage collector was at the heart of the problem. To deal with this, I introduced a new class ( cStringChunker ) which manipulates strings ‘in place’, where its possible to do so. The results were pretty impressive – here’s a 100 cycles of some complex string concatenations .. the chunk method (in red) are consistent and orders of magnitude faster than the straight string method.

However, when I came to apply this to the JSON serializer, I found that I needed to create a bunch more methods in order to avoid converting back and forwards between strings and the chunker. Before we get into that, here’s the results of improvements in the JSON serializer by using the chunker rather than regular string operations. Again – the new serialization using the chunker (in blue) is very much faster than concatenating strings.

Additional methods

I avoid back and forward string conversions by implementing common string methods such as right, left , mid and so on as well as few new useful ones, which I’ll probably add to over time. The list of methods are at the end of this post, but here’s some examples of how to use. They are all chainable for convenience.

Here’s an example of a complex string operation

 

 debug.print cs.add("quick ") _
                    .add("brown") _
                    .insert("the") _
                    .shift(4, 1) _
                    .add(" fox") _
                    .add(" gets the chicken") _
                    .add(")") _
                    .insert ("(") _
                    .content

 

Versus the string equivalent

 

  s = s & "quick "
                s = s & "brown"
                s = "the" & s
                s = left(s, 4) & " " & Mid(s, 4)
                s = s & " fox"
                s = s & " gets the chicken"
                s = "(" & s & ")"
                debug.print s

 

Both give this output
(the quick brown fox gets the chicken)

This would trim all spaces from the end of a line, trim a linefeed if it’s there, and chop a comma if it’s there.

t.chopWhile(" ")
                .chopIf(vbLf)
                .chopIf (",")

We also have getMid(), getLeft(), getRight() functions along with size and content properties.

Short strings

Since these are built in order to optimize long string manipulations, it is likely that the overhead of a custom class with shorter strings will be less efficient. Here’s the results of a test using string manipulation like the example above. As expected, the chunker is consistent, but slightly slower – but the execution times are miniscule in any case

A note about the tests

The timings are calculated by repeating the operation multiple times, and timing how long each set of  operations took. More information is in on the Excel Liberation site.

cStringChunker class methods and properties

cStringChunker code

'gistThat@mcpher.com :do not modify this line - see ramblings.mcpher.com for details: updated on 5/3/2014 6:30:52 PM : from manifest:7471153 gist https://gist.github.com/brucemcpherson/5102369/raw/cStringChunker.cls
' stringChunker class for VBA because string concat takes ages
Option Explicit
' v1.06  5102369
Private pContent As String
Private pSize As Long
' minimum amount to increment by each time
Const defaultChunkSize = 64
Public Property Get size() As Long
    ' this is how much content is real
    size = pSize
End Property
Public Property Get content() As String
    ' return the real part of the content
    If pSize > 0 Then
        content = getLeft(size)
    Else
        content = vbNullString
    End If
End Property
Public Property Get getLeft(howMany As Long) As String
    ' return the left part of the content
    ' c.getLeft(howmany) is equivalent to left(c.content,howmany), but avoids extra assignment
    getLeft = getMid(1, howMany)
End Property
Public Property Get getRight(howMany As Long) As String
    ' return the right part of the content
    ' c.getRight(howmany) is equivalent to right(c.content,howmany), but avoids extra assignment
    getRight = getMid(pSize - howMany + 1, howMany)
End Property
Public Property Get getMid(startPos As Long, Optional howMany As Long = -1) As String
    ' extract from content
    ' c.getMid(startPos,howmany) is equivalent to mid(c.content,startPos, howmany), but avoids extra assignment
    Dim n As Long
    Debug.Assert startPos > 0 And startPos <= pSize
    n = howMany
    If n = -1 Then
        n = pSize - startPos + 1
    End If
    n = minNumber(pSize - startPos + 1, n)
    If n > 0 Then
        getMid = Mid(pContent, startPos, n)
    Else
        getMid = vbNullString
    End If
End Property
Public Property Get self() As cStringChunker
    ' convenience for with in with
    Set self = Me
End Property
Public Function clear() As cStringChunker
    ' easy to clear out.. may as well keep the same buffer going
    pSize = 0
    Set clear = Me
End Function
Public Function uri(addstring As String) As cStringChunker
    Set uri = add(URLEncode(addstring))
End Function
Public Function toString() As String
    toString = content()
End Function
Public Function add(addstring As String) As cStringChunker
    Dim k As Long
    ' add some content to end
    k = Len(addstring)
    If k > 0 Then
        adjustSize (k)
    
        Mid(pContent, size + 1, k) = addstring
        pSize = size + k
    End If
    Set add = Me
End Function
Public Function addLine(addstring As String) As cStringChunker
    Set addLine = add(addstring).add(vbCrLf)
End Function
Public Function insert(Optional insertString As String = " ", _
                    Optional insertBefore As Long = 1) As cStringChunker
    'default position is at beginning, insert a space
    'c.insert("x",c.size+1) is equivalent to c.add("x")
    
    If insertBefore = pSize + 1 Then
        Set insert = add(insertString)
        
    Else
        ' 'todo .. how to handle programming errors?
        Debug.Assert insertBefore > 0 And insertBefore <= pSize
        
        ' regular string concatenation is better since there is overlap
        pContent = getLeft(insertBefore - 1) & insertString & getMid(insertBefore)
        pSize = Len(pContent)
        Set insert = Me
            
    End If
    Set insert = Me
End Function
Public Function overWrite(Optional overWriteString As String = " ", _
                    Optional overWriteAt As Long = 1) As cStringChunker
    'default position is at beginning, overwrite with a space
    Dim k As Long
    k = Len(overWriteString)
    If k > 0 Then
        ' 'todo .. how to handle programming errors?
        Debug.Assert overWriteAt >= 0
        '' we'll allow overwrite to extend past end, be greedy
        adjustSize (k)
        pSize = maxNumber(pSize, k + overWriteAt - 1)
        
        Mid(pContent, overWriteAt, k) = overWriteString
        
    End If
    Set overWrite = Me
End Function
                        
Public Function shift(Optional startPos As Long = 1, _
                Optional howManyChars As Long = 0, _
                Optional replaceWith As String = vbNullString) As cStringChunker
    ' shift by howmany chars .. negative= left, positive = right
    'TODO how to deal with programming errors? message, raise error, assert?
    Dim howMany As Long
    
    howMany = howManyChars
    If howMany = 0 Then
        howMany = Len(replaceWith)
    End If
        
    Debug.Assert howMany + startPos > 0
    Debug.Assert startPos <= pSize And startPos > 0
    
    ' make space
    If howMany <> 0 Then

        If howMany > 0 Then
        ' its a right shift, use insert
            Set shift = insert(Space(howMany), startPos)
        Else
            ' a left shift
            If startPos > 1 Then
                ' we can do an overwrite
                overWrite getMid(startPos + howMany, pSize - startPos + 1), startPos
                pSize = pSize + howMany
            End If
        
        End If
    End If
    
    Set shift = Me
End Function
Public Function chop(Optional n As Long = 1) As cStringChunker
    ' chop n charaters from end of content
    pSize = maxNumber(0, pSize - n)
    Set chop = Me
End Function
Public Function chopIf(t As String) As cStringChunker
    ' chop if its t
    Dim k As Long
    k = Len(t)
    If k <= pSize Then
        If getRight(k) = t Then
            chop (k)
        End If
    End If
    Set chopIf = Me
End Function
Public Function chopWhile(t As String) As cStringChunker
    ' chop if its t
    Dim k As Long, x As Long
    
    Set chopWhile = Me
    x = pSize
    While chopIf(t).size <> x
        x = pSize
    Wend

End Function
Private Function maxNumber(a As Long, b As Long) As Long
    If a > b Then
        maxNumber = a
    Else
        maxNumber = b
    End If
End Function
Private Function minNumber(a As Long, b As Long) As Long
    If a < b Then
        minNumber = a
    Else
        minNumber = b
    End If
End Function
Private Function adjustSize(needMore As Long) As cStringChunker
    Dim need As Long
    need = pSize + needMore
    If Len(pContent) < need Then
        pContent = pContent & Space(needMore + maxNumber(defaultChunkSize, Len(pContent)))
    End If
    Set adjustSize = Me
End Function
Private Sub class_initialize()
    pSize = 0
    pContent = Space(defaultChunkSize)
End Sub

 

For more stuff like this, visit the Excel liberation site.

About brucemcp 225 Articles
I am a Google Developer Expert and decided to investigate Google Apps Script in my spare time. The more I investigated the more content I created so this site is extremely rich. Now, in 2019, a lot of things have disappeared or don’t work anymore due to Google having retired some stuff. I am however leaving things as is and where I came across some deprecated stuff, I have indicated it. I decided to write a book about it and to also create videos to teach developers who want to learn Google Apps Script. If you find the material contained in this site useful, you can support me by buying my books and or videos.