VBA has a garbage collector that runs from time to time in order to free up memory that is no longer being used. in Objects and the garbage collector I showed how to teardown objects so they would become eligible for garbage collection.
There’s another problem related to strings. When you concatenate a string, it creates a new copy of the concatenated versions. Consider this loop
s = vbnullstring for each c in collect s = s & c.text next c debug.print len(s) , s
Each time that s is appended to, a new version of s gets created by copying the old s and and the next text to a new area of memory. That leaves the old s hanging around, waiting for the garbage collector to arrive to free up the memory. So 2 bad things are happening
- We keep re-copying an every increasing amount of text
- The garbage collector interrupts everything to do its work according to some rule, and/or we gradually use up memory.
This chart shows 100 iterations of a process similar to the above. See how each iteration takes longer than the previous one. I ran this a few times and always got the same kind of pattern. I suspect the little spikes are the garbage collection kicking in, since they are followed by a short relief period.
This can have a very significant effect on run times if the loops are large and the strings long.
The StringBuilder class
In VB there is a StringBuilder class which allegedly deals with this problem. It doesn’t exist in VBA, so I created something similar with i’ve called the cStringChunker. Here is the result of the two methods together, with the StringChunker in red.
The cStringChunker timings – shown in red- barely register on this chart against the regular concatenation method.
Aside from being orders of magnitude faster , you can see that cStringChunker holds pretty consistent performance regardless of the string being concatenated.
You’ll see just a couple of red blips – this is when the cache it uses needs extending, and its possible that the garbage collector pays a visit.
How does it work
To make this possible, you have to be able to add characters in place to a string to encourage it not to replicate. I found the key to this in this wikibook article, from which I quote this…..
You can also use Mid$ on the left hand side of an assignment:
Dim s As String s = "abcdef" Debug.Print s Mid$(s, 2, 3) = "xxx" Debug.Print s Mid$(s, 3, 1) = "abcdefgh" Debug.Print s Mid$(s, 2, 3) = "y" Debug.Print s abcdef axxxef axaxef ayaxef
Notice that when Mid$ is on the left it doesn’t change the total length of the string it just replaces the specified number of characters. If the right hand side specifies fewer characters than are asked for, then only that number of characters is replaced; if it specifies more, only the number asked for is used.
So Mid() can be on the left. And it replaces characters in place inside a string. Therefore if you always have a string big enough to take the concatenation required, you will never need to make a new copy of it. Here is the code rewritten using the cStringChunker which is based on that principle.
dim chunk as cStringChunker set chunk = new cStringChunker for each c in collect chunk.add c.text next c debug.print chunk.size, chunk.content
A cache is maintained into which the added string is inserted. If the cache needs to be extended, it extends first by a modest amount, then increasing amounts depending on the current content size. That means it only very occassionally needs to go off and extend itself, avoiding the garbage collector’s attention and unecessary copying, yet keeping the allocation modest for smaller string operations.
Other useful things
With such a class we can simplify some common string things, taking advantage of the efficiency of ‘not having to make a copy’. Let’s take this example – a common problem – making a comma separated list One way to do it, would be this, but as we’ve already established, it would keep generating bigger and bigger versions of s. We also have that pesky IF. And what if the first element was blank ? the result would have too few commas.
dim s as string s = vbnullstring for each c in collect if s <> vbnullstring then s = s & "," s = s & c.text next c debug.print s, len(s)
cStringChunker has a .chop(n) method, where n (default 1) characters will be discarded from the end of the string. Note that it also returns itself from the add and chop operations so that multiple methods can be added together.
dim chunk as cStringChunker set chunk = new cStringChunker for each c in collect chunk.add (c.text).add(",") next c debug.print chunk.size, chunk.chop.content
The cStringChunker class
Code is on github, and below. (to find out how to embed code in your blog like this, see Step by Step Gas Publisher)
Methods and properties
To find out how to automatically create documentation like this from your VBA modules see Automatic documentationcStringChuncker class
For more about this, see the excel liberation blog
For help and more information join our forum, follow the blog or follow me on Twitter