Background
In my previous full-time job, the majority of our tasks involved string manipulation.
The applications we created and maintained were tools used by our journals team.
Our clients were multinational publishing companies that produce online journals, books and research resources.
We automated journal analysts’ tasks related to XML and HTML formatting, among others. Most of our business logic involved string manipulation, so we handled a lot of text. In this article I want to discuss what I learned about string concatenation.
The Problem
When you work with a lot of text/string manipulation, a small mistake can put a heavy load on your application. Take this example:
string myString = string.Empty;
for (int i = 0; i < 9999999; i++)
{
    myString += "text ";
}
The first takeaway here is that strings are immutable, meaning they can’t be changed after they’ve been created.
The concatenation in our example creates a new string on every iteration, because myString += "text "; is equivalent to myString = myString + "text ";
Each new string requires copying everything built so far, so more and more memory is allocated and copied as the iterations increase.
For example:
loop1: string1= "text ";
loop2: string2= string1 + "text ";
loop3: string3= string2 + "text ";
... and so on ..
As you can see, this is slow and inefficient.
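To make the immutability point concrete, here is a minimal sketch. The class and variable names are made up for illustration. It keeps a second reference to the original string and checks that += leaves that original untouched while producing a different string object.

```csharp
using System;

public static class ImmutabilityDemo
{
    // Returns true when += produced a new string object and left the original untouched.
    public static bool ConcatCreatesNewObject()
    {
        string original = "text ";
        string current = original;   // both variables refer to the same string object

        current += "text ";          // builds a brand-new string and reassigns 'current'

        bool originalUnchanged = original == "text ";
        bool differentObjects = !object.ReferenceEquals(original, current);
        return originalUnchanged && differentObjects;
    }

    public static void Main()
    {
        Console.WriteLine(ConcatCreatesNewObject()); // prints "True"
    }
}
```

The old string is not modified; it simply becomes garbage once nothing refers to it, which is why a long loop of += produces so many short-lived allocations.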
The Solution: We use StringBuilder.
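// StringBuilder lives in the System.Text namespace (using System.Text;)
var stringBuilder = new StringBuilder();
for (int i = 0; i < 9999999; i++)
{
    stringBuilder.Append("text ");
}
// Call ToString() once at the end to get the final string
string myString = stringBuilder.ToString();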
Using the StringBuilder class is fast and, in my opinion, more readable than our first example. It maintains a mutable buffer and appends new strings to it as necessary. When the buffer is full, StringBuilder grows its capacity automatically (the documentation describes the capacity as being doubled), so you don’t have to worry about it.
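If you want to measure the difference yourself, the following is a rough sketch rather than a proper benchmark. The class name, method names and the iteration count of 20,000 are my own assumptions; the count is far smaller than in the article's loop so that the += version finishes quickly. Timings will vary by machine and runtime, so treat the numbers as indicative only.

```csharp
using System;
using System.Diagnostics;
using System.Text;

public static class ConcatComparison
{
    // Builds the string with +=, allocating a new string on every iteration.
    public static string WithOperator(int count)
    {
        string result = string.Empty;
        for (int i = 0; i < count; i++)
        {
            result += "text ";
        }
        return result;
    }

    // Builds the same string by appending to a single StringBuilder.
    public static string WithStringBuilder(int count)
    {
        var stringBuilder = new StringBuilder();
        for (int i = 0; i < count; i++)
        {
            stringBuilder.Append("text ");
        }
        return stringBuilder.ToString();
    }

    public static void Main()
    {
        const int count = 20000; // assumed value, kept small on purpose

        var watch = Stopwatch.StartNew();
        string viaOperator = WithOperator(count);
        watch.Stop();
        Console.WriteLine($"+=            : {watch.ElapsedMilliseconds} ms");

        watch.Restart();
        string viaBuilder = WithStringBuilder(count);
        watch.Stop();
        Console.WriteLine($"StringBuilder : {watch.ElapsedMilliseconds} ms");

        // Both approaches must produce exactly the same text.
        Console.WriteLine(viaOperator == viaBuilder);
    }
}
```

For anything more serious than a quick check, a dedicated benchmarking tool would give more trustworthy numbers than a single Stopwatch run.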
It is also worth noting that StringBuilder is not the right tool for ‘optimizing’ every concatenation. Take this example:
string firstName = "John";
string lastName = "Smith";
string fullName = firstName + " " + lastName;
Some developers will try to optimize this code by doing:
string firstName = "John";
string lastName = "Smith";
var stringBuilder = new StringBuilder();
stringBuilder.Append(firstName);
stringBuilder.Append(" ");
stringBuilder.Append(lastName);
string fullName = stringBuilder.ToString();
This is a poor use of the StringBuilder class. For a handful of strings, the 2nd code snippet is unlikely to be any faster than the 1st, and any difference would be tiny unless this code block is called a very large number of times. A small performance gain in exchange for readability is almost always a no-no.
In the first example: string fullName = firstName + " " + lastName;
The compiler translates this at compile time into a single call, string fullName = String.Concat(firstName, " ", lastName); which works out the total length and allocates the result string once.
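As a small sanity check of that equivalence, this sketch (class and method names are hypothetical) builds the full name both ways and prints the results.

```csharp
using System;

public static class ConcatEquivalence
{
    // The readable form, using the + operator.
    public static string WithPlus(string firstName, string lastName)
    {
        return firstName + " " + lastName;
    }

    // The explicit call that the + expression above corresponds to.
    public static string WithConcat(string firstName, string lastName)
    {
        return string.Concat(firstName, " ", lastName);
    }

    public static void Main()
    {
        Console.WriteLine(WithPlus("John", "Smith"));   // prints "John Smith"
        Console.WriteLine(WithConcat("John", "Smith")); // prints "John Smith"
    }
}
```

Since both forms do the same work, the + version wins simply by being easier to read.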
So when should you use the StringBuilder class?
My opinion is that you should always keep things simple until you have a valid reason to make them complex. For something like 3-8 strings, there is no point in using StringBuilder unless the concatenation happens repeatedly, for example inside a loop. String.Concat or the “+” operator is the way to go. Unless you’re looking at a much higher number of strings, do not sacrifice readability for a micro performance gain.
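As an illustration of a case on the other side of that line, here is a sketch where the number of pieces is not known in advance. ParagraphWriter and ToParagraphs are hypothetical names invented for this example, loosely modeled on the HTML formatting work described in the background; they are not from any real library.

```csharp
using System;
using System.Collections.Generic;
using System.Text;

public static class ParagraphWriter
{
    // Hypothetical helper: wraps every input line in <p>...</p>.
    // The input may hold thousands of lines, so a StringBuilder is used
    // instead of += inside the loop.
    public static string ToParagraphs(IEnumerable<string> lines)
    {
        var stringBuilder = new StringBuilder();
        foreach (string line in lines)
        {
            stringBuilder.Append("<p>").Append(line).Append("</p>");
        }
        return stringBuilder.ToString();
    }

    public static void Main()
    {
        var lines = new List<string> { "first", "second" };
        Console.WriteLine(ToParagraphs(lines)); // prints "<p>first</p><p>second</p>"
    }
}
```

With two lines the choice hardly matters, but the same method keeps working well when the list grows large, which is the situation where StringBuilder earns its place.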