One of the first pieces of efficiency advice most .NET developers learn is "use StringBuilder to concatenate strings". A little bit like "exceptions are expensive", this is a misunderstood piece of received wisdom. (Fortunately it's not nearly as harmful as the exception performance myth, but it comes up about as often...)
Before reading the rest of this page, you should be aware of the basics of the String type. For the sake of readability, I'll stick to plain "string" rather than the code-formatted "String" or "string" from here onwards. Please let me know if you find this confusing, and I'll change it.
I've included this in my list of general framework articles rather than in the list of C#-specific articles because I suspect every language targeting .NET is likely to use the same way of concatenating strings under the hood.
The Problem We're Trying To Avoid
There is a very real problem that this wisdom correctly addresses - that of building a load of strings (usually of increasing size) which are never used apart from contributing to the creation of other strings. Here's an example:
    using System;
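The listing above is cut short, so here is a minimal sketch of the kind of program the surrounding text describes: a loop that appends "!" with += and reports how long it took. The class name ConcatLoopDemo, the BuildWithConcat helper, the iteration counts and the use of Stopwatch are all assumptions made for illustration, not the original code.

```csharp
using System;
using System.Diagnostics;

public static class ConcatLoopDemo
{
    // Hypothetical helper: builds a string of `iterations` exclamation marks
    // using +=, which creates a brand new string on every pass.
    public static string BuildWithConcat(int iterations)
    {
        string x = "";
        for (int i = 0; i < iterations; i++)
        {
            x += "!";
        }
        return x;
    }

    public static void Main()
    {
        // Arbitrary iteration counts; the second is double the first so the
        // two timings can be compared.
        foreach (int iterations in new[] { 50000, 100000 })
        {
            Stopwatch timer = Stopwatch.StartNew();
            string result = BuildWithConcat(iterations);
            timer.Stop();
            Console.WriteLine("{0} iterations: length {1}, {2} ms",
                iterations, result.Length, timer.ElapsedMilliseconds);
        }
    }
}
```

The absolute timings will vary by machine and runtime; the thing to look at is how the second figure compares with the first.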
x += "!"; is absolutely equivalent to x = x + "!";. The concatenation here creates an entirely new string, allocating enough memory for everything, copying all the data from the existing value of x and then copying the data from the string being appended ("!"). As the string grows, the amount of data it has to copy each time grows too, which is why the time taken didn't just double when I doubled the number of iterations.
This is clearly inefficient. If someone asked you to add something to a shopping list, you wouldn't write a new copy of the shopping list first, would you? Enter StringBuilder...
The StringBuilder Solution
Here is an equivalent (in terms of the final value of x) program, which is much, much faster:
    using System;
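Again the listing is truncated, so this is only a sketch of what a StringBuilder-based equivalent might look like. The names BuilderLoopDemo and BuildWithBuilder and the iteration count are hypothetical; Append and ToString are the standard System.Text.StringBuilder members.

```csharp
using System;
using System.Diagnostics;
using System.Text;

public static class BuilderLoopDemo
{
    // Hypothetical helper: produces the same final string as a += loop,
    // but appends into a StringBuilder instead of creating a new string
    // on every pass.
    public static string BuildWithBuilder(int iterations)
    {
        StringBuilder builder = new StringBuilder();
        for (int i = 0; i < iterations; i++)
        {
            builder.Append("!");
        }
        return builder.ToString();
    }

    public static void Main()
    {
        // Arbitrary iteration count chosen for illustration.
        Stopwatch timer = Stopwatch.StartNew();
        string x = BuildWithBuilder(100000);
        timer.Stop();
        Console.WriteLine("Length {0}, {1} ms", x.Length, timer.ElapsedMilliseconds);
    }
}
```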
StringBuilder maintains an internal buffer and appends to that, only copying its buffer when there isn't room for any more data. (In fact, the internal buffer is just a string - strings are immutable from a public interface perspective, but not from within the mscorlib assembly.) We could make the above code even more efficient by passing the final size of the string (which we happen to know in this case) to the constructor of StringBuilder to make it use a buffer of the right size to start with - then there'd be no unnecessary copying at all. Unless you're in a situation where you have that information readily to hand though, it's usually not worth worrying about - StringBuilder doubles its buffer size when it runs out of room, so it doesn't end up copying the data very many times anyway.
So I Should Use StringBuilder Everywhere, Right?
No, quite simply. The above is an explanation of why the received wisdom of "use StringBuilder for concatenation" is right some of the time. However, many people take it at face value without understanding the reasoning behind it. They start turning code like this:
    string name = firstName + " " + lastName;
into code like this:
    // Bad code! Do not use!
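Only the warning comment of that second listing survives, so the following is a hedged sketch of the sort of rewrite being criticised: the same three pieces appended one at a time through a StringBuilder. The NameDemo class, the FullNameWithBuilder method and the sample names are hypothetical.

```csharp
using System;
using System.Text;

public static class NameDemo
{
    // Bad code! Do not use! Shown only to illustrate the anti-pattern:
    // a StringBuilder wrapped around what could be a single concatenation.
    public static string FullNameWithBuilder(string firstName, string lastName)
    {
        StringBuilder builder = new StringBuilder();
        builder.Append(firstName);
        builder.Append(" ");
        builder.Append(lastName);
        return builder.ToString();
    }

    public static void Main()
    {
        // Sample values are made up for the demonstration.
        Console.WriteLine(FullNameWithBuilder("Ada", "Lovelace")); // prints "Ada Lovelace"
    }
}
```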
However, the second version is actually less efficient than the first! Not much less efficient - and if the second version were more readable, I'd go with it for the reasons above - but when the entire point of using StringBuilder is to improve efficiency, using the above is just plain nuts.
The first version (assuming that firstName and lastName are real variables, and not constants - I'll come on to that later) compiles to a call to String.Concat, like this:
    string name = String.Concat (firstName, " ", lastName);
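As a quick sanity check, the sketch below puts the operator form and the explicit String.Concat call side by side and confirms they produce the same value. It only compares results; it doesn't show what the compiler emits, which you would need a tool such as a disassembler to confirm. The ConcatDemo class, its method names and the sample values are hypothetical.

```csharp
using System;

public static class ConcatDemo
{
    // The form most people write.
    public static string WithOperator(string firstName, string lastName)
    {
        return firstName + " " + lastName;
    }

    // The explicit call, using the three-string overload of String.Concat.
    public static string WithConcat(string firstName, string lastName)
    {
        return String.Concat(firstName, " ", lastName);
    }

    public static void Main()
    {
        string a = WithOperator("Ada", "Lovelace");
        string b = WithConcat("Ada", "Lovelace");
        Console.WriteLine(a);      // prints "Ada Lovelace"
        Console.WriteLine(a == b); // prints "True"
    }
}
```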
String.Concat takes a bunch of strings (or objects) and concatenates them together, plain and simple. There are various overloads - some take strings, some take objects (which are just converted into strings), some take arrays of objects or arrays of strings. They all do the same thing though. Now, String.Concat can work out the lengths of all the strings involved before it concatenates them together (at least if you pass it strings - if you pass it objects, it needs to create temporary strings and then concatenate those together). This means that no extra copying is involved - the data is copied once into the new string, which is of exactly the right length.
Compare this with the StringBuilder version. It doesn't know at construction time how big to make the buffer (because we haven't told it - doing so would make the code even less readable). That means it may have to copy the buffer, and it's likely to end up with a buffer which is actually larger than it needs to be. Oh, and there's the overhead of an extra object (the StringBuilder itself). Remind me why this was meant to be a good idea?
The important difference between this example and the previous one is that we can easily present all the strings which need to be concatenated together in one call to String.Concat. That means that no intermediate strings are needed. StringBuilder is efficient in the first example because it acts as a container for the intermediate result without having to copy that result each time - when there's no intermediate result anyway, it has no advantage.
Constants
Things get even crazier when it comes to string constants (literals, const string members). What do you suppose string x = "hello" + " " + "there"; is compiled to? It would be reasonable to expect it to be another call to String.Concat - but it isn't. It's actually compiled to the exact same code as string x = "hello there";. The compiler knows that all the parts are constant, so it does all the concatenation at compile time, storing the full string in the compiled code. Converting that to use StringBuilder is inefficient in both memory and speed, as well as reducing readability.
Rules Of Thumb
So, when should you use StringBuilder, and when should you use the string concatenation operators?
- Definitely use StringBuilder when you're concatenating in a non-trivial loop - especially if you don't know for sure (at compile time) how many iterations you'll make through the loop. For example, reading a file a character at a time, building up a string as you go using the += operator is potentially performance suicide.
- Definitely use the concatenation operator when you can (readably) specify everything which needs to be concatenated in one statement. (If you have an array of things to concatenate, consider calling String.Concat explicitly - or String.Join if you need a delimiter.)
- Don't be afraid to break literals up into several concatenated bits - the result will be the same. You can aid readability by breaking a long literal into several lines, for instance, with no harm to performance.
- If you need the intermediate results of the concatenation for something other than feeding the next iteration of concatenation, StringBuilder isn't going to help you. For instance, if you build up a full name from a first name and a last name, and then add a third piece of information (the nickname, maybe) to the end, you'll only benefit from using StringBuilder if you don't need the (first name + last name) string for any other purpose (as we do in the example which creates a Person object).
- If you just have a few concatenations to do, and you really want to do them in separate statements, it doesn't really matter which way you go. Which way is more efficient will depend on the number of concatenations, the sizes of the strings involved, and what order they're concatenated in. If you really believe that piece of code to be a performance bottleneck, profile or benchmark it both ways.
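A few of the rules above can be sketched in code. This is an illustration under assumptions rather than a recommended library: the RulesOfThumbDemo class and its ReadAll and JoinWords helpers are hypothetical, and a StringReader stands in for a real file so the example is self-contained.

```csharp
using System;
using System.IO;
using System.Text;

public static class RulesOfThumbDemo
{
    // A literal broken over several lines; the pieces are all constants,
    // so this is still a single compile-time constant.
    public const string Greeting =
        "hello" +
        " " +
        "there";

    // A loop of unknown length: read one character at a time and
    // accumulate into a StringBuilder rather than using +=.
    public static string ReadAll(TextReader reader)
    {
        StringBuilder builder = new StringBuilder();
        int next;
        while ((next = reader.Read()) != -1)
        {
            builder.Append((char)next);
        }
        return builder.ToString();
    }

    // An array of things to concatenate with a delimiter: String.Join.
    public static string JoinWords(string[] words)
    {
        return String.Join(", ", words);
    }

    public static void Main()
    {
        Console.WriteLine(ReadAll(new StringReader(Greeting)));      // prints "hello there"
        Console.WriteLine(JoinWords(new[] { "a", "b", "c" }));       // prints "a, b, c"
        Console.WriteLine(String.Concat(new[] { "a", "b", "c" }));   // prints "abc"
    }
}
```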
 
 