Thursday, November 3, 2011

My favorite C# micro-optimization #1: Buffer.BlockCopy()

So a caveat first-off. Micro-optimizations are evil. You almost never need them. You can live the vast majority of your life as a programmer and never use them; and most of the time you use them, you’re going to use them incorrectly. If your program is running slow, you’ll almost never find some tiny little micro-optimization that will fix it.

But there are times when they’re handy, and with all the work I’ve been doing the last couple years on a custom Silverlight media stack, I’ve had a number of opportunities to learn about how to speed up code that gets called thousands of times each second. This series of articles is the result of that experience.

And my first suggestion is to use Buffer.BlockCopy() when you need to move data in and out of arrays. You don’t normally need to do this very heavily, but when you’re dealing with real-time multimedia, you’ll find yourself doing it all the time, so the faster you can do it, the better.

I mention this because a common way to move data from one array to another is through a for loop, like so:

   1:  for (int i = 0; i < sourceArray.Length; i++)
   2:  {
   3:      destinationArray[i] = sourceArray[i];
   4:  }

Unfortunately, that’s very nearly the slowest way to do it. (Array.Clone() is actually slower, but that’s another story.) A much better way to do it is to use Array.Copy(). Indeed, this is normally the preferred method, since you don’t have to worry about the size of the individual elements, which is an easy place to mess up. Moreover, Array.Copy() works for arrays of any type, instead of just intrinsic value types.

   1:  Array.Copy(sourceArray, 0, destinationArray, 0, sourceArray.Length);

But as it turns out, Buffer.BlockCopy() is slightly faster for many operations, since it doesn’t have to perform certain checks at the beginning of the copy:

   1:  Buffer.BlockCopy(sourceArray, 0, destinationArray, 0, sourceArray.Length * sizeof(short));

Benchmark results on my machine (10 million copies of a 64-element short[] array):

Buffer.BlockCopy action completed: iteration = 5, completionTime = 420, averageCompletionTime = 422.000
Array.Copy action completed: iteration = 5, completionTime = 478, averageCompletionTime = 482.600
forLoop action completed: iteration = 5, completionTime = 1092, averageCompletionTime = 1093.000

That said, the difference between Array.Copy() and Buffer.BlockCopy() tends to disappear the larger the amount of data to copy (while a for loop falls further and further behind). If you’re copying 64,000 elements instead of 64 elements, these are the results:

Buffer.BlockCopy action completed: iteration = 5, completionTime = 1251, averageCompletionTime = 1229.000
Array.Copy action completed: iteration = 5, completionTime = 1254, averageCompletionTime = 1218.200
forLoop action completed: iteration = 5, completionTime = 11063, averageCompletionTime = 11082.200

But there’s one significant caveat: if you only need to move a few elements, even the minimal overhead of Buffer.BlockCopy() can be too much. If you need to move less than 32 elements, you’ll probably find that a straightforward for loop (or perhaps an unrolled version of it) is the fastest. Of course, you’ll want to do your own testing to find out where the break-even point is. But here are my results with 1,000,000 copies of a 16-element array:

Buffer.BlockCopy action completed: iteration = 5, completionTime = 367, averageCompletionTime = 363.800
Array.Copy action completed: iteration = 5, completionTime = 418, averageCompletionTime = 420.000
forLoop action completed: iteration = 5, completionTime = 291, averageCompletionTime = 290.000

No comments: