TheSharperDev

Educating about C# and F#

Benchmarking For Loop Variants

Photo by Pigoff PhotographY on Unsplash

I read Thomas Levesque’s recent post about Using foreach with an index. I’ve definitely run into the problem where I wish I had the index handy when using a foreach loop.

One person commented on the article asking how it’s performance compared to a normal for loop. I decided to create a simple BenchmarkDotNet benchmark to answer that question.

For Loop Variants

I’ll be benchmarking the following for loop variants:

  1. The basic for loop
  2. The foreach loop
  3. Thomas’s foreach with index loop

Array Benchmark

I’ll attempt to keep the logic inside the loop as simple as possible so that the only thing impacting the benchmark is the difference in the for loop.

Running this benchmark gives me the following results:

|         Method |     N |          Mean | Ratio | Allocated |
|--------------- |------ |--------------:|------:|----------:|
| RegularForLoop |   100 |      82.21 ns |  1.00 |         - |
|    ForEachLoop |   100 |      69.27 ns |  0.85 |         - |
|  WithIndexLoop |   100 |   1,654.96 ns | 19.85 |     112 B |
|                |       |               |       |           |
| RegularForLoop |  1000 |     774.26 ns |  1.00 |         - |
|    ForEachLoop |  1000 |     629.62 ns |  0.80 |         - |
|  WithIndexLoop |  1000 |  15,554.59 ns | 20.14 |     112 B |
|                |       |               |       |           |
| RegularForLoop | 10000 |   7,647.67 ns |  1.00 |         - |
|    ForEachLoop | 10000 |   6,434.82 ns |  0.83 |         - |
|  WithIndexLoop | 10000 | 153,590.85 ns | 20.12 |     112 B |

That is a much bigger different that I had anticipated. I ran this a couple time and consistently got results around 20 times slower using the WithIndex extension method.

List Benchmark

Next I tried converting that benchmark to use lists to see if the problem still existed.

Definitely an improvement, but still around 14 times slower.

|             Method |     N |         Mean | Ratio | Allocated |
|------------------- |------ |-------------:|------:|----------:|
| RegularForLoopList |   100 |     118.6 ns |  1.00 |         - |
|    ForEachLoopList |   100 |     233.8 ns |  1.97 |         - |
|  WithIndexLoopList |   100 |   1,646.6 ns | 13.98 |     120 B |
|                    |       |              |       |           |
| RegularForLoopList |  1000 |   1,128.7 ns |  1.00 |         - |
|    ForEachLoopList |  1000 |   2,219.0 ns |  1.96 |         - |
|  WithIndexLoopList |  1000 |  15,762.2 ns | 13.93 |     120 B |
|                    |       |              |       |           |
| RegularForLoopList | 10000 |  11,110.6 ns |  1.00 |         - |
|    ForEachLoopList | 10000 |  21,592.9 ns |  1.96 |         - |
|  WithIndexLoopList | 10000 | 153,726.2 ns | 14.23 |     120 B |

Input from Thomas

After I published this article, I commented on Thomas’s blog post telling him my results then went to bed.

The next day I woke up to a PR from Thomas on my benchmark library where he helped fix an issue with my benchmark and tried out some more for loop variants.

In total he ended up writing 6 for loop variants, check out the whole list here. This was the most performant one that he came up with for arrays.

Here are the stats for this variant compared against the normal for loop.

|                               Method |     N |         Mean | Ratio | Allocated |
|------------------------------------- |------ |-------------:|------:|----------:|
|                       RegularForLoop |   100 |     85.50 ns |  1.00 |         - |
| WithIndexSpecificEnumeratorLoopArray |   100 |    738.91 ns |  8.62 |         - |
|                                      |       |              |       |           |
|                       RegularForLoop |  1000 |    775.28 ns |  1.00 |         - |
| WithIndexSpecificEnumeratorLoopArray |  1000 |  7,460.11 ns |  9.54 |         - |
|                                      |       |              |       |           |
|                       RegularForLoop | 10000 |  7,538.59 ns |  1.00 |         - |
| WithIndexSpecificEnumeratorLoopArray | 10000 | 73,774.65 ns |  9.76 |         - |

Which is about 10 times slower than a normal for loop now, but definitely better than 20 times.

Recommendation

While Thomas’s extension method solves the problem of the object and the index, it comes with a pretty big performance hit. If you need both the object and the index, I would recommend using a regular for loop.

As always, you’ll never know performance until you benchmark it. BenchmarkDotNet makes that super easy to do, so definitely something I’m going to continue doing and would recommend that you do to.

Github Benchmarking Repo