TheSharperDev

Educating about C# and F#

The Cost of Abstracting a List in C#

 Photo by JOHN TOWNER on Unsplash

In software development, there’s a tradeoff between design and optimization. As a general rule of thumb, most abstractions come at a cost.

One abstraction that I’ve never delved into is collection return type. What are the performance considerations if your method returns a List<T>, IList<T>, IEnumerable<T>, or ICollection<T>?

Collection Guidelines

Microsoft contains a handy reference page with some guidelines for collection usage. I strongly recommend reading the article, it contains a lot of relevant information for the C# developer. Below are some general purpose guidelines:

  1. In public APIs, do not use weakly typed collections, ArrayList, List<T>, Hashtable, or Dictionary<TKey, TValue>.
  2. Most members taking collections as parameters use the IEnumerable<T> interface.
  3. Generally use Collection<T> or ReadOnlyCollection<T> as return types.
  4. Do prefer collections over arrays.

Good things to keep in mind when designing APIs.

Also good to note, when they say APIs here, they mean class level APIs, usually defined through interfaces, not REST or SOAP APIs.

Benchmarking Collections

To get an idea of what a collection abstraction costs, I’ve come up with benchmarks like the following.

I have one method that returns a collection and another method that iterates through that collection. This example uses a List and returns a List. So this is my base case.

In this benchmark, I use a List again but return a IList. The return type abstracts the actual type. This will help me determine the cost of abstracting that list.

I have other benchmarks called ListAsIEnumerable, ListAsICollection and finally CollectionAsCollection to see how a List and Collection compare. They’re all the same but with a different return type. You can see more on the github project.

Then Number is just a simple object that holds a number.

I wanted to make sure there wouldn’t be any boxing that might throw off my results.

Running this benchmark outputs the following results:

|                 Method |    N |      Mean | Ratio | Allocated |
|----------------------- |----- |----------:|------:|----------:|
|             ListAsList |  100 |  1.186 us |  1.00 |   4.49 KB |
|            ListAsIList |  100 |  1.869 us |  1.58 |   4.53 KB |
|      ListAsIEnumerable |  100 |  1.893 us |  1.60 |   4.53 KB |
|      ListAsICollection |  100 |  1.868 us |  1.62 |   4.53 KB |
| CollectionAsCollection |  100 |  2.694 us |  2.27 |   4.56 KB |
|                        |      |           |       |           |
|             ListAsList | 1000 | 10.915 us |  1.00 |  39.66 KB |
|            ListAsIList | 1000 | 16.502 us |  1.51 |   39.7 KB |
|      ListAsIEnumerable | 1000 | 16.436 us |  1.49 |   39.7 KB |
|      ListAsICollection | 1000 | 15.973 us |  1.45 |   39.7 KB |
| CollectionAsCollection | 1000 | 24.831 us |  2.28 |  39.73 KB |

You can see that memory usage is very similar across the board. Speed wise things differentiate a bit. ListAsList is the fastest. ListAs another higher collection all incur roughly the same speed penalty, about 1.5 times slower than a regular List. Then using a Collection is the slowest overall, around ~2.25 slower than the regular List.

Yes a difference in performance, but I wouldn’t expect most applications would feel that impact.

Converting Back to List

I then thought that often times when dealing with a non list collection, at some point I call ToList() on it because I want a List. What kind of penalty does that incur?

So I created a series of benchmark like the following:

I used the method defined in the previous benchmark to return a List abstracted away by another type, in this case an IList. Then I simple call ToList() on it and iterate through the list. I have similar benchmarks for turning a Collection, IEnumerable and ICollection into a List.

Benchmarking that resulted in the following stats:

|            Method |    N |      Mean | Ratio | Allocated |
|------------------ |----- |----------:|------:|----------:|
|        ListAsList |  100 |  1.173 us |  1.00 |   4.49 KB |
|  CollectionToList |  100 |  2.326 us |  2.02 |   5.37 KB |
|       IListToList |  100 |  1.453 us |  1.24 |   5.34 KB |
| IEnumerableToList |  100 |  1.426 us |  1.22 |   5.34 KB |
| ICollectionToList |  100 |  1.396 us |  1.19 |   5.34 KB |
|                   |      |           |       |           |
|        ListAsList | 1000 | 10.540 us |  1.00 |  39.66 KB |
|  CollectionToList | 1000 | 19.156 us |  1.82 |  47.56 KB |
|       IListToList | 1000 | 11.256 us |  1.07 |  47.53 KB |
| IEnumerableToList | 1000 | 11.290 us |  1.07 |  47.53 KB |
| ICollectionToList | 1000 | 11.369 us |  1.08 |  47.53 KB |

A pretty similar story to the first benchmark. Speed and memory degraded a bit for the abstracted collections and Collection is again the worst.

Final Thoughts

I would say that using a more abstract collection type when defining public APIs wouldn’t impact performance of most applications or libraries.

That result surprised me a bit, I thought there would be a bigger performance hit for using abstractions. But that’s really the value in benchmarking. It allows you to actually check things which you “think” or “feel” might be true.

Till next post.

Github Repo