How do you implement IEnumerable<T> in C#?
What is an IEnumerable?
IEnumerable is an interface that allows an object to be, well, enumerated. That is, you can enumerate or itemize, "count off", or otherwise list items within the object. This obviously applies to collections, or any object that contains other objects (some might even call those "containers").
IEnumerable vs IEnumerable<T>
Let's get this out of the way first. There are generic and non-generic forms of IEnumerable. The non-generic IEnumerable can only enumerate values typed as object. C# 1 didn't have generics, so that was the only option.
C# 2 introduced generics and, with them, IEnumerable<T>. IEnumerable<T> enumerates, unsurprisingly, things of type T, and it also extends IEnumerable. This is great because it gives you more type safety than a plain IEnumerable, where you would have to cast every enumerated item back to what you think it should be.
The interface
Alright, so what is the IEnumerable/IEnumerable<T> interface? There is only one method for each:
- IEnumerator IEnumerable.GetEnumerator()
- IEnumerator<T> IEnumerable<T>.GetEnumerator()
Okay, so what's an IEnumerator/IEnumerator<T>? It's what actually lets you enumerate the elements of an IEnumerable. An enumerable can be enumerated by an enumerator.
What?
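Let's look at a concrete example: a foreach loop over a collection.
That's a pretty common pattern that most C# programmers (probably) use on a daily basis. What does this really look like? It's basically syntactic sugar for a using block that calls GetEnumerator() once, followed by a while loop that calls MoveNext() and reads Current on each pass.
(using just means Dispose gets called on the thing in parentheses when it goes out of scope.)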
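Here is a minimal sketch of that equivalence. The class and method names (ForeachSugar, SumWithForeach, SumDesugared) are made up for illustration, and the second method is a hand-written approximation of what the compiler emits, not its exact output:

```csharp
using System;
using System.Collections.Generic;

public static class ForeachSugar
{
    // The everyday form.
    public static int SumWithForeach(IEnumerable<int> numbers)
    {
        var sum = 0;
        foreach (int n in numbers)
        {
            sum += n;
        }
        return sum;
    }

    // Roughly the same loop with the sugar removed.
    public static int SumDesugared(IEnumerable<int> numbers)
    {
        var sum = 0;
        using (IEnumerator<int> enumerator = numbers.GetEnumerator())
        {
            // MoveNext() is called before the first read of Current.
            while (enumerator.MoveNext())
            {
                int n = enumerator.Current;
                sum += n;
            }
        }
        return sum;
    }
}
```

Both methods should return the same result for any input sequence, including an empty one.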
So GetEnumerator returns an object that implements the IEnumerator<T> interface, which is
- public bool MoveNext(): Move to the next item and return false if there are no more items to enumerate, or true if moving to the next item was successful.
- public void Reset(): Reset the enumerator to its initial position, before the first item. This isn't always supported and is allowed to throw.
- T Current { get; }: A property that refers to the current item in the enumerator.
Note that, in the code snippet above, the first thing we did was call MoveNext. That means that the IEnumerator<T> starts in a sort of "before the first one" state. So GetEnumerator().Current is never valid because you must always call MoveNext to get to the first item in the enumerable.
A naïve list
That was a bit abstract. Let's look at a solid example of how we might implement List<T>'s IEnumerator<T>: a small class that holds a reference to the list and an index field, m_Index.
Note that it starts out with m_Index = -1. This is because it is expected that you call MoveNext() to get to the first element (where m_Index = 0). This makes sense, because if we have an empty collection, then we want the first call to MoveNext() to return false so we never enter our while loop at all.
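A minimal sketch of such an enumerator might look like the following. The name NaiveListEnumerator is hypothetical, and wrapping an IReadOnlyList<T> is an assumption made to keep the block self-contained; this is not how the real List<T> does it:

```csharp
using System;
using System.Collections;
using System.Collections.Generic;

// Hypothetical type for illustration only.
public sealed class NaiveListEnumerator<T> : IEnumerator<T>
{
    private readonly IReadOnlyList<T> m_List;
    private int m_Index = -1; // the "before the first one" state

    public NaiveListEnumerator(IReadOnlyList<T> list)
    {
        m_List = list;
    }

    // Only meaningful after MoveNext() has returned true.
    public T Current => m_List[m_Index];

    // The non-generic interface also needs a Current, typed as object.
    object IEnumerator.Current => Current;

    public bool MoveNext()
    {
        if (m_Index + 1 >= m_List.Count)
        {
            return false; // nothing left, or the list was empty
        }
        m_Index++;
        return true;
    }

    public void Reset() => m_Index = -1;

    public void Dispose() { } // nothing to clean up
}
```

With an empty list, the very first MoveNext() returns false, which is the behavior the paragraph above relies on.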
No length
You'll notice the IEnumerator interface has nothing to say about the number of items in the enumerable. There's no Count or Length property, for instance. There are interfaces that support this (e.g., IReadOnlyCollection<T>), but IEnumerable<T> can only take you to each item; it can't tell you a priori how many items it will enumerate.
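To make the consequence concrete, here is a small sketch with hypothetical helper names (Counting, CountItems, CountTo). If all you have is an IEnumerable<T>, the only general way to learn the length is to walk the whole thing, unless the object also happens to implement something like IReadOnlyCollection<T>:

```csharp
using System;
using System.Collections.Generic;

public static class Counting
{
    public static int CountItems<T>(IEnumerable<T> items)
    {
        // Fast path: the object knows its own size.
        if (items is IReadOnlyCollection<T> collection)
        {
            return collection.Count;
        }

        // Slow path: enumerate everything just to count it.
        var count = 0;
        foreach (var item in items)
        {
            count++;
        }
        return count;
    }

    // An iterator method: enumerable, but with no Count to ask for.
    public static IEnumerable<int> CountTo(int n)
    {
        for (var i = 1; i <= n; i++)
        {
            yield return i;
        }
    }
}
```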
If it walks like a duck and talks like a duck..
If you know something about interfaces and how they relate to boxing allocations, you might be concerned about a foreach statement being transformed into a loop over an IEnumerator<T>, because GetEnumerator either returns a new reference type (creating garbage) or the struct it returns gets boxed into an IEnumerator<T> (creating garbage). In early C#, this is what happened. That's bad.
Duck typing allows the compiler to say "if this thing looks like an IEnumerable, that is, it has a method called GetEnumerator that returns something that looks like an IEnumerator, then I'll use it like one instead of boxing it to an IEnumerator". This is a rare case of duck typing in C#: if the object looks like an enumerable and talks like an enumerable, then it can be used as one even if it doesn't implement IEnumerable.
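Example: a Foo class with a public GetEnumerator() method that returns a nested Foo.Enumerator struct.
Now I should be able to write a foreach loop over a Foo.
Notice we didn't have to actually implement IEnumerable or IEnumerator anywhere. The compiler sees the foreach and looks for an IEnumerable-like object, namely, can I call GetEnumerator() on it and get back an object on which I can then call MoveNext() and Current? If I can, then, great: I can compile this. (Reset() is never called by foreach, and Dispose() is only called if the enumerator is disposable.)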
Notice our Foo.Enumerator is a struct. That means the compiler doesn't need to make any GC allocations at all when we do a foreach. In fact, a foreach is more like the while loop I wrote before, with one difference: instead of enumerator being declared as an IEnumerator<T>, it's just var. So if the result of GetEnumerator() is a struct, it doesn't get boxed into an IEnumerator<T>; it just gets used as-is.
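As a sketch of the duck-typed pattern, here is one way Foo could be written. The backing int array and the FooDemo helper are assumptions added for illustration. Note that neither Foo nor Foo.Enumerator implements any interface:

```csharp
using System;

public class Foo
{
    private readonly int[] m_Items = { 1, 2, 3 };

    // foreach only needs a public GetEnumerator() ...
    public Enumerator GetEnumerator() => new Enumerator(m_Items);

    // ... whose return type has MoveNext() and Current.
    public struct Enumerator
    {
        private readonly int[] m_Items;
        private int m_Index;

        internal Enumerator(int[] items)
        {
            m_Items = items;
            m_Index = -1;
        }

        public int Current => m_Items[m_Index];

        public bool MoveNext() => ++m_Index < m_Items.Length;
    }
}

public static class FooDemo
{
    public static int SumWithForeach(Foo foo)
    {
        var sum = 0;
        foreach (var item in foo)
        {
            sum += item;
        }
        return sum;
    }

    // Roughly what the foreach above turns into: the enumerator
    // stays a Foo.Enumerator struct and is never boxed.
    public static int SumByHand(Foo foo)
    {
        var sum = 0;
        var enumerator = foo.GetEnumerator();
        while (enumerator.MoveNext())
        {
            sum += enumerator.Current;
        }
        return sum;
    }
}
```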
foreach is okay again
For a long time, it was considered best practice in Unity to always avoid foreach statements, because they always boxed the IEnumerator (and maybe the IDisposable at the end). I get the sense that a lot of veterans still avoid foreach within Unity due to its history of creating GC garbage.
That's a shame, because foreach can prevent one of the most common bugs in programming: index-out-of-bounds bugs. After all, if there's no index, there's no out of bounds.
But List<T> implements IEnumerable<T>, so won't its GetEnumerator return a GC'd reference type? Nope!
List<T>.GetEnumerator() returns a specialized struct for exactly this reason. So foreach on a List<T> doesn't allocate for two reasons:
- List<T>.Enumerator is a struct (so creating one doesn't allocate)
- C# employs duck-typing (so structs don't get boxed)
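The real implementation of List<T> looks more like this (skipping some input validation for simplicity): a public GetEnumerator() method that returns a List<T>.Enumerator struct wrapping the list and an index.
So, finally, we can see how this can work with foreach in a non-allocating way. When we write a foreach over a List<T>, the compiler generates a loop that stores the List<T>.Enumerator struct in a local variable and calls MoveNext() and Current on it directly.
So no GC allocations, no boxing; it's the same as if you just wrote a basic for loop, but safer.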
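The sketch below spells this out against the real List<T>. The ListForeach class and its method names are hypothetical, and SumLowered is a hand-written approximation of the generated code, not the compiler's literal output:

```csharp
using System;
using System.Collections.Generic;

public static class ListForeach
{
    // List<T>.Enumerator is a value type, so creating one does not
    // put anything on the GC heap.
    public static bool EnumeratorIsStruct() =>
        typeof(List<int>.Enumerator).IsValueType;

    public static int SumWithForeach(List<int> list)
    {
        var sum = 0;
        foreach (var item in list)
        {
            sum += item;
        }
        return sum;
    }

    // Approximately what the foreach above lowers to.
    public static int SumLowered(List<int> list)
    {
        var sum = 0;
        List<int>.Enumerator enumerator = list.GetEnumerator();
        try
        {
            while (enumerator.MoveNext())
            {
                sum += enumerator.Current;
            }
        }
        finally
        {
            // Called on the struct itself, so no boxing here either.
            enumerator.Dispose();
        }
        return sum;
    }
}
```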
Explicit interface implementation
But the real List<T> actually implements the IEnumerable<T> interface. How does that work? We have a public GetEnumerator() that returns our custom struct, and whenever you implement an interface in C#, all its members must be public, so the GetEnumerator() from the IEnumerable<T> interface would conflict with our custom GetEnumerator(), right? Right?
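Enter explicit interface implementation. Interface methods do not have to be public if they are "explicit". So the real List<T> looks more like this: it keeps the public GetEnumerator() that returns the struct, and adds explicit IEnumerable<T>.GetEnumerator() and IEnumerable.GetEnumerator() implementations that return an IEnumerator<T> or IEnumerator instead.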
Now, the List<T> can have it both ways: it implements the IEnumerable<T> interface, which means two things:
- You understand its intended use, i.e., that it has things that can be enumerated
- You can get an IEnumerator<T> if you really need to
But it also implements the duck-typed version of GetEnumerator, so it can be used in a foreach statement without creating garbage. foreach will call the public GetEnumerator method that returns a struct, but if you need to treat it as an IEnumerable<T>, then you can invoke that interface method too.
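Here is a compact sketch of that shape. SketchList<T> is a hypothetical stand-in for illustration, not the actual List<T> source, and it wraps a fixed array to stay short:

```csharp
using System;
using System.Collections;
using System.Collections.Generic;

public class SketchList<T> : IEnumerable<T>
{
    private readonly T[] m_Items;

    public SketchList(params T[] items)
    {
        m_Items = items;
    }

    // Public, duck-typed version: returns the struct, used by foreach.
    public Enumerator GetEnumerator() => new Enumerator(this);

    // Explicit versions: only reachable through the interfaces.
    // Converting the struct to an interface type boxes it.
    IEnumerator<T> IEnumerable<T>.GetEnumerator() => GetEnumerator();
    IEnumerator IEnumerable.GetEnumerator() => GetEnumerator();

    public struct Enumerator : IEnumerator<T>
    {
        private readonly SketchList<T> m_List;
        private int m_Index;

        internal Enumerator(SketchList<T> list)
        {
            m_List = list;
            m_Index = -1;
        }

        public T Current => m_List.m_Items[m_Index];
        object IEnumerator.Current => Current;

        public bool MoveNext() => ++m_Index < m_List.m_Items.Length;
        public void Reset() => m_Index = -1;
        public void Dispose() { }
    }
}
```

A foreach over a SketchList<T> variable binds to the public method and gets the struct, while code that only sees an IEnumerable<T> goes through the explicit implementation.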
Explicit interface implementations are neat; I use them when I need to be able to call a method on a generic T, but I don't want that method available on a concrete type. That may sound a bit weird, but consider this case:
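I have a reference counted object with retain/release semantics (looking at you, Objective-C), so I have Retain and Release methods on that object. It is common for me to "retain" such an object while I use it, then "release" it when I am done, e.g., by calling Retain() before the work and Release() after it.
Wouldn't it be nice if I could just wrap that in a using? That is, I want to write a using statement around the work and have the release happen automatically at the end of the block.
But to do this, we have to implement IDisposable. So maybe my object looks like this: a class with public Retain(), Release(), and Dispose() methods, where Dispose() simply calls Release().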
Okay, cool, so now I can Retain and Release it but also use it in a using statement. That's convenient, but I don't really want Dispose() to be part of the public interface. "Release" just means we've decremented the reference count; the object may or may not be alive. "Dispose" usually means you shouldn't use the object afterwards because it has been, well, disposed. But if Dispose and Release are the same thing, then it's weird because Dispose() doesn't necessarily dispose, and it's the same as Release(), and now I'm confused.
What we really want to do is hide Dispose but implement it for the convenience of the using statement.
With explicit interface implementations, we can effectively make Dispose() private by declaring it as void IDisposable.Dispose(), with no access modifier.
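A sketch of what that could look like follows. The RefCount property, the choice to have Retain() return the object itself, and the FooUsage helper are all assumptions made for illustration:

```csharp
using System;

public sealed class Foo : IDisposable
{
    // Assumed for the sketch: a new object starts with one reference.
    public int RefCount { get; private set; } = 1;

    // Returning this lets the call sit directly inside a using.
    public Foo Retain()
    {
        RefCount++;
        return this;
    }

    public void Release()
    {
        RefCount--;
    }

    // Explicit implementation: not callable as foo.Dispose(),
    // but still found by the using statement.
    void IDisposable.Dispose() => Release();
}

public static class FooUsage
{
    public static int RetainDuringWork(Foo foo)
    {
        int countInside;
        using (foo.Retain())
        {
            countInside = foo.RefCount; // retained while we work
        }
        // The using block has called Release() for us by now.
        return countInside;
    }
}
```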
Now, we can't call Foo.Dispose() directly, but if we have an IDisposable, and it happens to be a Foo, we can still call Dispose on it. Not talking boxing here, either. Consider a generic method, DisposeIt<T>(T bar), that constrains T to IDisposable and calls bar.Dispose().
We can call DisposeIt on our Foo because our Foo is IDisposable, and our method DisposeIt doesn't "know" whether bar's Dispose() method is private or not; it just has one because it implements IDisposable. T can even be a struct; there's no boxing conversion necessary.
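The generic method can be sketched like this. The Disposer class and the Token struct are hypothetical; Token exists only to show the struct case, and its static counter is there so the effect can be observed:

```csharp
using System;

public static class Disposer
{
    // T is constrained to IDisposable, so Dispose() is callable on it
    // even when the concrete type implements it explicitly.
    public static void DisposeIt<T>(T bar) where T : IDisposable
    {
        bar.Dispose(); // for a struct T this call does not box
    }
}

// Hypothetical struct with an explicitly implemented Dispose().
public struct Token : IDisposable
{
    public static int DisposeCount;

    void IDisposable.Dispose() => DisposeCount++;
}
```

Note that a struct argument is passed by value, so DisposeIt operates on a copy of it.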
How to Enumerate
At Unity, I find myself implementing lots of types that can be enumerated. You might even say they are enumerable. The canonical form I use is something like this: a FooCollection that implements IEnumerable<T> explicitly and has a public GetEnumerator() that returns a nested struct Enumerator.
As an added bonus, the Current getter probably calls some indexer method on the FooCollection, but since you probably know the length already, you can call a fast-path method that avoids any runtime out-of-bounds checks. In some cases this can even be faster than a for loop using a regular int indexer.
You can also get more specific. If your collection does know the number of items in it (many do), then you should at least implement IReadOnlyCollection<T>, which is just an IEnumerable<T> with a Count property. There's also ICollection<T>, IList<T>, IDictionary<TKey, TValue>, and so on.
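Putting the pieces together, a collection in this style that also reports its size might be sketched as follows. FooCollection here holds plain ints in an array, which is an assumption for brevity, and it uses the ordinary bounds-checked array indexer instead of any fast path:

```csharp
using System;
using System.Collections;
using System.Collections.Generic;

public sealed class FooCollection : IReadOnlyCollection<int>
{
    private readonly int[] m_Items;

    public FooCollection(params int[] items)
    {
        m_Items = items;
    }

    // The one member IReadOnlyCollection<T> adds to IEnumerable<T>.
    public int Count => m_Items.Length;

    // Duck-typed, non-allocating path for foreach.
    public Enumerator GetEnumerator() => new Enumerator(m_Items);

    // Interface paths, hidden from the public surface.
    IEnumerator<int> IEnumerable<int>.GetEnumerator() => GetEnumerator();
    IEnumerator IEnumerable.GetEnumerator() => GetEnumerator();

    public struct Enumerator : IEnumerator<int>
    {
        private readonly int[] m_Items;
        private int m_Index;

        internal Enumerator(int[] items)
        {
            m_Items = items;
            m_Index = -1;
        }

        public int Current => m_Items[m_Index];
        object IEnumerator.Current => Current;

        public bool MoveNext() => ++m_Index < m_Items.Length;
        public void Reset() => m_Index = -1;
        public void Dispose() { }
    }
}
```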
NativeArray
Unity's NativeArray<T> is a good example of this. The NativeArray<T> is itself a struct, it implements IEnumerable<T>, and it does the duck-typing thing with a custom IEnumerator<T>.