How to IEnumerable

How do you implement IEnumerable<T> in C#?

What is an IEnumerable?

IEnumerable is an interface that allows an object to be, well, enumerated. That is, you can enumerate or itemize, "count off", or otherwise list items within the object. This obviously applies to collections, or any object that contains other objects (some might even call those "containers").

IEnumerable vs IEnumerable<T>

Let's get this out of the way first. There is a generic and non-generic form of IEnumerable. The non-generic IEnumerable can only enumerate objects. C# 1 didn't have generics, so that was the only option.

C# 2 introduced generics and, with them, IEnumerable<T>. IEnumerable<T> enumerates, unsurprisingly, things of type T, and it also extends the non-generic IEnumerable. This is great because it gives you more type safety than a simple IEnumerable, where you would have to cast every enumerated item back to what you think it should be.
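To make the type-safety point concrete, here is a minimal sketch that sums the same numbers through both interfaces. CastingDemo and its method names are made up for illustration; only IEnumerable and IEnumerable<T> come from .NET.

```csharp
using System.Collections;
using System.Collections.Generic;

// Hypothetical helper class, for illustration only.
public static class CastingDemo
{
    // Non-generic: every element comes back as object and must be cast.
    public static int SumNonGeneric(IEnumerable items)
    {
        int sum = 0;
        foreach (object item in items)
            sum += (int)item; // throws InvalidCastException if item is not a boxed int
        return sum;
    }

    // Generic: the element type is known at compile time, so no cast is needed.
    public static int SumGeneric(IEnumerable<int> items)
    {
        int sum = 0;
        foreach (int item in items)
            sum += item;
        return sum;
    }
}
```

Passing an ArrayList that contains a string to SumNonGeneric compiles fine and fails at run time; the generic version rejects a List<string> at compile time.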

The interface

Alright, so what is the IEnumerable/IEnumerable<T> interface? There is only one method for each: GetEnumerator(), which returns an IEnumerator or an IEnumerator<T>, respectively.
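For reference, here is roughly what the two declarations look like. This is a simplified copy placed in a made-up namespace (EnumerableSketch) so it doesn't clash with the real types in System.Collections and System.Collections.Generic.

```csharp
// Simplified copies of the .NET declarations, in a hypothetical namespace.
namespace EnumerableSketch
{
    public interface IEnumerable
    {
        System.Collections.IEnumerator GetEnumerator();
    }

    // "out T" makes the interface covariant, as in the real declaration.
    public interface IEnumerable<out T> : IEnumerable
    {
        // Hides the non-generic method with a strongly typed one.
        new System.Collections.Generic.IEnumerator<T> GetEnumerator();
    }
}
```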

Okay, so what's an IEnumerator/IEnumerator<T>? It's what actually lets you enumerate the elements of an IEnumerable. An enumerable can be enumerated by an enumerator.

What?

Let's look at a concrete example:

var foos = new List<Foo>();
...
foreach (var foo in foos)
{
    // do a thing with foo
}

That's a pretty common pattern that most C# programmers (probably) use on a daily basis. What does this really look like? It's basically syntactic sugar for

using (IEnumerator<Foo> enumerator = foos.GetEnumerator())
{
    while (enumerator.MoveNext())
    {
        var foo = enumerator.Current;
        // do something with foo
    }
}

(using just means Dispose gets called on the thing in parentheses when it goes out of scope.)

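So GetEnumerator returns an object that implements the IEnumerator<T> interface, which boils down to three members: MoveNext(), Current, and Reset() (plus Dispose(), which the generic IEnumerator<T> inherits from IDisposable).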
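For reference, the enumerator interfaces look roughly like this. Again, this is a simplified copy in a made-up namespace (EnumeratorSketch), not the actual .NET source.

```csharp
// Simplified copies of the .NET declarations, in a hypothetical namespace.
namespace EnumeratorSketch
{
    public interface IEnumerator
    {
        object Current { get; } // the item the enumerator is positioned on
        bool MoveNext();        // advance; false once we run off the end
        void Reset();           // go back to "before the first one"
    }

    public interface IEnumerator<out T> : System.IDisposable, IEnumerator
    {
        new T Current { get; }  // strongly typed version of Current
    }
}
```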

Note that, in the desugared foreach loop above, the first thing we did was call MoveNext. That means that the IEnumerator<T> starts in a sort of "before the first one" state. So GetEnumerator().Current is never valid because you must always call MoveNext to get to the first item in the enumerable.

A naïve list

That was a bit abstract. Let's look at a solid example of how we might implement List<T>'s IEnumerator<T>:

public class Enumerator<T>
{
    int m_Index = -1;
    List<T> m_List;

    public Enumerator(List<T> list)
    {
        m_List = list;
    }

    public bool MoveNext()
    {
        m_Index++; // advance the index
        return m_Index < m_List.Count; // Return true if m_Index is valid
    }

    public void Reset()
    {
        m_Index = -1; // Reset index to one before the beginning
    }

    public T Current => m_List[m_Index]; // Return the item at m_Index
}

Note that it starts out with m_Index = -1. This is because it is expected that you call MoveNext() to get to the first element (where m_Index = 0). This makes sense, because if we have an empty collection, then we want the first call to MoveNext() to return false so we never enter our while loop at all.

No length

You'll notice the IEnumerator interface has nothing to say about the number of items in the enumerable. There's no Count or Length property, for instance. There are interfaces that support this (e.g., IReadOnlyCollection<T>), but the IEnumerable<T> can only take you to each item; it can't tell you a priori how many items it will enumerate.

If it walks like a duck and talks like a duck..

If you know something about interfaces and how they relate to boxing allocations, you might be concerned about a statement like

foreach (var foo in foos)

being transformed to

IEnumerator<T> enumerator = foos.GetEnumerator();
while (enumerator.MoveNext()) { ... }

The worry is that GetEnumerator either returns a new reference type (creating garbage) or returns a struct that then gets boxed into an IEnumerator<T> (also creating garbage). With some older compilers (Unity's old Mono toolchain is the usual example), this is effectively what happened. That's bad.

Duck typing allows the compiler to say "if this thing looks like an IEnumerable, that is, it has a method called GetEnumerator that returns something that looks like an IEnumerator, then I'll use it like one instead of boxing it to an IEnumerator". This is a rare case of "duck typing" in C#, where, if the object looks like an enumerable, and talks like an enumerable, then it can be used as one even if it doesn't implement IEnumerable.

Example:

class Foo
{
    Bar[] m_Bars = new Bar[0]; // hypothetical backing storage so the sketch compiles
    public int Count => m_Bars.Length;
    public Bar this[int index] => m_Bars[index];
    public Enumerator GetEnumerator() => new Enumerator(this);

    public struct Enumerator
    {
        int m_Index;
        Foo m_Foo;
        internal Enumerator(Foo foo)
        {
            m_Index = -1; // start one before the first item
            m_Foo = foo;
        }
        public bool MoveNext() => ++m_Index < m_Foo.Count;
        public Bar Current => m_Foo[m_Index];
        public void Reset() => m_Index = -1;
        public void Dispose() {}
    }
}

Now I should be able to write

var foo = new Foo();
foreach (var bar in foo)
{
    // bar is an item in foo
}

Notice we didn't have to actually implement IEnumerable or IEnumerator anywhere. The compiler sees the foreach and looks for an IEnumerable-like object, namely, can I call GetEnumerator() on it and get back an object on which I can then call MoveNext() and read Current? If I can, then, great: I can compile this. (Strictly speaking, the pattern only requires MoveNext() and Current; Reset() and Dispose() are there to mirror the IEnumerator<T> interface.)

Notice our Foo.Enumerator is a struct. That means the compiler doesn't need to make any GC allocations at all when we do a foreach. In fact, a foreach is more like if we wrote this:

using (var enumerator = foos.GetEnumerator())
{
    while (enumerator.MoveNext())
    {
        var foo = enumerator.Current;
        // do something with foo
    }
}

That's very similar to what I wrote before, but this time, instead of enumerator being an IEnumerator<T>, it's just var. So if the result of GetEnumerator() is a struct, it doesn't get boxed into an IEnumerator<T>; it just gets used as-is.

foreach is Okay again

There was a long time where it was considered best practice in Unity to always avoid foreach statements, because they always boxed the IEnumerator (and maybe the IDisposable at the end). I get the sense that a lot of veterans still avoid foreach within Unity due to its past history of creating GC garbage.

That's a shame, because foreach can prevent one of the most common bugs in programming: index-out-of-bounds bugs. After all, if there's no index, there's no out of bounds.

But, List<T> implements IEnumerable<T>, so won't its GetEnumerator return a GC'd reference type? Nope!

List<T>.GetEnumerator() returns a specialized struct for exactly this reason. So foreach on a List doesn't allocate for two reasons:

  1. List<T>.Enumerator is a struct (so creating one doesn't allocate)
  2. C# employs duck-typing (so structs don't get boxed)

The real implementation of List<T> looks more like this (skipping some input validation for simplicity):

public class List<T>
{
    T[] m_Items;
    public int Count { get; private set; }
    public int Capacity { get; private set; }
    public T this[int index]
    {
        get => m_Items[index];
        set => m_Items[index] = value;
    }
    public Enumerator GetEnumerator() => new Enumerator(this);

    public struct Enumerator
    {
        int m_Index;
        List<T> m_List;
        public Enumerator(List<T> list)
        {
            m_List = list;
            m_Index = -1; // start one before the first item
        }
        // A nested type has to go through m_List to reach the outer instance's members.
        public bool MoveNext() => ++m_Index < m_List.Count;
        public T Current => m_List.m_Items[m_Index];
        public void Reset() => m_Index = -1;
        public void Dispose() {}
    }
}

So, finally, we can see how this can work with foreach in a non-allocating way. When we write

var list = new List<T>();
foreach (var item in list) { ... }

The compiler generates

using (List<T>.Enumerator enumerator = list.GetEnumerator())
{
    while (enumerator.MoveNext())
    {
        T item = enumerator.Current;
        // ...
    }
}

So no GC allocations, no boxing; it's the same as if you just wrote a basic for loop, but safer.

Explicit interface implementation

But, the real List<T> actually implements the IEnumerable<T> interface. How does that work? We have

public interface IEnumerable<T>
{
    IEnumerator<T> GetEnumerator();
}

and whenever you implement an interface in C#, all its members must be public, so the GetEnumerator() from the IEnumerable<T> interface would conflict with our custom GetEnumerator() that returns a custom struct, right? Right?

Enter explicit interface implementation. Interface methods do not have to be public, if they are "explicit". So the real List<T> looks more like this:

public class List<T> : IEnumerable<T>
{
    T[] m_Items;
    public int Count { get; private set; }
    public int Capacity { get; private set; }
    public T this[int index]
    {
        get => m_Items[index];
        set => m_Items[index] = value;
    }
    // Public, duck-typed version: returns the struct, so foreach doesn't allocate.
    public Enumerator GetEnumerator() => new Enumerator(this);
    // Explicit implementations: only reachable through the interfaces (boxes the struct).
    IEnumerator<T> IEnumerable<T>.GetEnumerator() => GetEnumerator();
    IEnumerator IEnumerable.GetEnumerator() => GetEnumerator();

    public struct Enumerator : IEnumerator<T>
    {
        int m_Index;
        List<T> m_List;
        public Enumerator(List<T> list)
        {
            m_List = list;
            m_Index = -1; // start one before the first item
        }
        public bool MoveNext() => ++m_Index < m_List.Count;
        public T Current => m_List.m_Items[m_Index];
        object IEnumerator.Current => Current;
        public void Reset() => m_Index = -1;
        public void Dispose() {}
    }
}

Now, the List<T> can have it both ways: it implements the IEnumerable<T> interface, which means two things:

  1. You understand its intended use, i.e., that it has things that can be enumerated
  2. You can get an IEnumerator<T> if you really need to

But, it also implements the duck-typed version of GetEnumerator, so it can be used in a foreach statement without creating garbage. foreach will call the public GetEnumerator method that returns a struct, but if you need to treat it as an IEnumerator<T>, then you can invoke that interface method too.
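You can check this against the real List<T> yourself. The sketch below is only a demonstration; ListEnumeratorDemo and its method names are made up, while List<int>.Enumerator and the interface calls are the actual .NET API.

```csharp
using System.Collections.Generic;

// Hypothetical helper class, for illustration only.
public static class ListEnumeratorDemo
{
    // The public, duck-typed method returns the List<int>.Enumerator struct.
    public static bool PublicEnumeratorIsStruct()
    {
        var list = new List<int> { 1, 2, 3 };
        List<int>.Enumerator e = list.GetEnumerator();
        return e.GetType().IsValueType;
    }

    // Going through the interface selects the explicit implementation,
    // which hands back an IEnumerator<int> (the struct gets boxed).
    public static int SumThroughInterface()
    {
        IEnumerable<int> enumerable = new List<int> { 1, 2, 3 };
        int sum = 0;
        using (IEnumerator<int> e = enumerable.GetEnumerator())
        {
            while (e.MoveNext())
                sum += e.Current;
        }
        return sum;
    }
}
```

Both paths visit the same items; the only difference is whether the enumerator lives on the stack or on the heap.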

Explicit interface implementations are neat; I use them when I need to be able to call a method on a generic T, but I don't want that method available on a concrete type. That may sound a bit weird, but consider this case:

I have a reference counted object with retain/release semantics (looking at you, Objective-C), so I have Retain and Release methods on that object. It is common for me to "retain" such an object while I use it, then "release" it when I am done. E.g.,

myObject.Retain();

if (foo) return; // oops...forgot to release it!
if (bar) doSomething(myObject);

myObject.Release(); // make sure to release that resource when I'm done

Wouldn't it be nice if I could just wrap that in a using? That is, I want to write

myObject.Retain();
using (myObject)
{
    if (foo) return; // Auto-released!
    if (bar) doSomething(myObject);

} // don't worry about releasing it

But to do this, we have to implement IDisposable. So maybe my object looks like this:

public struct Foo : IDisposable
{
    public void Retain() { ... }
    public void Release() { ... }
    public void Dispose() => Release();
}

Okay, cool, so now I can Retain and Release it but also use it in a using statement. That's convenient, but I don't really want Dispose() to be part of the public interface. "Release" just means we've decremented the reference count; the object may or may not be alive. "Dispose" usually means you shouldn't use the object afterwards because it has been, well, disposed. But if Dispose and Release are the same thing, then it's weird because Dispose() doesn't necessarily dispose, and it's the same as Release(), and now I'm confused.

What we really want to do is hide Dispose but implement it for the convenience of the using statement.

With explicit interface implementations, we can make Dispose() private:

public struct Foo : IDisposable
{
    public void Retain() { ... }
    public void Release() { ... }
    void IDisposable.Dispose() => Release();
}

Now, we can't call Foo.Dispose() directly, but if we have an IDisposable, and it happens to be a Foo, we can still call Dispose on it. Not talking boxing here, either. Consider this:

static void DisposeIt<T>(T bar) where T : IDisposable
{
    bar.Dispose();
}

We can call DisposeIt on our Foo because our Foo is IDisposable, and our method DisposeIt doesn't "know" whether bar's Dispose() method is private or not; it just has one because it implements an IDisposable. T can even be a struct -- there's no boxing conversion necessary.
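Here is a minimal, compilable sketch of the whole idea. The names RefCounted, Counter, and DisposeDemo are made up for illustration, and the reference count lives in a small helper class only so that copies of the struct share it; a real implementation would more likely wrap a native handle.

```csharp
using System;

// Hypothetical shared state so that copies of the struct see the same count.
public sealed class Counter { public int RefCount; }

public struct RefCounted : IDisposable
{
    readonly Counter m_Counter;
    public RefCounted(Counter counter) { m_Counter = counter; }
    public void Retain() => m_Counter.RefCount++;
    public void Release() => m_Counter.RefCount--;
    void IDisposable.Dispose() => Release(); // not part of RefCounted's public surface
}

public static class DisposeDemo
{
    // Constrained generic call: no boxing, even when T is a struct.
    public static void DisposeIt<T>(T item) where T : IDisposable => item.Dispose();

    public static int Run()
    {
        var counter = new Counter();
        var obj = new RefCounted(counter);

        obj.Retain();      // count is 1
        using (obj) { }    // using calls IDisposable.Dispose, which calls Release; count is 0

        obj.Retain();      // count is 1
        DisposeIt(obj);    // count is 0
        // obj.Dispose();  // would not compile: Dispose is only visible through IDisposable
        return counter.RefCount;
    }
}
```

Run() returns 0: every Retain is balanced, once by the using statement and once by the generic helper, without Dispose ever showing up on RefCounted itself.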

How to Enumerate

At Unity, I find myself implementing lots of types that can be enumerated. You might even say they are enumerable. The canonical form I use is something like this

public struct FooCollection : IEnumerable<Foo>
{
    public Enumerator GetEnumerator() => new Enumerator(this);
    IEnumerator IEnumerable.GetEnumerator() => GetEnumerator(); // boxing conversion if used, but required by IEnumerable interface
    IEnumerator<Foo> IEnumerable<Foo>.GetEnumerator() => GetEnumerator(); // boxing conversion if used, but required by IEnumerable<Foo> interface

    public struct Enumerator : IEnumerator<Foo>
    {
        FooCollection m_Collection;
        FooIterator m_Iterator; // e.g., int
        internal Enumerator(FooCollection collection)
        {
            m_Collection = collection;
            m_Iterator.Initialize(); // whatever one-before-the-first is
        }
        public bool MoveNext()
        {
            m_Iterator.GoToTheNextOne();
            return m_Iterator.IsValid();
        }
        public Foo Current => m_Collection.GetAt(m_Iterator);
        object IEnumerator.Current => Current; // required by IEnumerator interface, but doesn't need to be public
        public void Reset() => m_Iterator.Reset(); // (or throw if not possible)
        public void Dispose() => m_Iterator.Dispose(); // usually not necessary
    }
}

As an added bonus, the Current getter probably calls some indexer method on the FooCollection, but since you probably know the length already, you can call a fast-path method that avoids any runtime out-of-bounds checks. In some cases this can even be faster than a for...loop using a regular int indexer.

You can also get more specific. If your collection does know the number of items in it (many do), then you should at least implement IReadOnlyCollection<T>, which is just an IEnumerable<T> with a Count property. There's also ICollection<T>, IList<T>, IDictionary<TKey, TValue>, and so on.
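As a sketch of what that looks like, here is a tiny collection that knows its own length. IntRange is a made-up type for illustration; it follows the same canonical form as above, with IReadOnlyCollection<int> in place of IEnumerable<int>.

```csharp
using System.Collections;
using System.Collections.Generic;

// Hypothetical collection: the integers start, start + 1, ..., start + count - 1.
public sealed class IntRange : IReadOnlyCollection<int>
{
    readonly int m_Start;
    public int Count { get; } // the one member IReadOnlyCollection<T> adds

    public IntRange(int start, int count) { m_Start = start; Count = count; }

    public Enumerator GetEnumerator() => new Enumerator(m_Start, Count);
    IEnumerator<int> IEnumerable<int>.GetEnumerator() => GetEnumerator();
    IEnumerator IEnumerable.GetEnumerator() => GetEnumerator();

    public struct Enumerator : IEnumerator<int>
    {
        readonly int m_Start;
        readonly int m_Count;
        int m_Index;

        internal Enumerator(int start, int count)
        {
            m_Start = start;
            m_Count = count;
            m_Index = -1; // one before the first item
        }

        public bool MoveNext() => ++m_Index < m_Count;
        public int Current => m_Start + m_Index;
        object IEnumerator.Current => Current;
        public void Reset() => m_Index = -1;
        public void Dispose() { }
    }
}
```

Callers that only need the size can read Count directly instead of walking the whole thing, and foreach still uses the struct enumerator.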

NativeArray

Unity's NativeArray<T> is a good example of this. The NativeArray is itself a struct, it implements IEnumerable<T>, and it does the duck-typing thing with a custom IEnumerator<T>.