From the Curiosity Cabinet: .ForEach()

Af Marek Vokáč, Opdateret om 7. okt. 2022
push_pin
star

Sometimes you discover somthing in a corner of the code and think, "it must have seemed like a good idea at the time". As you delve into it, you get a sense of why; and at the same time, understand why it wasn't through a somewhat deeper insight. Get a coffee and bear with me while I explain.

This time it's about LINQ in .NET and how it works. I expect everyone who uses C# is familiar with it, and even loves the power. Here in SuperOffice we've generally landed on the fluent-chain-of-methods syntax rather than the sql-like one. To me that's because it saves me from a mental gear-change. But in any case, the resulting code is the same and the same thing happens.

One of the core features of LINQ is deferred execution. When you say

var things = myCollection.Where(x => x.ShouldKeep).Select(x => x.TheThing);

then nothing is actually done - yet. myCollection is not called upon to enumerate its members, no filtering and no projection happens. Only the chain is built. Then, when you say

foreach(var item in things) { ... }

... and only then ... are the operations performed.

This comes from the signatures of the methods - they take an IEnumerable, and return an IEnumerable. The important aspect here is that an IEnumerable is something that can be enumerated/iterated over; but it's not an enumerator/iterator in itself and it certainly doesn't mean the thing-to-be-enumerated is an already-existing collection. It's very powerful.

But this deferral of execution needs to end, sooner or later: we actually want things to happen and we want to see the result. In the case of a foreach loop, we get handed the items one at a time. Another way is to say .ToList(), or .ToArray(), .ToDictionary() (one of my favourites); there are others. Their return types are not IEnumerable - instead they return some result; and their implementations contain a foreach loop that "pulls" on the incoming IEnumerable, thereby triggering the execution of the whole pipeline.

With this background, consider the following code:

    public static IEnumerable ForEach(this IEnumerable enumeration, Action action)
    {
        if (enumeration == null)
            throw new ArgumentNullException("enumeration");
        if (action == null)
            throw new ArgumentNullException("action");

        foreach (var item in enumeration)
        {
            action(item);
            yield return item;
        }
    }

With this in place, you might think it reasonable to write

listOfAppointments.ForEach(m => Assert.IsFalse(String.IsNullOrEmpty(m.AssociateName)));

But... no. The catch here is that the ForEach method does have a loop iterating over the incoming elements, sure... but in turn it yields elements through its return value. Therefore, it's not going to do anything unless the next method in the chain performs an iteration. It simply won't be called, because of the special way IEnumerable works and how the compiler generates code. In the example above, there is no next method; these Asserts were simply never executed. I'm pretty sure the author of that code didn't think that was going to (not?) happen.

Years ago I got yelled at by a partner because of this, and because our .ForEach was in the global namespace and masking his own (better) implementation - which took him some time to figure out.

I recently removed this curious and confusing construction. While LINQ is great for data set processing, I still think that updating elements in place or complex processing can just as well be in an explicit foreach loop. It makes it more clear that this stuff is actually happening to each element. The somewhat contrived "convert foreach to LINQ" example on Microsoft Learn is actually a case of a foreach loop that yields something - obviously a LINQ candidate, but very different from a loop that computes or stores things (what I'd call an action).  Of course this is a matter of opinion. But I think everyone agrees that our little curiosity, a .ForEach that doesn't actually execute by itself, is not what you expect or want.

There. Done, removed, get on with Friday :-)

Marek