
2010 09 04 the defaultifempty linq query operator

Fabian Schmied edited this page Sep 4, 2010 · 1 revision

Published on September 4th, 2010 at 19:43

The DefaultIfEmpty LINQ query operator

The DefaultIfEmpty LINQ query operator is a strange beast. According to its documentation, it:

Returns the elements of the specified sequence or the type parameter's default value in a singleton collection if the sequence is empty.

What? I thought DefaultIfEmpty had something to do with joins?

Let’s dig into this; first, let’s analyze what the documentation says. When applied to a non-empty sequence, such as {1, 2, 3}, DefaultIfEmpty will return an equivalent sequence: {1, 2, 3}. When applied to an empty sequence, {}, it will return a singleton sequence containing the default value of the sequence’s element type: {0} (assuming the original sequence was an empty sequence of integers).
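To make both cases concrete, here is a minimal sketch using LINQ to Objects. The variable names are made up for illustration; the overload that takes an explicit default value is part of the standard Enumerable class, although the text above does not discuss it.

```csharp
using System;
using System.Linq;

// Non-empty input: the elements pass through unchanged.
int[] numbers = { 1, 2, 3 };
Console.WriteLine(string.Join(", ", numbers.DefaultIfEmpty()));   // prints "1, 2, 3"

// Empty input: a single element, default(int), is produced.
int[] none = Array.Empty<int>();
Console.WriteLine(string.Join(", ", none.DefaultIfEmpty()));      // prints "0"

// The overload with an argument substitutes a caller-chosen value instead.
Console.WriteLine(string.Join(", ", none.DefaultIfEmpty(-1)));    // prints "-1"
```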

But what would anyone use that for?

When you think about this question, it turns out that DefaultIfEmpty is quite useful when sequences are combined; for instance, with additional from clauses. Consider the following example:

var query = from c in Cooks
            from a in c.Assistants
            select new { c, a };

The second from clause in this query (this is really a SelectMany query operator, once C#’s syntactic sugar is removed) forms something like a Cartesian product: each element of the Cooks sequence is paired with every element of the sequence returned by the c.Assistants expression, which is evaluated once for each c.
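For readers who prefer method syntax, the following sketch shows roughly what the two from clauses turn into. The tuple-based cooks array is a hypothetical in-memory stand-in for the Cooks sequence, not the original domain model.

```csharp
using System;
using System.Linq;

// Hypothetical in-memory data: each cook is a (Name, Assistants) tuple.
var cooks = new[]
{
    (Name: "c1", Assistants: new[] { "a1", "a2" }),
    (Name: "c2", Assistants: new[] { "a3", "a4" }),
};

// Query syntax, as in the text.
var viaQuerySyntax = from c in cooks
                     from a in c.Assistants
                     select new { Cook = c.Name, Assistant = a };

// Method syntax: the first lambda yields the inner sequence for each c,
// the second lambda builds one result per (c, a) pair.
var viaMethodSyntax = cooks.SelectMany(
    c => c.Assistants,
    (c, a) => new { Cook = c.Name, Assistant = a });

foreach (var pair in viaMethodSyntax)
    Console.WriteLine($"{pair.Cook}, {pair.Assistant}");
```

Both forms produce the same four pairs in the same order.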

Consider, for example, the following input data:

Cooks = { c1, c2 }  
c1.Assistants = { a1, a2 }  
c2.Assistants = { a3, a4 }

With this as input, the result of the query will be as follows:

{ c1, a1 }, { c1, a2 }, { c2, a3 }, { c2, a4 }

Consider, now, one of the Assistants sequences to be empty:

Cooks = { c1, c2 }  
c1.Assistants = { a1, a2 }  
c2.Assistants = { }

In this case, the result will be:

{ c1, a1 }, { c1, a2 }

As you can see, the result contains no entries for c2: the Cartesian product discards items from the left sequence for which there is no item in the right sequence. This is where DefaultIfEmpty comes in handy. We can rewrite the query as follows:

var query = from c in Cooks
            from a in c.Assistants.DefaultIfEmpty()
            select new { c, a };

Now, the DefaultIfEmpty operator will guarantee that the right sequence always contains at least one element, and therefore, each of the elements of the Cooks collection is contained in the result set at least once:

{ c1, a1 }, { c1, a2 }, { c2, null }
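The difference is easy to reproduce with LINQ to Objects. This is only a sketch: the tuple-based cooks array and the names withoutDefault and withDefault are invented for the example.

```csharp
using System;
using System.Linq;

// Hypothetical in-memory data: c2 has no assistants.
var cooks = new[]
{
    (Name: "c1", Assistants: new[] { "a1", "a2" }),
    (Name: "c2", Assistants: Array.Empty<string>()),
};

// Without DefaultIfEmpty, c2 contributes nothing to the result.
var withoutDefault = (from c in cooks
                      from a in c.Assistants
                      select new { Cook = c.Name, Assistant = a }).ToList();

// With DefaultIfEmpty, the empty inner sequence becomes { null },
// so c2 appears once, paired with null.
var withDefault = (from c in cooks
                   from a in c.Assistants.DefaultIfEmpty()
                   select new { Cook = c.Name, Assistant = a }).ToList();

Console.WriteLine(withoutDefault.Count);   // prints "2"
foreach (var row in withDefault)
{
    var assistant = row.Assistant ?? "null";
    Console.WriteLine($"{row.Cook}, {assistant}");
}
```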

If you know your SQL, this will indeed remind you strongly of the concept of left outer joins, and in fact, the MSDN documentation for DefaultIfEmpty does hint at this in its Remarks section:

This method can be used to produce a left outer join when it is combined with the GroupJoin) [sic] method.

It’s true: DefaultIfEmpty can be combined with a GroupJoin to form something very similar to a SQL left outer join:

var query = from c in Cooks
            join a in Cooks on c equals a.Assisted into assistants
            from a in assistants.DefaultIfEmpty()
            select new { c, a };
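Here is a runnable approximation of that pattern. It is a sketch under simplifying assumptions: cooks are plain strings, assistants are anonymous objects with a hypothetical Assisted key, and the second range variable is called m. The method-syntax version shows the GroupJoin and SelectMany calls that the query syntax corresponds to.

```csharp
using System;
using System.Linq;

// Hypothetical flat data: each assistant names the cook they assist.
string[] cooks = { "c1", "c2" };
var assistants = new[]
{
    new { Name = "a1", Assisted = "c1" },
    new { Name = "a2", Assisted = "c1" },
};

// Query syntax: group join, then flatten with DefaultIfEmpty.
var viaQuerySyntax = (from c in cooks
                      join a in assistants on c equals a.Assisted into matches
                      from m in matches.DefaultIfEmpty()
                      select new { Cook = c, Assistant = m?.Name }).ToList();

// Method syntax: GroupJoin builds (cook, matches) pairs, SelectMany flattens them.
var viaMethodSyntax = cooks
    .GroupJoin(assistants, c => c, a => a.Assisted, (c, matches) => new { c, matches })
    .SelectMany(x => x.matches.DefaultIfEmpty(),
                (x, m) => new { Cook = x.c, Assistant = m?.Name })
    .ToList();

foreach (var row in viaQuerySyntax)
{
    var assistant = row.Assistant ?? "null";
    Console.WriteLine($"{row.Cook}, {assistant}");
}
```

Because m is null for a cook without assistants, the projection uses m?.Name rather than m.Name.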

But as shown above, DefaultIfEmpty can also be very useful when applied to sequences other than a GroupJoin’s result.

And, of course, it can also be used in contexts outside of joins or Cartesian products:

var query = Cooks.DefaultIfEmpty();

// This should really be done using the Any() operator
var query = from c in Cooks
            where c.Assistants.DefaultIfEmpty().First() != null
            select c;
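For comparison, this sketch puts the DefaultIfEmpty filter next to the Any() version that the comment recommends. The data is again a hypothetical tuple array. Note that the two filters only agree as long as no Assistants sequence starts with a null element.

```csharp
using System;
using System.Linq;

// Hypothetical in-memory data: c2 has no assistants.
var cooks = new[]
{
    (Name: "c1", Assistants: new[] { "a1", "a2" }),
    (Name: "c2", Assistants: Array.Empty<string>()),
};

// The roundabout way: an empty sequence becomes { null }, whose first element is null.
var viaDefaultIfEmpty = (from c in cooks
                         where c.Assistants.DefaultIfEmpty().First() != null
                         select c.Name).ToList();

// The direct way: ask whether the sequence has any element at all.
var viaAny = (from c in cooks
              where c.Assistants.Any()
              select c.Name).ToList();

Console.WriteLine(string.Join(", ", viaDefaultIfEmpty));   // prints "c1"
Console.WriteLine(string.Join(", ", viaAny));              // prints "c1"
```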

These situations don’t really match the concept of a left outer join (at first sight), but they’re still perfectly legal usages of DefaultIfEmpty. All things considered, the MSDN authors were probably right about their initial assessment. Generally speaking, that query operator simply:

Returns the elements of the specified sequence or the type parameter's default value in a singleton collection if the sequence is empty.

Exactly.

- Fabian
