The DefaultIfEmpty LINQ query operator
Published on September 4th, 2010 at 19:43
The DefaultIfEmpty LINQ query operator is a strange beast. According to its documentation, it:
Returns the elements of the specified sequence or the type parameter's default value in a singleton collection if the sequence is empty.
What? I thought DefaultIfEmpty had something to do with joins?
Let’s dig into this. First, let’s analyze what the documentation says. When applied to a non-empty sequence, such as {1, 2, 3}, DefaultIfEmpty will return an equivalent sequence: {1, 2, 3}. When applied to an empty sequence, {}, it will return a singleton sequence containing the default value of the sequence’s element type: {0} (assuming the original sequence was an empty sequence of integers).
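To make the documented behavior concrete, here is a minimal sketch using plain integer arrays. The variable names are made up for illustration; the overload that takes a fallback value is part of the standard Enumerable class.

```csharp
using System;
using System.Linq;

int[] nonEmpty = { 1, 2, 3 };
int[] empty = { };

// Non-empty input: the elements pass through unchanged.
int[] passedThrough = nonEmpty.DefaultIfEmpty().ToArray();

// Empty input: a single element, default(int), which is 0.
int[] defaulted = empty.DefaultIfEmpty().ToArray();

// The overload with an argument substitutes a custom fallback value instead.
int[] customFallback = empty.DefaultIfEmpty(-1).ToArray();

Console.WriteLine(string.Join(", ", passedThrough));  // 1, 2, 3
Console.WriteLine(string.Join(", ", defaulted));      // 0
Console.WriteLine(string.Join(", ", customFallback)); // -1
```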
But what would anyone use that for?
When you think about this question, it turns out that DefaultIfEmpty is quite useful when sequences are combined; for instance, with additional from clauses. Consider the following example:
    var query = from c in Cooks
                from a in c.Assistants
                select new { c, a };
The second from clause in this query (this is really a SelectMany query operator, when C#’s syntactic sugar is removed) causes a Cartesian product to be formed from the elements of the Cooks sequence and the elements of the sequences returned by the c.Assistants expression, which is evaluated once for each c.
Consider, for example, the following input data:
Cooks = { c1, c2 }
c1.Assistants = { a1, a2 }
c2.Assistants = { a3, a4 }
With this as input, the result of the query will be as follows:
{ c1, a1 }, { c1, a2 }, { c2, a3 }, { c2, a4 }
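The desugaring into SelectMany can be sketched as follows. This is only an approximation of what the compiler does, and it models the cooks as plain strings with a hypothetical assistantsOf dictionary standing in for the c.Assistants property.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-in data: cook names mapped to their assistants' names.
var assistantsOf = new Dictionary<string, string[]>
{
    ["c1"] = new[] { "a1", "a2" },
    ["c2"] = new[] { "a3", "a4" },
};
string[] cooks = { "c1", "c2" };

// Query syntax with two from clauses...
var viaQuery = from c in cooks
               from a in assistantsOf[c]
               select (c, a);

// ...and roughly the SelectMany call it is translated into.
var viaMethod = cooks.SelectMany(c => assistantsOf[c], (c, a) => (c, a));

Console.WriteLine(string.Join(", ", viaQuery));
// (c1, a1), (c1, a2), (c2, a3), (c2, a4)
```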
Now consider one of the Assistants sequences to be empty:
Cooks = { c1, c2 }
c1.Assistants = { a1, a2 }
c2.Assistants = { }
In this case, the result will be:
{ c1, a1 }, { c1, a2 }
As you can see, the result contains no entries for c2: the Cartesian product discards items from the left sequence for which there is no item in the right sequence. This is where DefaultIfEmpty comes in handy. We can rewrite the query as follows:
    var query = from c in Cooks
                from a in c.Assistants.DefaultIfEmpty()
                select new { c, a };
Now, the DefaultIfEmpty operator will guarantee that the right sequence always contains at least one element, and therefore each of the elements of the Cooks collection is contained in the result set at least once. For c2, that one element is the default value of the assistant type, which is null for a reference type:
{ c1, a1 }, { c1, a2 }, { c2, null }
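A runnable sketch of the rewritten query, again assuming cooks modeled as strings and a hypothetical assistantsOf dictionary in place of the c.Assistants property:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-in data: c2 has no assistants.
var assistantsOf = new Dictionary<string, string[]>
{
    ["c1"] = new[] { "a1", "a2" },
    ["c2"] = new string[0],
};
string[] cooks = { "c1", "c2" };

// DefaultIfEmpty turns c2's empty sequence into { null },
// so c2 still produces one row.
var query = from c in cooks
            from a in assistantsOf[c].DefaultIfEmpty()
            select (Cook: c, Assistant: a);

foreach (var row in query)
    Console.WriteLine($"{row.Cook}, {row.Assistant ?? "null"}");
// c1, a1
// c1, a2
// c2, null
```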
If you know your SQL, this will indeed remind you strongly of the concept of left outer joins, and in fact, the MSDN documentation for DefaultIfEmpty does hint at this in its Remarks section:
This method can be used to produce a left outer join when it is combined with the GroupJoin) [sic] method.
It’s true: DefaultIfEmpty can be combined with a GroupJoin to form something very similar to a SQL left outer join:
    var query = from c in Cooks
                join a in Cooks on c equals a.Assisted into assistants
                from a in assistants.DefaultIfEmpty()
                select new { c, a };
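Here is a simplified, self-contained sketch of the same pattern. Unlike the query above, it joins two separate hypothetical sequences (cook names and assistant tuples) rather than joining Cooks with itself, and it also shows roughly the GroupJoin and SelectMany calls behind the query syntax.

```csharp
using System;
using System.Linq;

// Hypothetical data: c2 is assisted by nobody.
string[] cooks = { "c1", "c2" };
var assistants = new[]
{
    (Name: "a1", Assisted: "c1"),
    (Name: "a2", Assisted: "c1"),
};

// join ... into is a GroupJoin; the extra from clause flattens each group.
// The assistant tuples are value types, so the default element for c2 is a
// tuple whose fields are null, and m.Name is safe to read.
var query = from c in cooks
            join a in assistants on c equals a.Assisted into matches
            from m in matches.DefaultIfEmpty()
            select (Cook: c, Assistant: m.Name);

// Roughly the equivalent method-syntax form.
var viaMethod = cooks
    .GroupJoin(assistants, c => c, a => a.Assisted, (c, matches) => (c, matches))
    .SelectMany(x => x.matches.DefaultIfEmpty(), (x, m) => (Cook: x.c, Assistant: m.Name));

foreach (var row in query)
    Console.WriteLine($"{row.Cook}, {row.Assistant ?? "null"}");
// c1, a1
// c1, a2
// c2, null
```

With a class instead of a tuple for the assistants, the default element would be null, and the select clause would need a null check before reading m.Name.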
But as shown above, DefaultIfEmpty can also be very useful when applied to sequences other than a GroupJoin’s result.
And, of course, it can also be used in contexts outside of joins or Cartesian products:
    // Either all cooks, or a single null if there are none
    var cooksOrDefault = Cooks.DefaultIfEmpty();

    // This should really be done using the Any() operator
    var cooksWithAssistants = from c in Cooks
                              where c.Assistants.DefaultIfEmpty().First() != null
                              select c;
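For comparison, here is a sketch of the Any() version next to the DefaultIfEmpty one, using the same kind of hypothetical stand-in data as before (cook names and an assistantsOf dictionary). The two agree as long as no assistant sequence starts with a null element.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-in data: c2 has no assistants.
var assistantsOf = new Dictionary<string, string[]>
{
    ["c1"] = new[] { "a1", "a2" },
    ["c2"] = new string[0],
};
string[] cooks = { "c1", "c2" };

// The roundabout form: an empty sequence becomes { null }, whose first element is null.
var viaDefaultIfEmpty = from c in cooks
                        where assistantsOf[c].DefaultIfEmpty().First() != null
                        select c;

// The direct form: ask whether the sequence has any element at all.
var viaAny = from c in cooks
             where assistantsOf[c].Any()
             select c;

Console.WriteLine(string.Join(", ", viaDefaultIfEmpty)); // c1
Console.WriteLine(string.Join(", ", viaAny));            // c1
```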
These situations don’t really match the concept of a left outer join (at first sight), but they’re still perfectly legal usages of DefaultIfEmpty. All things considered, the MSDN authors were probably right about their initial assessment. Generally speaking, that query operator simply:
Returns the elements of the specified sequence or the type parameter's default value in a singleton collection if the sequence is empty.
Exactly.
- Fabian