C# Design Notes for Dec 7 and Dec 14, 2016 #16709
Comments
Previous comment of mine: #15619 (comment)
TL;DR Why do the translations need to nest the range variables in some kind of container? That's an unnecessary allocation. Why not just project the result of the expression as an inutterable range variable? That eliminates an allocation and a projection.
    strings
        .Select(s => new { s = s, <r>_w = int.TryParse(s, out int i), i = i })
        .Where(p => p.<r>_w)
That said I do like the idea of emitting an optimized form specifically for
    strings
        .Select(s => int.TryParse(s, out int i) ? new { s, i } : null)
        .Where(p => p != null)
Hopefully LINQ will adopt tuples for projections in the future which would theoretically make that moot and take care of some of the performance concerns in general.
I can see where definite assignment would be a concern. My opinion would be that the introduced variable is mutable within the expression and that it is always projected to an immutable range variable. However, if the expression that declares the variable does not definitely assign it then it would be an error to reference it further in the query. I think that would afford enough flexibility to make it useful for the majority of scenarios with an escape hatch for more hairy situations.
I disagree with all of the decisions here except for the do-while.
I must respectfully disagree with each statement in the paragraph. I have had very different experiences.
Reports of query expressions' death are greatly exaggerated. My experience is quite the opposite. I see them used significantly more often than their method counterparts, especially by developers who are first embracing LINQ. Query expressions already have special syntax for projecting additional range variables; this fits right in with that. You're free to continue using the method syntax and you're free to manage and project any pattern variables or out declaration variables manually.
@MgSam I think that it really depends on what you need and who you are. People who aren't familiar with functional programming and might not understand what projection even means may find query expressions very appealing. It's easy to forget that not all programmers are engineers, not all of them have a CS degree, and not all of them come with a mathematical background. Some people learnt IT and moved into programming, so they lack quite a few math courses (at least in my uni there's a big difference), and finally there are those who are self-taught. There are plenty of web programmers who use C# and are actually self-taught, so for these people query expressions might be a good starting point. Just a simple
To many people the latter version would make a lot more sense than the former.
Like others here, I disagree with your views on LINQ. I also, to an extent, disagree with @eyalsk's "it's syntax for beginners" views. I have been using LINQ since it was first introduced and learned both syntax forms. I also (I like to think) have a reasonable understanding of functional programming. Yet, I use the query syntax by default, preferring its expressiveness when compared to method chaining. I only fall back to method chaining when the resultant query becomes too cumbersome, or when I need features not offered by the query syntax.
Would you mind clarifying the differences between the conclusion in #16640, regarding irrefutable patterns, that "This seems harmless, and will grow more useful over time. It's a small tweak that we should do." and these design notes? My interpretation is that it was felt to be a good idea in October, but upon revisiting the matter, it's been decided that it "is not worth making special affordances for" these irrefutable patterns and the language rules will remain as-is. Have I got that right, or got myself muddled, please?
Like @DavidArno, I prefer it for its visual aesthetics and readability. I'll type whichever is quicker and simpler and that's often query syntax.
My point of view here is pragmatic, not ideological. It is undeniable that other languages do not have query syntax and get along fine without it. It is undeniable that query syntax has been gimped since the day it was created, as it can access only a small subset of the overall expressiveness of LINQ. It is undeniable that query syntax offers no additional functionality over the alternative syntax. Whether you think it looks beautiful and elegant or not, and given these facts, how is it worth spending time designing for it when there are far more valuable features the team could be working on? Is using expression variables in query expressions really a feature that the design team should be spending capital on? This goes back to the argument I've been making for a long time on these forums: the prioritization of the C# team the past few years has been awful. They meander from feature to feature, seemingly without direction, and spend inordinate amounts of time working on minor features that will have almost no real world benefits.
@MgSam Far more valuable than saving time and keystrokes? If saving time and keystrokes isn't pragmatic, I don't know what is! 😆
@MgSam I don't think anyone would disagree that query syntax can only represent a subset of the overall expressiveness of LINQ, but we could say the same about I would like to see query syntax expanded, not deprecated. Fix the areas that are painful or otherwise come up short compared with method syntax (#100, #1938, #3486, #3571, #6877, #8221, #9273, #15638, etc.).
Would having these optimizations actually require them being mandated by the language? I believe the situation with the two forms of LINQ is:
It seems to me that making the optimization for
@MgSam You might be interested to read this recent article analyzing LINQ usage in GitHub projects. My conclusion based on that data is that method syntax is used about twice as much as query syntax (compare the numbers for
@DavidArno Where did I say that it's a syntax for beginners? 😆 ❤️ I said that it might be good starting point for people that don't understand or don't know how to use the alternative!
My apologies: I misunderstood what you were saying therefore.
I think the analysis that method syntax is used more often than query syntax doesn't tell us much. The two are not equivalent. Certain things are only possible with method syntax (.ToList, First, etc.), and some things are just nicer/shorter in method syntax (e.g. just do a .Where(x => x), whereas with query syntax you always need the closing select). Other things are much nicer in query syntax, mostly multiple selects as shown above and certainly let statements, which are such a pain to do with method syntax. The point is it's not just user preference. Personally I try to use one style for each query, but I do use both. Given the choice I would prefer to use query syntax as IMO it just reads nicer.
I still can't wrap my head around how to do 'joins' without using the convenient query syntax. I find the method form incredibly difficult to wrap my head around.
@CyrusNajmabadi LINQ's motto should be when in doubt use/master Didn't check performance but these should be similar in terms of results:
@eyalsk meh, the join (with the opportunity to switch to hash matching) will scale far better than the nested loops version. At least that's true for the join algos I've written.
@jnm2 Yeah probably. :)
@eyalsk I ran that exact query and consistently get around 2-4x better performance with
To be clear there's nothing wrong with what @eyalsk suggested if that happens to be the easiest way for you to think about it. No point in wasting time optimizing for perf unless it ends up on a hot path. Personally, I find the
Guys, it was meant to be a joke, kinda but thanks for the elaboration. :)
What's a joke?
@DavidArno regarding irrefutable patterns: the difference between the decision in #16640 and here is that the one in #16640 was about definite assignment, and was a small tweak to it. Furthermore it enables useful code that was otherwise prohibited. The one here is about reachability, which currently very clearly only takes the value of constant expressions into account. We don't want to break with that principle just for this example of limited usefulness. Furthermore it would introduce a new diagnostic, not allow more code to work.
Thanks for the clarification, @MadsTorgersen. Makes sense to me now.
LDM notes for Dec 7 and Dec 14 2016 are available at https://github.com/dotnet/csharplang/blob/master/meetings/2016/LDM-2016-12-07-14.md
C# Language Design Notes for Dec 7 and Dec 14, 2016
Agenda
Expression variables in query expressions
It seems desirable to allow expression variables in query clauses to be available in subsequent clauses:
The idea is that the `i` introduced in the `where` clause becomes a sort of extra range variable for the query, and can be used in the `select` clause. It would even be definitely assigned there, because the compiler is smart enough to figure out that variables that are "definitely assigned when true" in a where clause expression would always be definitely assigned in subsequent clauses.
This is intriguing, but when you dig in it does raise a number of questions.
Translation
How would a query like that be translated into calls of existing query methods? In the example above we would need to split the `where` clause into a call to `Select` to compute both the boolean result and the expression variable `i`, then a call to `Where` to filter out those where the boolean result was false. For instance:
That first `Select` call is pretty unappetizing. We can do better, though, by using a trick: since we know that the failure case is about to be weeded out by the `Where` clause, why bother constructing an object for it? We can just null out the whole anonymous object to signify failure:
Much better!
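Both translation strategies can be tried out by hand in method syntax, where expression variables inside a lambda are ordinary C#. The following is only a sketch, assuming the example query filters a sequence of strings with `int.TryParse` and selects the parsed value; the names `strings`, `w`, `naive` and `nulled` are illustrative and not the names the compiler would generate.

```csharp
using System;
using System.Linq;

var strings = new[] { "1", "two", "3" };

// Naive translation: every element gets an anonymous object that carries
// the boolean result (w) and the out variable (i), even when parsing fails.
var naive = strings
    .Select(s => new { s, w = int.TryParse(s, out int i), i })
    .Where(p => p.w)
    .Select(p => p.i)
    .ToArray();

// Null-out trick: an object is only allocated when the condition holds;
// failures are represented by null and dropped by the Where call.
var nulled = strings
    .Select(s => int.TryParse(s, out int i) ? new { s, i } : null)
    .Where(p => p != null)
    .Select(p => p.i)
    .ToArray();

Console.WriteLine(string.Join(", ", naive));
Console.WriteLine(string.Join(", ", nulled));
```

Both pipelines yield the same values; the second simply avoids allocating a carrier object for elements that are about to be filtered out.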
Other query clauses
We haven't really talked through how this would work for other kinds of query clauses. We'd have to go through them one by one and establish what the meaning is of expression variables in each expression in each kind of query clause. Can they all be propagated, and is it meaningful and reasonable to achieve?
Mutability
One thing to note is that range variables are immutable, while expression variables are mutable. We don't have the option of making expression variables mutable across a whole query, so we would need to make them immutable either:
Having them be mutable inside their own query clause would allow for certain coding patterns such as:
Here `i` is introduced and then mutated in the same query clause.
The above translation approaches would accommodate this "mutable then immutable" semantics if we choose to adopt it.
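One pattern that fits this description can be written today in method syntax. This is a sketch under the assumption that the clause combines a type pattern with an `out` argument targeting the same variable; the input `objects` and the result name `ints` are made up for illustration.

```csharp
using System;
using System.Linq;

var objects = new object[] { 1, "2", "x", 3.5, "40" };

var ints = objects
    // 'i' is declared by the pattern 'o is int i'. If that pattern fails,
    // the same variable is assigned through 'out i' later in the expression.
    // Once the whole condition is true, 'i' is definitely assigned.
    .Select(o => (o is int i || (o is string s && int.TryParse(s, out i)))
        ? (int?)i
        : null)
    // From here on the value is only read, matching the
    // "mutable then immutable" idea.
    .Where(x => x.HasValue)
    .Select(x => x.Value)
    .ToArray();

Console.WriteLine(string.Join(", ", ints));
```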
Performance
With a naive query translation scheme, this could lead to a lot of hidden allocations even when an expression variable is not used in a subsequent clause. Today's query translation already has the problem of indiscriminately carrying forward all range variables, regardless of whether they are ever needed again. This feature would exacerbate that issue.
We could think in terms of language-mandated query optimizations, where the compiler is allowed to shed range variables once they are never referenced again, or at least if they are never referenced outside of their introducing clause.
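The cost of carrying range variables forward is already visible with `let`. The sketch below shows a query, an approximation of its method-call expansion, and a hand-written form that sheds the unused variable; the data and the names `query`, `expanded` and `shed` are illustrative, and the expansion uses an ordinary anonymous type where the compiler would use its own generated identifier.

```csharp
using System;
using System.Linq;

var strings = new[] { "a", "bcd", "ef", "ghij" };

// Query syntax: 'n' is an extra range variable introduced by 'let'.
var query = (from s in strings
             let n = s.Length
             where n > 2
             select s).ToArray();

// Approximate expansion: each element is wrapped in an object that
// carries both 's' and 'n', although 'n' is never used after the filter.
var expanded = strings
    .Select(s => new { s, n = s.Length })
    .Where(t => t.n > 2)
    .Select(t => t.s)
    .ToArray();

// What an optimization that sheds 'n' could reduce this to: no carrier object.
var shed = strings.Where(s => s.Length > 2).ToArray();

Console.WriteLine(string.Join(", ", query));
```

All three produce the same sequence; only the number of intermediate allocations and calls differs.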
Blocking off
We won't have time to do this feature in C# 7.0. If we want to leave ourselves room to do it in the future, we need to make sure that we don't allow expression variables in query clauses to mean something else today, that would contradict such a future.
The current semantics is that expression variables in query clauses are scoped to only the query clause. That means two subsequent query clauses can use the same name in expression variables, for instance. That is inconsistent with a future that allows those variables to share a scope across query clause boundaries.
Thus, if we want to allow this in the future we have to put in some restrictions in C# 7.0 to protect the design space. We have a couple of options:
The former is a big hammer, but the latter requires a lot of work to get right - and seems at risk for not blocking off everything well enough.
Deconstruction
A related feature request is to allow deconstruction in the query clauses that introduce new range variables:
This, again, would simply introduce extra range variables into the query, and would sort of be equivalent to the tedious manual unpacking:
Except that we could do a much better job of translating the query into fewer calls:
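To make the comparison concrete, here is a sketch of what can be written without the proposed feature. It assumes a sequence of coordinate tuples and an origin (`points`, `x0`, `y0` are illustrative names); the first form unpacks manually with `let` clauses, the second does all the unpacking inside a single `Select`, which is roughly the kind of saving a smarter translation could achieve.

```csharp
using System;
using System.Linq;

var points = new[] { (3.0, 4.0), (6.0, 8.0) };
double x0 = 0.0, y0 = 0.0;

// Tedious manual unpacking: each 'let' adds another projection step.
var manual = (from p in points
              let x = p.Item1
              let y = p.Item2
              let dx = x - x0
              let dy = y - y0
              select Math.Sqrt(dx * dx + dy * dy)).ToArray();

// Single call: deconstruction happens inside one lambda body.
var single = points
    .Select(p =>
    {
        var (x, y) = p;
        var (dx, dy) = (x - x0, y - y0);
        return Math.Sqrt(dx * dx + dy * dy);
    })
    .ToArray();

Console.WriteLine(string.Join(", ", manual));
Console.WriteLine(string.Join(", ", single));
```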
Conclusion
We will do neither expression variables nor deconstruction in query clauses in C# 7.0, but would like to do them in the future. In order to protect our ability to do this, we will completely disallow expression variables inside query clauses, even though this is quite a big hammer.
Irrefutable patterns and reachability
We could be smarter about reachability around irrefutable patterns:
We could consider being smart, and realizing that the condition is always true, so the else clause is not reachable.
By comparison, though, in current C# we don't try to reason about non-constant conditions:
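The difference between constant and non-constant conditions shows up in which code the compiler requires. The sketch below assumes an irrefutable pattern of the form `i is int j` applied to an `int`; the function names are made up for illustration.

```csharp
using System;

// Constant condition: reachability analysis uses the constant value, so the
// end of the function is unreachable and no trailing return is required.
static int ConstantCondition()
{
    if (true) return 1;
}

// Pattern that always matches: the condition is not a constant expression,
// so the compiler treats the fall-through path as reachable and insists on
// a return there, even though it can never execute.
static int PatternCondition(int i)
{
    if (i is int j) return j;
    return -1;
}

Console.WriteLine(ConstantCondition());
Console.WriteLine(PatternCondition(7));
```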
Conclusion
This is not worth making special affordances for. Let's stick with current semantics, and not introduce a new concept for "not constant, but we know it's true".
Do-while loop scope
In the previous meeting we decided that while loops should have narrow scope for expression variables introduced in their condition. We did not explicitly say that the same is the case for do-while, but it is.
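A small sketch of what narrow scope means for do-while: a variable declared in the condition is visible only within that condition, so two loops in the same block can each declare their own `n`, even with different types. The queues and counter names are illustrative.

```csharp
using System;
using System.Collections.Generic;

var first = new Queue<string>(new[] { "1", "2", "stop", "4" });
int firstIterations = 0;
do
{
    // 'n' from the condition below is not in scope in the loop body.
    firstIterations++;
}
while (int.TryParse(first.Dequeue(), out int n) && n > 0);

var second = new Queue<string>(new[] { "10", "0", "7" });
int secondIterations = 0;
do
{
    secondIterations++;
}
// A separate 'n', here of type long, because the first one did not leak out.
while (long.TryParse(second.Dequeue(), out long n) && n > 0);

Console.WriteLine($"{firstIterations} {secondIterations}");
```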