Coroutines #53
Closed
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,337 @@ | ||
- Start Date: (fill me in with today's date, 2014-04-24) | ||
- RFC PR #: (leave this empty) | ||
- Rust Issue #: (leave this empty) | ||
|
||
# Summary | ||
|
||
Add "shallow" coroutines similar to Python's generators. | ||
|
||
# Motivation | ||
|
||
This feature would simplify the implementation of all sorts of code that needs to expose a "push" interface | ||
but would rather be "pull" internally. | ||
Examples of such code include: collection iterators, lexers, parsers, servers that handle large numbers of long-running concurrent requests, etc. | ||
|
||
# Drawbacks | ||
|
||
Extra language complexity? | ||
|
||
# Detailed design | ||
|
||
### New keywords | ||
This proposal introduces two new Rust keywords: **coro** and **yield**. | ||
|
||
"yield" is already a reserved keyword, and while, in principle, adding "coro" could be avoided(+), I think that co-routine behavior is different enough from regular lambdas to merit a new keyword. | ||
|
||
(+) We could use the regular lambda syntax and infer that a lambda is a co-routine from the presence of **yield** in its body, the same way it's done in Python. | ||
|
||
### Syntax | ||
```text | ||
coro_expr : 'coro' [ '(' [ param_list ] ')' [ '->' type ] ] block ; | ||
param_list : param [',' param ]* ; | ||
param : ident [ ':' type ] ; | ||
``` | ||
A co-routine declaration may appear in the same places as a "normal" lambda function. Top-level functions cannot be co-routines because co-routines need an environment block, just like lambdas. | ||
|
||
Just like normal lambdas, co-routines may close over variables of the containing function. | ||
|
||
A co-routine body may include the same statements as a normal lambda function, with the addition of the **yield** expression: | ||
```text | ||
yield_expr : 'yield' [ expr ] ; | ||
``` | ||
|
||
### Example | ||
|
||
```rust | ||
fn main() { | ||
    let coroutine = coro (a1: A1, a2: A2, a3: A3) { | ||
        ... | ||
        let mut i = 10; | ||
        while i >= 0 { | ||
            ... | ||
            let (b1, b2, b3) = yield y1; | ||
            ... | ||
            let (c1, c2, c3) = yield y2; | ||
            ... | ||
            i -= 1; | ||
        } | ||
        ... | ||
        return result; | ||
    }; | ||
|
||
    // Consumer | ||
    let mut result = coroutine(a1, a2, a3); | ||
    while result.is_yield() { | ||
        // ... process result | ||
        result = coroutine(b1, b2, b3); | ||
    } | ||
} | ||
``` | ||
|
||
### Semantics | ||
The type of the closure in the example above is ```fn(A1, A2, A3) -> CoResult<Y, R>```, where: | ||
* A1,A2,A3 are the types of co-routine arguments, | ||
* a tuple (A1,A2,A3) is the return type of **yield** expressions in the co-routine body, | ||
* Y is the inferred super-type of all yielded values, | ||
* R is the inferred super-type of all returned values (including the tail expression), | ||
* CoResult is defined as follows: | ||
```rust | ||
enum CoResult<Y,R> { | ||
    Yield(Y), | ||
    Return(R) | ||
} | ||
|
||
impl<Y,R> CoResult<Y,R> { | ||
    fn is_yield(&self) -> bool {...} | ||
    fn is_return(&self) -> bool {...} | ||
    fn unwrap_yield(self) -> Y {...} | ||
    fn unwrap_return(self) -> R {...} | ||
} | ||
``` | ||
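For reference, the `CoResult` type above can be written out in full in present-day stable Rust. This is only a sketch: the method bodies are not given in the proposal, so the ones below (including the panic messages) are assumptions about the intended behaviour.

```rust
/// Result of resuming a co-routine: an intermediate value or the final one.
#[derive(Debug, PartialEq)]
enum CoResult<Y, R> {
    Yield(Y),
    Return(R),
}

impl<Y, R> CoResult<Y, R> {
    fn is_yield(&self) -> bool {
        matches!(self, CoResult::Yield(_))
    }

    fn is_return(&self) -> bool {
        matches!(self, CoResult::Return(_))
    }

    /// Extracts the yielded value; panics on `Return` (assumed behaviour).
    fn unwrap_yield(self) -> Y {
        match self {
            CoResult::Yield(y) => y,
            CoResult::Return(_) => panic!("called unwrap_yield on a Return value"),
        }
    }

    /// Extracts the returned value; panics on `Yield` (assumed behaviour).
    fn unwrap_return(self) -> R {
        match self {
            CoResult::Return(r) => r,
            CoResult::Yield(_) => panic!("called unwrap_return on a Yield value"),
        }
    }
}
```

The `unwrap_*` methods take `self` by value so that the payload can be moved out without requiring `Clone`.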
|
||
The first time `coroutine` is called, execution starts at the top, and the passed parameters are assigned to a1, a2 and a3. When execution reaches the first **yield** expression, control is returned to the caller, and the result is `Yield(y1)`. | ||
|
||
The second time `coroutine` is called, execution resumes immediately after the last executed **yield** expression, the value of which will be a tuple of parameters passed in by the caller. | ||
|
||
And so on. | ||
|
||
When execution reaches a **return** statement, or falls off the end of the co-routine body, the co-routine returns for the last time, passing back the returned value wrapped in `Return()`, i.e. `Return(result)`. | ||
|
||
Further attempts to invoke `coroutine` shall cause a task failure. | ||
|
||
### 'Physics' of co-routines | ||
|
||
The above example is translated into something like this (assume for a second that Rust supports a **goto** statement): | ||
```rust | ||
struct Closure { | ||
    state: int, | ||
    i: int, | ||
    // closed-over variables of the containing function (upvars) also go here | ||
} | ||
|
||
impl Closure { | ||
    pub fn call(&mut self, a1: A1, a2: A2, a3: A3) -> CoResult<Y, R> { | ||
        // "master switch": jump to the point where execution last stopped | ||
        match self.state { | ||
            0 => goto state_0, | ||
            1 => goto state_1, | ||
            2 => goto state_2, | ||
            _ => fail!("invalid state") | ||
        } | ||
    state_0: | ||
        ... | ||
        self.i = 10; | ||
        while self.i >= 0 { | ||
            ... | ||
            self.state = 1; | ||
            return Yield(y1); | ||
    state_1: | ||
            let (b1, b2, b3) = (a1, a2, a3); | ||
            ... | ||
            self.state = 2; | ||
            return Yield(y2); | ||
    state_2: | ||
            let (c1, c2, c3) = (a1, a2, a3); | ||
            ... | ||
            self.i -= 1; | ||
        } | ||
        ... | ||
        self.state = -1; | ||
        return Return(result); | ||
    } | ||
} | ||
``` | ||
And on the caller side: | ||
``` | ||
// `i` gets its real initial value when state_0 runs | ||
let mut coroutine = ~Closure { state: 0, i: 0 }; | ||
|
||
let mut result = coroutine.call(a1, a2, a3); | ||
while result.is_yield() { | ||
    // ... process result | ||
    result = coroutine.call(b1, b2, b3); | ||
} | ||
``` | ||
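The same translation can be approximated in present-day stable Rust, which has neither `coro` nor `goto`, by replacing the jumps with a `loop` around a `match` on the saved state. The sketch below is a hand-written illustration, not compiler output. The co-routine body it encodes (shown in the doc comment), the `State` names and the `drive` helper are all made up for this example.

```rust
enum CoResult<Y, R> {
    Yield(Y),
    Return(R),
}

/// Where to continue on the next call (plays the role of the integer `state`).
enum State {
    Start,
    LoopHead,
    AfterYield1,
    AfterYield2,
    Done,
}

/// Hand-written equivalent of the hypothetical co-routine
///   coro (a: i32) {
///       let mut i = 2; let mut sum = a;
///       while i > 0 {
///           let b = yield sum;       sum += b;
///           let c = yield sum * 10;  sum += c;
///           i -= 1;
///       }
///       return sum;
///   }
struct Closure {
    state: State,
    i: i32,   // local hoisted into the closure
    sum: i32, // local hoisted into the closure
}

impl Closure {
    fn new() -> Self {
        Closure { state: State::Start, i: 0, sum: 0 }
    }

    /// `a` is the initial argument on the first call and the value of the
    /// pending `yield` expression on every later call.
    fn call(&mut self, a: i32) -> CoResult<i32, i32> {
        loop {
            match self.state {
                State::Start => {
                    self.i = 2;
                    self.sum = a;
                    self.state = State::LoopHead;
                }
                State::LoopHead => {
                    if self.i > 0 {
                        self.state = State::AfterYield1;
                        return CoResult::Yield(self.sum);
                    }
                    self.state = State::Done;
                    return CoResult::Return(self.sum);
                }
                State::AfterYield1 => {
                    self.sum += a;
                    self.state = State::AfterYield2;
                    return CoResult::Yield(self.sum * 10);
                }
                State::AfterYield2 => {
                    self.sum += a;
                    self.i -= 1;
                    self.state = State::LoopHead;
                }
                State::Done => panic!("co-routine resumed after completion"),
            }
        }
    }
}

/// Consumer: resumes the co-routine with 1, 2, 3, ... until it returns.
fn drive() -> (Vec<i32>, i32) {
    let mut co = Closure::new();
    let mut yielded = Vec::new();
    let mut arg = 1;
    loop {
        match co.call(arg) {
            CoResult::Yield(y) => {
                yielded.push(y);
                arg += 1;
            }
            CoResult::Return(r) => return (yielded, r),
        }
    }
}
```

The extra `LoopHead` state exists only because the loop condition has to be re-checked without a `goto`. A compiler working on a control-flow graph would not need it.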
|
||
### Implementation notes | ||
|
||
I believe that, for the most part, Rust compilation passes may treat co-routines just like normal lambdas, and | ||
in the liveness checking, borrow checking, type inference, and similar passes, **yield** expressions may be treated similarly to function calls. | ||
|
||
Changes in type inference pass: | ||
* Types of \<expr\> in all **yield** expressions are sub-typed to the Y type parameter of the co-routine return value. | ||
* Types of \<expr\> in all **return** statements are sub-typed to the R type parameter of the co-routine return value. | ||
|
||
Changes in IR generation: | ||
* 'state' variable is added into the closure. | ||
* Local variables whose lifetime straddles any **yield** expression are hoisted into the closure. Note that this only moves their storage location; their lifetimes stay intact. | ||
* A "master switch" is added at the top of the function to transfer control to the right location, according to current state. | ||
* **yield** expressions are transformed into the equivalent of | ||
``` | ||
self.state = <N>; | ||
return Yield(<expr>); | ||
state_<N>: | ||
``` | ||
|
||
* **return** statements and the tail expression are translated into `return Return(<expr>)` | ||
|
||
### Once-ness | ||
|
||
Although, superficially, it would seem that co-routine closures are invoked multiple times, semantically this is not so, because each resumption continues at the point where execution was interrupted. In this regard co-routines would be similar to `once fn`s and should be able to move variables out of their environment. | ||
|
||
### Cleanup | ||
|
||
When a "normal" lambda closure goes out of scope, Rust runs destructors for all of the closure's fields. | ||
With co-routines, the liveness of locals hoisted into the closure depends on its current state. | ||
If a co-routine runs to completion, all is well, because locals will have been disposed of in the course of normal execution. | ||
However, when a co-routine closure gets destroyed "prematurely", some extra clean-up will be needed: | ||
|
||
``` | ||
impl Drop for Closure { | ||
    fn drop(&mut self) { | ||
        match (self.state) { | ||
            1 => { /* clean-up locals alive around state_1 */ }, | ||
            2 => { /* clean-up locals alive around state_2 */ } | ||
            ... | ||
            _ => () | ||
        } | ||
    } | ||
} | ||
|
||
``` | ||
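One way to see the state-dependent clean-up in action, in present-day Rust, is to swap the flat struct plus hand-written `Drop` for an enum with one variant per suspension point, each carrying only the locals that are live there. The compiler-generated drop glue then destroys exactly those locals. This is a different encoding from the one proposed above, and every name below (`Noisy`, `State`, `abandon_at`) is made up for the illustration.

```rust
use std::cell::RefCell;
use std::rc::Rc;

/// A stand-in for a hoisted local; records its name in a shared log when dropped.
struct Noisy {
    name: &'static str,
    log: Rc<RefCell<Vec<&'static str>>>,
}

impl Drop for Noisy {
    fn drop(&mut self) {
        self.log.borrow_mut().push(self.name);
    }
}

/// One variant per suspension point, holding the locals live at that point.
#[allow(dead_code)]
enum State {
    AtYield1 { x: Noisy },
    AtYield2 { x: Noisy, y: Noisy },
    Done,
}

/// Builds a co-routine suspended at the given point, destroys it
/// "prematurely", and reports which locals were cleaned up.
fn abandon_at(point: u8) -> Vec<&'static str> {
    let log = Rc::new(RefCell::new(Vec::new()));
    let noisy = |name: &'static str| Noisy { name, log: Rc::clone(&log) };
    let co = match point {
        1 => State::AtYield1 { x: noisy("x") },
        2 => State::AtYield2 { x: noisy("x"), y: noisy("y") },
        _ => State::Done,
    };
    drop(co); // the co-routine never runs to completion
    let seen = log.borrow().clone();
    seen
}
```

Here `abandon_at(1)` reports only `x`, `abandon_at(2)` reports `x` and then `y`, and a finished co-routine has nothing left to clean up.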
|
||
|
||
# Alternatives | ||
|
||
It is possible to implement similar functionality via monads or, more likely, via something similar to | ||
[F# computation expressions](http://msdn.microsoft.com/en-us/library/dd233182.aspx) (because Rust has | ||
control flow statements, which aren't functions). | ||
|
||
Note that even in F#, the built-in sequence expressions are implemented as a state machine similar to the above; | ||
presumably because the compiler is not Smart Enough(tm) to optimize a spaghetti of lambda functions into a | ||
state machine. | ||
|
||
# Unresolved questions | ||
|
||
- For iterators we'd want to return unboxed closures. Can we haz unboxed closures? | ||
- What is the syntax for heap-allocated coroutines? `~coro() {...}`? `coro proc() {...}`? `box coro() {...}`? | ||
|
||
Hopefully, the impending closure reform will resolve these issues for regular lambdas, and coroutines can piggy-back on that design. | ||
|
||
|
||
# More examples | ||
|
||
The following examples assume that closure reform has resulted in syntax similar to [this](http://glaebhoerl.tumblr.com/rust_closure_types), | ||
i.e. lambdas implement `trait Fn<Arg1, Arg2, ..., Ret>`. | ||
|
||
### Iterators | ||
|
||
With the help of the following adapter, | ||
```rust | ||
impl<T> Iterator<T> for FnMut<CoResult<T,()>> { | ||
    fn next(&mut self) -> Option<T> { | ||
        match self.call() { | ||
            Yield(x) => Some(x), | ||
            Return(_) => None | ||
        } | ||
    } | ||
} | ||
``` | ||
... we can implement a collection iterator in "procedural" style: | ||
```rust | ||
impl<'self,T> ImmutableVector<'self, T> for &'self [T] { | ||
    fn iter(self) -> Iterator<'self, T> { | ||
        coro { | ||
            let mut i = 0; | ||
            while i < self.len() { | ||
                yield self[i]; | ||
                i += 1; // advance, otherwise the loop never terminates | ||
            } | ||
        } | ||
    } | ||
} | ||
``` | ||
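In present-day stable Rust the adapter cannot be implemented directly on `FnMut`, because coherence rules forbid a blanket `Iterator` impl over a bare type parameter, but a thin wrapper gives the same effect. In the sketch below the wrapper `CoIter` and the function `iter_slice` are hypothetical names, and the closure is a hand-written stand-in for what `coro { ... }` would generate.

```rust
enum CoResult<Y, R> {
    Yield(Y),
    Return(R),
}

/// Wraps any resumable closure that yields `T` and finally returns `()`.
struct CoIter<F>(F);

impl<T, F> Iterator for CoIter<F>
where
    F: FnMut() -> CoResult<T, ()>,
{
    type Item = T;

    fn next(&mut self) -> Option<T> {
        match (self.0)() {
            CoResult::Yield(x) => Some(x),
            CoResult::Return(()) => None,
        }
    }
}

/// Slice iterator written as a closure with the hoisted local `i`.
fn iter_slice<'a, T>(items: &'a [T]) -> impl Iterator<Item = &'a T> + 'a {
    let mut i = 0;
    CoIter(move || {
        if i < items.len() {
            let item = &items[i];
            i += 1;
            CoResult::Yield(item)
        } else {
            CoResult::Return(())
        }
    })
}
```

Unlike the proposed semantics, this stand-in closure keeps answering `Return(())` when resumed after completion instead of failing, which is convenient for iterators that may be polled again after returning `None`.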
|
||
### Double-ended iterators | ||
|
||
Similarly, a double ended iterator can be implemented as follows: | ||
```rust | ||
enum IterEnd { | ||
    Head, | ||
    Tail | ||
} | ||
|
||
impl<T> DoubleEndedIterator<T> for FnMut<IterEnd, CoResult<T,()>> { | ||
    fn next(&mut self) -> Option<T> { | ||
        match self.call(Tail) { | ||
            Yield(x) => Some(x), | ||
            Return(_) => None | ||
        } | ||
    } | ||
|
||
    fn next_back(&mut self) -> Option<T> { | ||
        match self.call(Head) { | ||
            Yield(x) => Some(x), | ||
            Return(_) => None | ||
        } | ||
    } | ||
} | ||
|
||
impl<'self,T> ImmutableVector<'self, T> for &'self [T] { | ||
    fn iter(self) -> DoubleEndedIterator<'self, T> { | ||
        coro(which_end: IterEnd) { | ||
            let mut i = 0; | ||
            let mut j = self.len(); | ||
            while i < j { | ||
                match which_end { | ||
                    Tail => { | ||
                        which_end = yield self[i]; | ||
                        i += 1; | ||
                    }, | ||
                    Head => { | ||
                        j -= 1; | ||
                        which_end = yield self[j]; | ||
                    } | ||
                } | ||
            } | ||
        } | ||
    } | ||
} | ||
``` | ||
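The double-ended variant can be sketched the same way in present-day stable Rust, with the argument passed on each resumption selecting which end to yield from. The names `CoDeIter`, `iter_both` and the `Front`/`Back` variants are choices made for this sketch (the proposal calls them `Tail` and `Head`), and the closure again stands in for a compiler-generated co-routine.

```rust
enum CoResult<Y, R> {
    Yield(Y),
    Return(R),
}

/// Which end of the sequence the caller wants the next item from.
enum IterEnd {
    Front,
    Back,
}

/// Wraps a resumable closure that takes the requested end on every call.
struct CoDeIter<F>(F);

impl<T, F> Iterator for CoDeIter<F>
where
    F: FnMut(IterEnd) -> CoResult<T, ()>,
{
    type Item = T;

    fn next(&mut self) -> Option<T> {
        match (self.0)(IterEnd::Front) {
            CoResult::Yield(x) => Some(x),
            CoResult::Return(()) => None,
        }
    }
}

impl<T, F> DoubleEndedIterator for CoDeIter<F>
where
    F: FnMut(IterEnd) -> CoResult<T, ()>,
{
    fn next_back(&mut self) -> Option<T> {
        match (self.0)(IterEnd::Back) {
            CoResult::Yield(x) => Some(x),
            CoResult::Return(()) => None,
        }
    }
}

/// `i` and `j` are the hoisted locals; the range `i..j` is what is left.
fn iter_both<'a, T>(items: &'a [T]) -> impl DoubleEndedIterator<Item = &'a T> + 'a {
    let mut i = 0;
    let mut j = items.len();
    CoDeIter(move |which_end: IterEnd| {
        if i >= j {
            return CoResult::Return(());
        }
        match which_end {
            IterEnd::Front => {
                let item = &items[i];
                i += 1;
                CoResult::Yield(item)
            }
            IterEnd::Back => {
                j -= 1;
                CoResult::Yield(&items[j])
            }
        }
    })
}
```

Because both ends share the single condition `i >= j`, the two cursors can never cross, so no element is yielded twice.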
|
||
### Asynchronous I/O | ||
|
||
This presumes the existence of an `Awaitable<T>` trait, which encapsulates a pending async IO operation, | ||
as well as the existence of an event loop that select()'s on an array of Awaitables and dispatches IO completions | ||
to the corresponding callbacks. | ||
|
||
(If you are familiar with C# asyncs, think `yield` == `await` and `Awaitable<T>` == `Task<T>`). | ||
|
||
```rust | ||
|
||
type AsyncIO<T> = FnMut< CoResult<Awaitable, IOResult<T>> >; | ||
|
||
fn copy_async(from: AsyncReader, into: AsyncWriter, buffer_size: uint) -> AsyncIO<i64> | ||
{ | ||
    coro { | ||
        let mut total: i64 = 0; | ||
        let buffer = ~[u8, ..buffer_size]; | ||
        loop { | ||
            // AsyncReader.read_async() returns Awaitable<IOResult<i64>> | ||
            let read_result = yield from.read_async(buffer); | ||
            match read_result { | ||
                Err(err) => return Err(err), | ||
                Ok(count) => { | ||
                    total += count; | ||
                    // AsyncWriter.write_async() returns Awaitable<IOResult<()>> | ||
                    yield into.write_async(buffer.slice(0, count)); | ||
                } | ||
            } | ||
        } | ||
        Ok(total) | ||
    } | ||
} | ||
|
||
fn start_async_copy() { | ||
    ... | ||
    event_loop.register(copy_async(from, into, 1024)); | ||
    ... | ||
} | ||
``` |
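To make the division of labour between co-routine and event loop concrete, here is a much-simplified sketch in present-day stable Rust. Instead of `Awaitable`s and a real event loop, the co-routine yields plain `Request` values and a synchronous driver performs each one and resumes the co-routine with the `Completion`. All of these names are hypothetical, I/O errors are handled by the driver rather than passed back into the co-routine, and an empty read is assumed to mean end of input.

```rust
use std::io::{Read, Write};

enum CoResult<Y, R> {
    Yield(Y),
    Return(R),
}

/// I/O the co-routine asks its driver to perform (stand-in for `Awaitable`).
enum Request {
    Read(usize),
    Write(Vec<u8>),
}

/// Outcome handed back to the co-routine when it is resumed.
enum Completion {
    Started,
    ReadDone(Vec<u8>),
    WriteDone,
}

/// Hand-written state machine for the copy loop; `total` is a hoisted local.
struct CopyCo {
    total: u64,
    buffer_size: usize,
}

impl CopyCo {
    fn resume(&mut self, done: Completion) -> CoResult<Request, u64> {
        match done {
            // First call, or a write has finished: ask for the next chunk.
            Completion::Started | Completion::WriteDone => {
                CoResult::Yield(Request::Read(self.buffer_size))
            }
            // Assumption: an empty read signals end of input.
            Completion::ReadDone(chunk) if chunk.is_empty() => CoResult::Return(self.total),
            Completion::ReadDone(chunk) => {
                self.total += chunk.len() as u64;
                CoResult::Yield(Request::Write(chunk))
            }
        }
    }
}

/// Minimal synchronous "event loop": performs each request and resumes.
fn run_copy(mut from: impl Read, mut into: impl Write, buffer_size: usize) -> std::io::Result<u64> {
    let mut co = CopyCo { total: 0, buffer_size };
    let mut done = Completion::Started;
    loop {
        done = match co.resume(done) {
            CoResult::Return(total) => return Ok(total),
            CoResult::Yield(Request::Read(n)) => {
                let mut buf = vec![0u8; n];
                let count = from.read(&mut buf)?;
                buf.truncate(count);
                Completion::ReadDone(buf)
            }
            CoResult::Yield(Request::Write(chunk)) => {
                into.write_all(&chunk)?;
                Completion::WriteDone
            }
        };
    }
}
```

A real event loop would park the co-routine until the operating system reports the operation complete, and would interleave many such co-routines. The resume-with-completion protocol stays the same.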
@thestinger confused me a bit about the subject, but my last findings suggest Python is misusing the word "coroutines", which traditionally are a superset of "generators" (the difference being the ability to `yield` through multiple stack frames).

True, that's why I called them "shallow" coroutines.