GH-98831: Remove super-instruction definitions, use macro instructions instead #100124
Conversation
Not gonna lie, the PR title scared me.
(force-pushed from 6706827 to eadca51)
Clearly I should use provocative PR titles more often. :-)
#define _COMPARE_OP_FLOAT 1003
#define _COMPARE_OP_INT 1004
#define _COMPARE_OP_STR 1005
#define _JUMP_IF 1006
Should these just be zeroes, too? Or even -1, just to emphasize that they aren't real?
Yeah, they should all be zeros. Since the value is never used I'm not sure what the point of -1 would be.
Python/generated_cases.c.h (Outdated)
TARGET(LOAD_FAST__LOAD_FAST) {
    PyObject *_tmp_1;
    PyObject *_tmp_2;
    {
Random idea... any chance we could annotate the output of macros to make it easier to see what the different parts correspond to? So, like, this line would be:
{ // LOAD_FAST
And we would have { // JOIN and another { // LOAD_FAST below? Not sure how hard this is.
@@ -115,6 +117,14 @@ dummy_func(
    switch (opcode) {

    // BEGIN BYTECODES //

    op(JOIN, (word/1 --)) {
Hmmm. It feels weird to treat the next instruction as a cache entry.
Why don't all macro components generate code to reload the incremented opcodes/opargs unconditionally? That way we wouldn't need JOIN at all. Or are we worried the compiler can't optimize it out when it's unused (or that garbage values could be too confusing)? Or does it mess up something else?
Perhaps there's a clean way of expressing in the DSL whether an op corresponds to a real instruction or not? Then we can just reload the opcode/oparg if that's true, and get rid of JOIN.
Hmmm. It feels weird to treat the next instruction as a cache entry.
It's only weird if you think of the instruction stream as alternating instructions and cache entries. In the discussion about registers it was already proposed to use a "cache" entry to encode the operation (ADD, MUL, etc.) for BINARY_OP, so it's really just a stream of variable-length instructions. But yeah, this op is definitely full of internal hackery.
Why don't all macro components generate code to reload the incremented opcodes/opargs unconditionally? That way we wouldn't need JOIN at all. Or are we worried the compiler can't optimize it out when it's unused (or that garbage values could be too confusing)? Or does it mess up something else?
I worry more about where the next oparg would be loaded from. The code generator makes assumptions about where next_instr points (for macros it keeps pointing just past the original instruction until the end; for supers it gets bumped for each component).
Perhaps there's a clean way of expressing in the DSL whether an op corresponds to a real instruction or not? Then we can just reload the opcode/oparg if that's true, and get rid of JOIN.
That's not a bad idea -- because we actually have that way: inst vs op. The main downside would then be that there would no longer be a clue in the macro whether it defines a super or not.
if (tkn := self.expect(lx.IDENTIFIER)) and tkn.text == "super":
    if self.expect(lx.LPAREN):
        if tkn := self.expect(lx.IDENTIFIER):
            if self.expect(lx.RPAREN):
                if self.expect(lx.EQUALS):
                    if ops := self.ops():
Kill it with fire! ;)
I'm feeling a lot of hesitation about this PR. Maybe we should just leave things as they were and focus on other work (e.g. finish converting more instructions to the explicit stack/cache effects form, and adding arrays).
Off-line we decided not to do this. Super-instructions may eventually disappear (when we have a register VM).
Replace all super-instructions with macros, using a special JOIN op to extract the next oparg. JOIN has a cache effect of one word that is equivalent to bumping next_instr.
Rip out all code for parsing and generating super-instructions.