GH-98831: Implement basic cache effects #99313
Conversation
Had to refactor the parser a bit for this.
PS. I didn't implement a family that actually uses the cache (the 'counter' doesn't count, it's special since it is written, which our DSL doesn't support). But I figured I'd stop here -- keeping these PRs open for a long time is hard work due to merge conflicts.
Working on the refactor I now know for sure there are some bugs in that part. (EDIT: Fixed in GH-99408 but not here.)
Maybe leave checking of families to another PR, and just implement stack effects in this PR?
When you're done making the requested changes, leave the comment:
I have made the requested changes; please review again
(Well, I've answered everything and would like to merge this as-is so Brandt can continue on GH-99399.)
Thanks for making the requested changes! @markshannon: please review the changes made to this pull request. |
Looks good, thanks!
I've sprinkled a bunch of random notes and questions throughout. Feel free to fix now, later, or never. :)
    def outputs(self) -> list[StackEffect]:
        # This is always true
        return self.header.outputs
What's always true?
`isinstance(x, StackEffect)`. It's gone in the next refactor.
for ceffect in cache:
    if ceffect.name != "unused":
        bits = ceffect.size * 16
        f.write(f"{indent}    PyObject *{ceffect.name} = read{bits}(next_instr + {cache_offset});\n")
Not that it matters yet, but these are almost always fixed-width integer types, not objects (though we'll eventually want handling for objects too):
-        f.write(f"{indent}    PyObject *{ceffect.name} = read{bits}(next_instr + {cache_offset});\n")
+        f.write(f"{indent}    uint{bits}_t {ceffect.name} = read{bits}(next_instr + {cache_offset});\n")
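To make the typed-read idea concrete, here is a small self-contained sketch of such an emitter. The `CacheEffect` dataclass, the `emit_cache_reads` helper, and the `read16`/`read32` names are assumptions for illustration, not the actual generator's API:

```python
# Hypothetical sketch: emit a typed read for each cache entry and advance the
# offset, skipping entries named "unused". Sizes are in 16-bit code units.
from dataclasses import dataclass

@dataclass
class CacheEffect:
    name: str
    size: int  # number of 16-bit code units

def emit_cache_reads(cache: list[CacheEffect], indent: str = "    ") -> tuple[list[str], int]:
    lines = []
    offset = 0
    for ceffect in cache:
        if ceffect.name != "unused":
            bits = ceffect.size * 16
            lines.append(
                f"{indent}uint{bits}_t {ceffect.name} = "
                f"read{bits}(next_instr + {offset});"
            )
        offset += ceffect.size  # unused entries still consume cache space
    return lines, offset

lines, offset = emit_cache_reads([CacheEffect("counter", 1), CacheEffect("unused", 1)])
```

Note that unused entries still advance the offset, so later named entries land at the right position.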
};

inst(BINARY_OP_MULTIPLY_INT, (left, right, unused/1 -- prod)) {
Just my preference, but I sort of prefer a name like `_` to a name like `unused` for our syntax here. I guess it just feels more "special", and doesn't distract:
-    inst(BINARY_OP_MULTIPLY_INT, (left, right, unused/1 -- prod)) {
+    inst(BINARY_OP_MULTIPLY_INT, (left, right, _/1 -- prod)) {
@@ -193,7 +191,21 @@ dummy_func(
        ERROR_IF(res == NULL, error);
    }

    family(binary_op, INLINE_CACHE_ENTRIES_BINARY_OP) = {
Honestly, I don't think we really need the `INLINE_CACHE_ENTRIES_WHATEVER` stuff (or the asserts it produces) in this file anymore (they were originally added to simplify the `JUMPBY(...)` moves, but those are going to be generated now).
I feel like it sort of just complicates parsing and code generation for no real benefit... plus, we actually already have asserts to this effect in specialize.c, where we ended up re-using these constants in some places.
-    family(binary_op, INLINE_CACHE_ENTRIES_BINARY_OP) = {
+    family(binary_op) = {
 static PyObject *value, *value1, *value2, *left, *right, *res, *sum, *prod, *sub;
-static PyObject *container, *start, *stop, *v;
+static PyObject *container, *start, *stop, *v, *lhs, *rhs;
:(
            instr, predictions, indent, f,
            cache_size=find_cache_size(instr, families)
        )
        effects_table[instr.name] = len(instr.inputs), len(instr.outputs), cache_offset
Hm. This is a bit weird because it treats caches and stack items the same, which doesn't make much sense. It seems to me we'd want something that captures just stack effect and cache size:
-        effects_table[instr.name] = len(instr.inputs), len(instr.outputs), cache_offset
+        stack_pre = sum(isinstance(item, StackEffect) for item in instr.inputs)
+        stack_post = sum(isinstance(item, StackEffect) for item in instr.outputs)
+        effects_table[instr.name] = stack_post - stack_pre, cache_offset
(This will probably get more complicated as stack effects start getting more complicated...)
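As a runnable illustration of that separation (the `StackEffect`/`CacheEffect` dataclasses and the attribute names here are stand-ins, not the generator's real classes): only stack items count toward the net stack effect, while cache entries are summed separately.

```python
# Illustrative sketch: compute (net stack effect, cache size in code units)
# for an instruction's input/output effect lists.
from dataclasses import dataclass

@dataclass
class StackEffect:
    name: str

@dataclass
class CacheEffect:
    name: str
    size: int  # in 16-bit code units

def net_effect(inputs, outputs):
    stack_pre = sum(isinstance(item, StackEffect) for item in inputs)
    stack_post = sum(isinstance(item, StackEffect) for item in outputs)
    cache_size = sum(item.size for item in inputs if isinstance(item, CacheEffect))
    return stack_post - stack_pre, cache_size

# BINARY_OP_MULTIPLY_INT: (left, right, unused/1 -- prod)
effect = net_effect(
    [StackEffect("left"), StackEffect("right"), CacheEffect("unused", 1)],
    [StackEffect("prod")],
)
```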
            raise self.make_syntax_error(
                f"Input {name!r} at pos {i} repeated in output at different pos {j}")
Why is this bad? Seems useful for things like copies and swaps (unless the intention is to have the author assign the output to a new name anyways)?
Maybe it complicates refcounting somehow? Either way, might be worth a comment.
Good question. I thought this was in Mark's DSL spec but I can't find it; I probably just misread something. Intuitively, this rules out cases like
inst(FOO, (left, right -- right)) {
DECREF(left);
}
which requires shifting `right` down by one unit, and that seems a bit unexpected (could be caused by a typo?). But you're right, it doesn't cause any complications in the code generator, we'll just generate code like
{
PyObject *left = PEEK(2), *right = PEEK(1);
DECREF(left);
STACK_SHRINK(1);
POKE(1, right);
}
I'll get rid of this check in the next refactor (it's in the wrong place anyway).
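For illustration, the check being discussed can be sketched roughly like this (the plain name lists and a bare `SyntaxError` stand in for the parser's real token objects and `make_syntax_error`):

```python
# Rough sketch: reject an input name that reappears in the outputs at a
# *different* stack position, since that implies shifting the value.
def check_no_shifted_reuse(inputs: list[str], outputs: list[str]) -> None:
    for i, name in enumerate(inputs):
        for j, out in enumerate(outputs):
            if out == name and i != j:
                raise SyntaxError(
                    f"Input {name!r} at pos {i} repeated in output at different pos {j}"
                )

# (left, right -- right): 'right' moves from input pos 1 to output pos 0
try:
    check_no_shifted_reuse(["left", "right"], ["right"])
except SyntaxError as e:
    message = str(e)
```

Reusing a name at the *same* position (as in a no-op or an in-place update) passes the check.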
-    def stack_effect(self) -> tuple[list[str], list[str]]:
+    def stack_effect(self) -> tuple[list[Effect], list[Effect]]:
I didn't look too closely at anything below this line (not a parsing expert). I'm sure it works fine, though. :)
        while self.expect(lx.COMMA):
            if tkn := self.expect(lx.IDENTIFIER):
                members.append(tkn.text)
            else:
                break
I find this control flow easier to follow:
-        while self.expect(lx.COMMA):
-            if tkn := self.expect(lx.IDENTIFIER):
-                members.append(tkn.text)
-            else:
-                break
+        while self.expect(lx.COMMA) and (tkn := self.expect(lx.IDENTIFIER)):
+            members.append(tkn.text)
I'm not sure I agree. Your rewrite is more compact but makes it easy to overlook that this code accepts a trailing comma. The longer form makes you pause and notice that.
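To see the behavioral point, here is a toy model of the compact loop shape (the `Toy` token stream and `parse_members` are invented stand-ins for the real lexer and parser): it consumes a trailing comma just as silently as the long form, but nothing in its shape draws attention to that.

```python
# Toy token stream: expect() consumes and returns the next token's text if its
# kind matches, else returns None without consuming anything.
class Toy:
    def __init__(self, tokens):
        self.tokens = list(tokens)

    def expect(self, kind):
        if self.tokens and self.tokens[0][0] == kind:
            return self.tokens.pop(0)[1]
        return None

def parse_members(p):
    members = []
    if tkn := p.expect("IDENT"):
        members.append(tkn)
        # Compact form: a trailing comma makes the IDENT expect fail,
        # which quietly ends the loop -- easy to miss on a quick read.
        while p.expect("COMMA") and (tkn := p.expect("IDENT")):
            members.append(tkn)
    return members

# "a, b," -- the trailing comma is silently accepted:
result = parse_members(Toy([("IDENT", "a"), ("COMMA", ","), ("IDENT", "b"), ("COMMA", ",")]))
```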
        if (tkn := self.expect(lx.IDENTIFIER)):
            if self.expect(lx.LBRACKET):
                if arg := self.expect(lx.IDENTIFIER):
                    if self.expect(lx.RBRACKET):
                        return f"{tkn.text}[{arg.text}]"
                    if self.expect(lx.TIMES):
                        if num := self.expect(lx.NUMBER):
                            if self.expect(lx.RBRACKET):
                                return f"{tkn.text}[{arg.text}*{num.text}]"
                raise self.make_syntax_error("Expected argument in brackets", tkn)
            return tkn.text
        if self.expect(lx.CONDOP):
            while self.expect(lx.CONDOP):
                pass
            return "??"
        return None
        if self.expect(lx.DIVIDE):
            if num := self.expect(lx.NUMBER):
I noticed that this file has pretty aggressive `if` nesting. Out of curiosity, any reason why you prefer not to combine many `if`s into one test? Maybe it fits your mental model of the parser better?
The latter, mostly. It makes it easier to add an `else` clause later. Also, I like to test only one condition per line. The parser does need a bit of cleanup, but it's clean enough for now.
I apologize for the mess that generate_cases.py has become. I promise I will clean it up in the next PR.
This PR is a big step forwards though -- it supports cache effects and implements those for the BINARY_OP family (with one exception -- the "hemi-super-instruction" BINARY_OP_INPLACE_ADD_UNICODE). Check the generated code for the effects.

PS. Merge conflicts for Python/bytecodes.c are quite painful; it seems there are several cooks in this kitchen. :-)