GH-98831: Implement basic cache effects #99313
Conversation
Had to refactor the parser a bit for this.
PS. I didn't implement a family that actually uses the cache (the 'counter' doesn't count, it's special since it is written, which our DSL doesn't support). But I figured I'd stop here -- keeping these PRs open for a long time is hard work due to merge conflicts.
Working on the refactor I now know for sure there are some bugs in that part. (EDIT: Fixed in GH-99408 but not here.)
Maybe leave checking of families to another PR, and just implement stack effects in this PR?
When you're done making the requested changes, leave the comment:
I have made the requested changes; please review again
(Well, I've answered everything and would like to merge this as-is so Brandt can continue on GH-99399.)
Thanks for making the requested changes! @markshannon: please review the changes made to this pull request. |
Looks good, thanks!
I've sprinkled a bunch of random notes and questions throughout. Feel free to fix now, later, or never. :)
    def outputs(self) -> list[StackEffect]:
        # This is always true
        return self.header.outputs
What's always true?
`isinstance(x, StackEffect)`. It's gone in the next refactor.
for ceffect in cache:
    if ceffect.name != "unused":
        bits = ceffect.size * 16
        f.write(f"{indent}    PyObject *{ceffect.name} = read{bits}(next_instr + {cache_offset});\n")
Not that it matters yet, but these are almost always fixed-width integer types, not objects (though we'll eventually want handling for objects too):
-        f.write(f"{indent}    PyObject *{ceffect.name} = read{bits}(next_instr + {cache_offset});\n")
+        f.write(f"{indent}    uint{bits}_t {ceffect.name} = read{bits}(next_instr + {cache_offset});\n")
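To make the typed-read idea concrete, here is a small self-contained sketch of such an emitter. The `CacheEffect` dataclass, the `emit_cache_reads` helper, and the `read16`/`read32` names are assumptions for illustration, not the actual generator's API:

```python
# Hypothetical sketch: emit a typed read for each cache entry and advance the
# offset, skipping entries named "unused". Sizes are in 16-bit code units.
from dataclasses import dataclass

@dataclass
class CacheEffect:
    name: str
    size: int  # number of 16-bit code units

def emit_cache_reads(cache: list[CacheEffect], indent: str = "    ") -> tuple[list[str], int]:
    lines = []
    offset = 0
    for ceffect in cache:
        if ceffect.name != "unused":
            bits = ceffect.size * 16
            lines.append(
                f"{indent}uint{bits}_t {ceffect.name} = "
                f"read{bits}(next_instr + {offset});"
            )
        offset += ceffect.size  # unused entries still consume cache space
    return lines, offset

lines, offset = emit_cache_reads([CacheEffect("counter", 1), CacheEffect("unused", 1)])
```

Note that unused entries still advance the offset, so later named entries land at the right position.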
};

inst(BINARY_OP_MULTIPLY_INT, (left, right, unused/1 -- prod)) {
Just my preference, but I sort of prefer a name like `_` to a name like `unused` for our syntax here. I guess it just feels more "special", and doesn't distract:
-    inst(BINARY_OP_MULTIPLY_INT, (left, right, unused/1 -- prod)) {
+    inst(BINARY_OP_MULTIPLY_INT, (left, right, _/1 -- prod)) {
@@ -193,7 +191,21 @@ dummy_func(
        ERROR_IF(res == NULL, error);
    }

    family(binary_op, INLINE_CACHE_ENTRIES_BINARY_OP) = {
Honestly, I don't think we really need the `INLINE_CACHE_ENTRIES_WHATEVER` stuff (or the asserts it produces) in this file anymore (they were originally added to simplify the `JUMPBY(...)` moves, but those are going to be generated now).
I feel like it sort of just complicates parsing and code generation for no real benefit... plus, we actually already have asserts to this effect in specialize.c, where we ended up re-using these constants in some places.
-    family(binary_op, INLINE_CACHE_ENTRIES_BINARY_OP) = {
+    family(binary_op) = {
 static PyObject *value, *value1, *value2, *left, *right, *res, *sum, *prod, *sub;
-static PyObject *container, *start, *stop, *v;
+static PyObject *container, *start, *stop, *v, *lhs, *rhs;
:(
            instr, predictions, indent, f,
            cache_size=find_cache_size(instr, families)
        )
        effects_table[instr.name] = len(instr.inputs), len(instr.outputs), cache_offset
Hm. This is a bit weird because it treats caches and stack items the same, which doesn't make much sense. It seems to me we'd want something that captures just stack effect and cache size:
-        effects_table[instr.name] = len(instr.inputs), len(instr.outputs), cache_offset
+        stack_pre = sum(isinstance(item, StackEffect) for item in instr.inputs)
+        stack_post = sum(isinstance(item, StackEffect) for item in instr.outputs)
+        effects_table[instr.name] = stack_post - stack_pre, cache_offset
(This will probably get more complicated as stack effects start getting more complicated...)
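As a runnable illustration of that separation (the `StackEffect`/`CacheEffect` dataclasses and the attribute names here are stand-ins, not the generator's real classes): only stack items count toward the net stack effect, while cache entries are summed separately.

```python
# Illustrative sketch: compute (net stack effect, cache size in code units)
# for an instruction's input/output effect lists.
from dataclasses import dataclass

@dataclass
class StackEffect:
    name: str

@dataclass
class CacheEffect:
    name: str
    size: int  # in 16-bit code units

def net_effect(inputs, outputs):
    stack_pre = sum(isinstance(item, StackEffect) for item in inputs)
    stack_post = sum(isinstance(item, StackEffect) for item in outputs)
    cache_size = sum(item.size for item in inputs if isinstance(item, CacheEffect))
    return stack_post - stack_pre, cache_size

# BINARY_OP_MULTIPLY_INT: (left, right, unused/1 -- prod)
effect = net_effect(
    [StackEffect("left"), StackEffect("right"), CacheEffect("unused", 1)],
    [StackEffect("prod")],
)
```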
            raise self.make_syntax_error(
                f"Input {name!r} at pos {i} repeated in output at different pos {j}")
Why is this bad? Seems useful for things like copies and swaps (unless the intention is to have the author assign the output to a new name anyways)?
Maybe it complicates refcounting somehow? Either way, might be worth a comment.
Good question. I thought this was in Mark's DSL spec but I can't find it; I probably just misread something. Intuitively, this rules out cases like
inst(FOO, (left, right -- right)) {
DECREF(left);
}
which requires shifting `right` down by one unit, and that seems a bit unexpected (could be caused by a typo?). But you're right, it doesn't cause any complications in the code generator, we'll just generate code like
{
PyObject *left = PEEK(2), *right = PEEK(1);
DECREF(left);
STACK_SHRINK(1);
POKE(1, right);
}
I'll get rid of this check in the next refactor (it's in the wrong place anyway).
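For illustration, the check being discussed can be sketched roughly like this (the plain name lists and a bare `SyntaxError` stand in for the parser's real token objects and `make_syntax_error`):

```python
# Rough sketch: reject an input name that reappears in the outputs at a
# *different* stack position, since that implies shifting the value.
def check_no_shifted_reuse(inputs: list[str], outputs: list[str]) -> None:
    for i, name in enumerate(inputs):
        for j, out in enumerate(outputs):
            if out == name and i != j:
                raise SyntaxError(
                    f"Input {name!r} at pos {i} repeated in output at different pos {j}"
                )

# (left, right -- right): 'right' moves from input pos 1 to output pos 0
try:
    check_no_shifted_reuse(["left", "right"], ["right"])
except SyntaxError as e:
    message = str(e)
```

Reusing a name at the *same* position (as in a no-op or an in-place update) passes the check.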
-    def stack_effect(self) -> tuple[list[str], list[str]]:
+    def stack_effect(self) -> tuple[list[Effect], list[Effect]]:
I didn't look too closely at anything below this line (not a parsing expert). I'm sure it works fine, though. :)
        while self.expect(lx.COMMA):
            if tkn := self.expect(lx.IDENTIFIER):
                members.append(tkn.text)
            else:
                break
I find this control flow easier to follow:
-        while self.expect(lx.COMMA):
-            if tkn := self.expect(lx.IDENTIFIER):
-                members.append(tkn.text)
-            else:
-                break
+        while self.expect(lx.COMMA) and (tkn := self.expect(lx.IDENTIFIER)):
+            members.append(tkn.text)
I'm not sure I agree. Your rewrite is more compact but makes it easy to overlook that this code accepts a trailing comma. The longer form makes you pause and notice that.
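To see the behavioral point, here is a toy model of the compact loop shape (the `Toy` token stream and `parse_members` are invented stand-ins for the real lexer and parser): it consumes a trailing comma just as silently as the long form, but nothing in its shape draws attention to that.

```python
# Toy token stream: expect() consumes and returns the next token's text if its
# kind matches, else returns None without consuming anything.
class Toy:
    def __init__(self, tokens):
        self.tokens = list(tokens)

    def expect(self, kind):
        if self.tokens and self.tokens[0][0] == kind:
            return self.tokens.pop(0)[1]
        return None

def parse_members(p):
    members = []
    if tkn := p.expect("IDENT"):
        members.append(tkn)
        # Compact form: a trailing comma makes the IDENT expect fail,
        # which quietly ends the loop -- easy to miss on a quick read.
        while p.expect("COMMA") and (tkn := p.expect("IDENT")):
            members.append(tkn)
    return members

# "a, b," -- the trailing comma is silently accepted:
result = parse_members(Toy([("IDENT", "a"), ("COMMA", ","), ("IDENT", "b"), ("COMMA", ",")]))
```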
        if (tkn := self.expect(lx.IDENTIFIER)):
            if self.expect(lx.LBRACKET):
                if arg := self.expect(lx.IDENTIFIER):
                    if self.expect(lx.RBRACKET):
                        return f"{tkn.text}[{arg.text}]"
                    if self.expect(lx.TIMES):
                        if num := self.expect(lx.NUMBER):
                            if self.expect(lx.RBRACKET):
                                return f"{tkn.text}[{arg.text}*{num.text}]"
                raise self.make_syntax_error("Expected argument in brackets", tkn)
            return tkn.text
        if self.expect(lx.CONDOP):
            while self.expect(lx.CONDOP):
                pass
            return "??"
        return None
        if self.expect(lx.DIVIDE):
            if num := self.expect(lx.NUMBER):
I noticed that this file has pretty aggressive `if` nesting. Out of curiosity, any reason why you prefer not to combine many `if`s into one test? Maybe it fits your mental model of the parser better?
The latter, mostly. It makes it easier to add an `else` clause later. Also, I like to test only one condition per line. The parser does need a bit of cleanup, but it's clean enough for now.
I apologize for the mess that generate_cases.py has become. I promise I will clean it up in the next PR.
This PR is a big step forwards though -- it supports cache effects and implements those for the BINARY_OP family (with one exception -- the "hemi-super-instruction" BINARY_OP_INPLACE_ADD_UNICODE). Check the generated code for the effects.

PS. Merge conflicts for Python/bytecodes.c are quite painful; it seems there are several cooks in this kitchen. :-)