Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

RecursionError during copy.deepcopy of an ast-tree with parent references #120108

Closed
15r10nk opened this issue Jun 5, 2024 · 10 comments
Closed
Labels
3.13 bugs and security fixes 3.14 new features, bugs and security fixes type-bug An unexpected behavior, bug, or error

Comments

@15r10nk
Copy link
Contributor

15r10nk commented Jun 5, 2024

Bug report

Bug description:

python 3.13 raises an RecursionError when I try to deepcopy the ast in the following example.

import ast
import copy

code="""
('',)
while i < n:
    if ch == '':
        ch = format[i]
        if ch == '':
            if freplace is None:
                '' % getattr(object)
        elif ch == '':
            if zreplace is None:
                if hasattr:
                    offset = object.utcoffset()
                    if offset is not None:
                        if offset.days < 0:
                            offset = -offset
                        h = divmod(timedelta(hours=0))
                        if u:
                            zreplace = '' % (sign,)
                        elif s:
                            zreplace = '' % (sign,)
                        else:
                            zreplace = '' % (sign,)
        elif ch == '':
            if Zreplace is None:
                Zreplace = ''
                if hasattr(object):
                    s = object.tzname()
                    if s is not None:
                        Zreplace = s.replace('')
            newformat.append(Zreplace)
        else:
            push('')
    else:
        push(ch)

"""



tree=ast.parse(code)

# add a back reference to the parent node
for node in ast.walk(tree):
    for child in ast.iter_child_nodes(node):
        child.parent=node

tree2=copy.deepcopy(tree)

output (Python 3.13.0b1+):

Traceback (most recent call last):
  File "/home/frank/projects/cpython/../cpython_bugs/deep_copy_ast_with_parents.py", line 50, in <module>
    tree2=copy.deepcopy(tree)
  File "/home/frank/projects/cpython/Lib/copy.py", line 162, in deepcopy
    y = _reconstruct(x, memo, *rv)
  File "/home/frank/projects/cpython/Lib/copy.py", line 253, in _reconstruct
    y = func(*args)
  File "/home/frank/projects/cpython/Lib/copy.py", line 252, in <genexpr>
    args = (deepcopy(arg, memo) for arg in args)
            ~~~~~~~~^^^^^^^^^^^
  File "/home/frank/projects/cpython/Lib/copy.py", line 136, in deepcopy
    y = copier(x, memo)
  File "/home/frank/projects/cpython/Lib/copy.py", line 196, in _deepcopy_list
    append(deepcopy(a, memo))
           ~~~~~~~~^^^^^^^^^
...

  File "/home/frank/projects/cpython/Lib/copy.py", line 162, in deepcopy
    y = _reconstruct(x, memo, *rv)
  File "/home/frank/projects/cpython/Lib/copy.py", line 259, in _reconstruct
    state = deepcopy(state, memo)
  File "/home/frank/projects/cpython/Lib/copy.py", line 136, in deepcopy
    y = copier(x, memo)
  File "/home/frank/projects/cpython/Lib/copy.py", line 221, in _deepcopy_dict
    y[deepcopy(key, memo)] = deepcopy(value, memo)
                             ~~~~~~~~^^^^^^^^^^^^^
  File "/home/frank/projects/cpython/Lib/copy.py", line 162, in deepcopy
    y = _reconstruct(x, memo, *rv)
  File "/home/frank/projects/cpython/Lib/copy.py", line 253, in _reconstruct
    y = func(*args)
  File "/home/frank/projects/cpython/Lib/copy.py", line 252, in <genexpr>
    args = (deepcopy(arg, memo) for arg in args)
            ~~~~~~~~^^^^^^^^^^^
  File "/home/frank/projects/cpython/Lib/copy.py", line 136, in deepcopy
    y = copier(x, memo)
RecursionError: maximum recursion depth exceeded

The problem can be reproduced on the current main (5c02ea8)

I was able to bisect the problem down to ed4dfd8 (@JelleZijlstra may be you know more here)

The code in the example looks big but is already minimized.

I hope that this example is simple enough to find and fix the bug, I can try to find a smaller one if it is needed.

CPython versions tested on:

3.13, CPython main branch

Operating systems tested on:

No response

Linked PRs

@15r10nk 15r10nk added the type-bug An unexpected behavior, bug, or error label Jun 5, 2024
@15r10nk
Copy link
Contributor Author

15r10nk commented Jun 5, 2024

and I forgot to mention that it worked on 3.12.

@picnixz
Copy link
Member

picnixz commented Jun 5, 2024

You can increase the recursion limit using sys.setrecursionlimit(1500) (but we can check if there are specific reasons why those operations triggered the recursion limit).

@AlexWaygood AlexWaygood added 3.13 bugs and security fixes 3.14 new features, bugs and security fixes labels Jun 5, 2024
@JelleZijlstra
Copy link
Member

The sample succeeds for me on main if I setrecursionlimit to at least 1006, but on 3.12 it only needs somewhere between 125 and 150 (didn't bother to finish bisecting). So at least there's no infinite recursion, but I'll look into why we need so many more stack frames now.

@15r10nk
Copy link
Contributor Author

15r10nk commented Jun 5, 2024

It looks like it depends on the with of the tree and not just the depth (because I would end up with deep, but a not so wide ast tree otherwise).
The recursion might go into the parent and maybe into the childs of the parent.

@JelleZijlstra
Copy link
Member

Yes, I think when it tries to copy each .parent node, that adds a lot more recursion.

I think this is related:

$ python3.12
Python 3.12.2 (main, May  6 2024, 20:50:52) [Clang 15.0.0 (clang-1500.3.9.4)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> import ast
>>> n=ast.Name("x")
>>> n.parent = 42
>>> n.__reduce_ex__(4)
(<class 'ast.Name'>, (), {'id': 'x', 'parent': 42})
>>> 
$ ./python.exe 
Python 3.14.0a0 (heads/main:4bba1c9e6cf, Jun  5 2024, 06:44:33) [Clang 15.0.0 (clang-1500.3.9.4)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> import ast
>>> n=ast.Name("x")
>>> n.parent = 422
>>> n.__reduce_ex__(4)
(<class 'ast.Name'>, ('x', <ast.Load object at 0x1039488d0>), {'parent': 422})
>>> 

The problem is that the child fields are now in the args rather than the state, and copy._reconstruct first copies the args, then sets the object in the memo, and only then copies the state. But while copying each of the args, we have to copy its .parent, and that has to crawl back up to the parent—which we're currently in the process of copying.

I can't think of a way to fix this in copy.py. I can think of a way to fix it on the ast side, though it's not very elegant: make __reduce_ex__ return something like (ast.Name, (None, None), {"id": "x", "ctx": ast.Load(), "parent": 422}). In other words, we initially set all the fields to None, and then update them to the right values later.

@JelleZijlstra
Copy link
Member

Here's another fun fact: the expr_context objects returned from ast.parse are shared, so if you mutate the ones that get returned from one ast.parse call, you affect the results of other calls:

>>> import ast
>>> tree1=ast.parse("x")
>>> tree1.body[0].value.ctx.whatever = "hi"
>>> tree2 = ast.parse("y")
>>> tree2.body[0].value.ctx.whatever
'hi'

This is arguably a bug but I'm not sure what to do about it, other than to stop sharing the expr_context objects.

@picnixz
Copy link
Member

picnixz commented Jun 5, 2024

I think you should definitely stop sharing the expr_context (so that if I need the ctx in the future in Sphinx, I don't have surprises!)

@15r10nk
Copy link
Contributor Author

15r10nk commented Jun 5, 2024

the expr_context objects returned from ast.parse are shared

Yes, I figured this also out some time ago, and wrote some work around for it (to make the parent connection work).
It looked for me like some kind of optimization. I would be ok if it gets changed/fixed.

@AlexWaygood
Copy link
Member

@serhiy-storchaka suggested in #118855 (comment) to make ast.Load and ast.Store (immutable?) singleton objects instead, which might solve that issue by going in the opposite direction

@JelleZijlstra
Copy link
Member

Making them immutable would break compatibility, unfortunately. In any case, that's off topic for this issue. I'm about to send a PR fixing this bug.

JelleZijlstra added a commit to JelleZijlstra/cpython that referenced this issue Jun 5, 2024
JelleZijlstra added a commit to JelleZijlstra/cpython that referenced this issue Jun 25, 2024
…ributes (pythonGH-120114)

(cherry picked from commit 42b2c9d)

Co-authored-by: Jelle Zijlstra <jelle.zijlstra@gmail.com>
JelleZijlstra added a commit to JelleZijlstra/cpython that referenced this issue Jun 25, 2024
…ributes (pythonGH-120114)

(cherry picked from commit 42b2c9d)

Co-authored-by: Jelle Zijlstra <jelle.zijlstra@gmail.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
3.13 bugs and security fixes 3.14 new features, bugs and security fixes type-bug An unexpected behavior, bug, or error
Projects
None yet
Development

No branches or pull requests

4 participants