Change shutil.rmtree and os.walk to support very deep hierarchies #89727
Comments
It is possible to create deep directory hierarchies that cannot be removed via shutil.rmtree or walked via os.walk, because these functions exceed the interpreter recursion limit. This may have security implications for web services (e.g. various webdisks) that have to clean up user-created mess or walk through it.
[aep@aep-haswell ~]$ mkdir /tmp/badstuff
[aep@aep-haswell ~]$ cd /tmp/badstuff
[aep@aep-haswell badstuff]$ for x in `seq 2048` ; do mkdir $x ; cd $x ; done
[aep@aep-haswell 103]$ cd
[aep@aep-haswell ~]$ python
Python 3.9.7 (default, Oct 10 2021, 15:13:22)
[GCC 11.1.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import shutil
>>> shutil.rmtree('/tmp/badstuff')
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "/usr/lib/python3.9/shutil.py", line 726, in rmtree
_rmtree_safe_fd(fd, path, onerror)
File "/usr/lib/python3.9/shutil.py", line 663, in _rmtree_safe_fd
_rmtree_safe_fd(dirfd, fullname, onerror)
File "/usr/lib/python3.9/shutil.py", line 663, in _rmtree_safe_fd
_rmtree_safe_fd(dirfd, fullname, onerror)
File "/usr/lib/python3.9/shutil.py", line 663, in _rmtree_safe_fd
_rmtree_safe_fd(dirfd, fullname, onerror)
[Previous line repeated 992 more times]
File "/usr/lib/python3.9/shutil.py", line 642, in _rmtree_safe_fd
fullname = os.path.join(path, entry.name)
File "/usr/lib/python3.9/posixpath.py", line 77, in join
sep = _get_sep(a)
File "/usr/lib/python3.9/posixpath.py", line 42, in _get_sep
if isinstance(path, bytes):
RecursionError: maximum recursion depth exceeded while calling a Python object
>>> import os
>>> list(os.walk('/tmp/badstuff'))
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "/usr/lib/python3.9/os.py", line 418, in _walk
yield from _walk(new_path, topdown, onerror, followlinks)
File "/usr/lib/python3.9/os.py", line 418, in _walk
yield from _walk(new_path, topdown, onerror, followlinks)
File "/usr/lib/python3.9/os.py", line 418, in _walk
yield from _walk(new_path, topdown, onerror, followlinks)
[Previous line repeated 993 more times]
File "/usr/lib/python3.9/os.py", line 412, in _walk
new_path = join(top, dirname)
File "/usr/lib/python3.9/posixpath.py", line 77, in join
sep = _get_sep(a)
File "/usr/lib/python3.9/posixpath.py", line 42, in _get_sep
if isinstance(path, bytes):
RecursionError: maximum recursion depth exceeded while calling a Python object
>>>
Use a stack to implement os.walk iteratively instead of recursively to avoid hitting recursion limits on deeply nested trees.
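As a rough illustration of the approach this commit message describes, here is a minimal sketch of a top-down walk driven by an explicit list used as a stack. The name `iter_walk` is hypothetical, and this is not the actual CPython implementation: it has no `topdown`, `onerror`, or `followlinks` parameters, and it reports symlinks to directories as non-directories for simplicity.

```python
import os


def iter_walk(top):
    """Yield (dirpath, dirnames, filenames) top-down without recursion.

    Hypothetical helper for illustration; not the stdlib implementation.
    """
    stack = [top]  # directories still waiting to be visited
    while stack:
        dirpath = stack.pop()
        dirnames, filenames = [], []
        try:
            with os.scandir(dirpath) as entries:
                for entry in entries:
                    # Simplification: a symlink to a directory is listed
                    # among the filenames and is never descended into.
                    if entry.is_dir(follow_symlinks=False):
                        dirnames.append(entry.name)
                    else:
                        filenames.append(entry.name)
        except OSError:
            continue  # unreadable directory: skip it silently
        yield dirpath, dirnames, filenames
        # Push in reverse so the first-listed subdirectory is popped next.
        # Because this happens after the yield, a caller can prune
        # `dirnames` in place to skip subtrees.
        for name in reversed(dirnames):
            stack.append(os.path.join(dirpath, name))
```

Nesting depth here only grows the list, not the Python call stack, so the interpreter recursion limit no longer applies to the traversal itself.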
This also affects
…f github.com:ovsyanka83/cpython into pythongh-89727/fix-pathlib.Path.walk-recursion-depth
I am unsure about the bug vs feature classification here. On one side it seems like a clear problem, and the fixes don’t change function signatures or the way they are used; on the other side there is no guarantee that extremely deep hierarchies are supported, and I wonder if there could be negative effects from the code changes (for example, can the
(Adding PR reviewers: @carljm @brettcannon @serhiy-storchaka)
I don't have strong feelings about calling it a bug vs a feature; I would tend to call it a bug because the function just fails in a subset of cases that aren't obviously outside its domain. I agree that it's a bug that we could just document as a known limitation, though I don't see any reason to do that when the fix is not that difficult. I guess the only real impact of how it's classified is whether the fix would be backported?
Sure, but in the current code the Python stack would instead grow very large in the same scenario. And the Python stack frames (plus everything they reference) are almost certainly larger than the stack elements tracked in the iterative version, so if anything I would expect the iterative version to also save memory in the case of a very deep traversal.
Yes exactly, the impact of my question is backporting or not. I take it you feel ok about backporting the fix; let’s wait to see what a core dev thinks.
I would classify this as a feature request. There are plenty of things in CPython for which you can run out of stack space, and it isn't considered a bug in those instances (that's just part of the general limitations that CPython has).
(To be clear, I don't think there's a strong case for backporting this, so reasoning backwards from the conclusion I'm quite happy to call it a feature :P )
Are
Yes, it looks like it.
Follow-up to 3c890b5. Ensure we `os.close()` open file descriptors when the `os.fwalk()` generator is finalized.
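To make the finalization concern concrete, here is a hedged sketch (Unix only, since it relies on `dir_fd` support) of a stack-based generator that owns directory file descriptors and closes them even when the caller stops iterating early. The name `iter_fd_walk` is hypothetical and this is not the `os.fwalk()` code itself; as a simplification it opens all child directories eagerly, which a real implementation would avoid to limit the number of open descriptors.

```python
import os


def iter_fd_walk(top):
    """Yield (dirpath, dirfd) top-down; close every fd, even on early exit.

    Hypothetical helper for illustration; not the stdlib implementation.
    """
    stack = [(top, os.open(top, os.O_RDONLY))]
    try:
        while stack:
            dirpath, dirfd = stack.pop()
            try:
                # The fd is only valid until the generator is resumed.
                yield dirpath, dirfd
                with os.scandir(dirfd) as entries:
                    names = [e.name for e in entries
                             if e.is_dir(follow_symlinks=False)]
                for name in names:
                    # Open each child relative to the parent descriptor.
                    child = os.open(name, os.O_RDONLY, dir_fd=dirfd)
                    stack.append((os.path.join(dirpath, name), child))
            finally:
                os.close(dirfd)
    finally:
        # Reached when the generator is exhausted, closed explicitly,
        # or finalized by the garbage collector: release pending fds.
        for _, fd in stack:
            os.close(fd)
```

With recursion, each frame's own cleanup would run as the frames unwind; with an explicit stack, descriptors that are queued but not yet visited need this outer `finally` or they leak.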
…ythonGH-119766) Follow-up to 3c890b5. Ensure we `os.close()` open file descriptors when the `os.fwalk()` generator is finalized. (cherry picked from commit a5fef80) Co-authored-by: Barney Gale <barney.gale@gmail.com>
One more for the pile:
Implement `shutil._rmtree_safe_fd()` using a list as a stack to avoid emitting recursion errors on deeply nested trees. `shutil._rmtree_unsafe()` was fixed in a150679.
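The same list-as-stack idea applies to removal. The sketch below is a simplified, path-based stand-in, not the fd-based `shutil._rmtree_safe_fd()` from the commit: the name `iter_rmtree` is hypothetical, there is no error handler, and it offers none of the symlink-race protection that the fd-based version exists for.

```python
import os


def iter_rmtree(top):
    """Remove the directory tree rooted at `top` without recursion.

    Hypothetical helper for illustration; not the stdlib implementation.
    """
    # Each item is (action, path). A directory's "rmdir" is pushed before
    # its children are scanned, so it is popped only after they are gone.
    stack = [("scan", top)]
    while stack:
        action, path = stack.pop()
        if action == "rmdir":
            os.rmdir(path)
            continue
        stack.append(("rmdir", path))
        with os.scandir(path) as it:
            entries = list(it)  # finish reading before modifying the dir
        for entry in entries:
            if entry.is_dir(follow_symlinks=False):
                stack.append(("scan", entry.path))
            else:
                os.unlink(entry.path)  # regular files and symlinks
```

Deferring `rmdir` through the stack plays the role that returning from a recursive call played before: a directory is removed only once everything pushed after it has been processed.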
…ythonGH-119808) Implement `shutil._rmtree_safe_fd()` using a list as a stack to avoid emitting recursion errors on deeply nested trees. `shutil._rmtree_unsafe()` was fixed in a150679. (cherry picked from commit 53b1981) Co-authored-by: Barney Gale <barney.gale@gmail.com>
…ython#119808) Implement `shutil._rmtree_safe_fd()` using a list as a stack to avoid emitting recursion errors on deeply nested trees. `shutil._rmtree_unsafe()` was fixed in a150679. (cherry picked from commit 53b1981)
…trees (pythonGH-119808) Implement `shutil._rmtree_safe_fd()` using a list as a stack to avoid emitting recursion errors on deeply nested trees. `shutil._rmtree_unsafe()` was fixed in a150679.. (cherry picked from commit 53b1981) Co-authored-by: Barney Gale <barney.gale@gmail.com>
…ython#119808) Implement `shutil._rmtree_safe_fd()` using a list as a stack to avoid emitting recursion errors on deeply nested trees. `shutil._rmtree_unsafe()` was fixed in a150679.
…ep trees (python#119634) Make `shutil._rmtree_unsafe()` call `os.walk()`, which is implemented without recursion. `shutil._rmtree_safe_fd()` is not affected and can still raise a recursion error. Co-authored-by: Jelle Zijlstra <jelle.zijlstra@gmail.com>
…n#119638) Implement `os.fwalk()` using a list as a stack to avoid emitting recursion errors on deeply nested trees.
…ython#119766) Follow-up to 3c890b5. Ensure we `os.close()` open file descriptors when the `os.fwalk()` generator is finalized.
…ython#119808) Implement `shutil._rmtree_safe_fd()` using a list as a stack to avoid emitting recursion errors on deeply nested trees. `shutil._rmtree_unsafe()` was fixed in a150679.
Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.
Linked PRs
shutil.rmtree() recursion error on deep trees. #103164
pathlib.Path.fwalk() method #103566
shutil.rmtree() recursion error on deep trees #119634
os.fwalk() recursion error on deep trees #119638
shutil.rmtree() recursion error on deep trees (GH-119634) #119748
shutil.rmtree() recursion error on deep trees (GH-119634) #119749
os.fwalk() recursion error on deep trees (GH-119638) #119764
os.fwalk() recursion error on deep trees (GH-119638) #119765
os.fwalk() generator finalization. #119766
os.fwalk() generator finalization. (GH-119766) #119767
os.fwalk() generator finalization. (GH-119766) #119768
shutil.rmtree() recursion error on deep trees #119808
shutil.rmtree() recursion error on deep trees (GH-119808) #119918
shutil.rmtree() recursion error on deep trees (GH-119808) #119919