-
-
Notifications
You must be signed in to change notification settings - Fork 1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
virtualenv generates python that pre-imports threading (16.x -> 20.x regression) #1895
Comments
Earlier patchings broke other workflows while also imposing significant maintenance overheads, so that path is a no go from me.
Wait, this is odd, how come gpython must patch at first import. IMHO this requirement doesn't make much sense, and I'd advocate for dropping this for gpython. Alternatively, we could disable distutils patching with gpython, given we can somehow identify that gpython is active 🤷 |
@gaborbernat thanks for feedback. Gevent-based applications need to patch threading, socket, ssl, etc modules in order to change standard library to use lightweight greenlets instead of OS-threads. This patching has to be performed as early as possible in the lifetime of the process in order to avoid situation when part of the program uses functions that are greenlet-aware, and part of the program uses "the same function" - e.g. for http://www.gevent.org/intro.html#monkey-patching Note: it is not only gpython who needs this. It is all gevent-based programs that use gevent via stdlib-compatible API. And there are many such programs. Gevent also provides a way to run programs simalar to gpython - via Maybe if there would be a mechanism for gpython/gevent to tell to virtualenv to run part of its code as early as possible - in particular before virtualenv imports threading - that would indeed help. Kirill |
Thinking this through I think would solve this use case and avoid the problem if we'd construct the lock at first call within find_spec, rather than upfront. At which point distutils/setuptools gets called event should already be setup (potentially either via a pth file, or site customize). |
@gaborbernat, thanks. Do you mean something like this: diff --git a/src/virtualenv/create/via_global_ref/_virtualenv.py b/src/virtualenv/create/via_global_ref/_virtualenv.py
index b399da4..9a8e7c9 100644
--- a/src/virtualenv/create/via_global_ref/_virtualenv.py
+++ b/src/virtualenv/create/via_global_ref/_virtualenv.py
@@ -39,18 +39,35 @@ def parse_config_files(self, *args, **kwargs):
# https://docs.python.org/3/library/importlib.html#setting-up-an-importer
from importlib.abc import MetaPathFinder
from importlib.util import find_spec
- from threading import Lock
from functools import partial
class _Finder(MetaPathFinder):
"""A meta path finder that allows patching the imported distutils modules"""
fullname = None
- lock = Lock()
+
+ # lock[0] is threading.Lock(), but initialized lazily to avoid importing
+ # threading very early at startup, because there are gevent-based
+ # applications that need to be first to import threading by themselves.
+ # See /~https://github.com/pypa/virtualenv/issues/1895 for details.
+ lock = []
def find_spec(self, fullname, path, target=None):
if fullname in _DISTUTILS_PATCH and self.fullname is None:
- with self.lock:
+ # initialize lock[0] lazily
+ if len(self.lock) == 0:
+ import threading
+ lock = threading.Lock()
+ # there is possibility that two threads T1 and T2 are
+ # simultaneously running into find_spec, observing .lock as
+ # empty, and further going into hereby initialization.
+ # However due to the GIL, list.append() operation is atomic
+ # and this way only one of the threads will "win" to put
+ # the lock - that every thread will use - into .lock[0].
+ # https://docs.python.org/3/faq/library.html#what-kinds-of-global-value-mutation-are-thread-safe
+ self.lock.append(lock)
+
+ with self.lock[0]:
self.fullname = fullname
try:
spec = find_spec(fullname, path) ? I don't have offhand experience with virtualenv codebase on where and how to put the test that after python startup, Kirill |
Gevent-based applications need to be able to import threading, socket, ssl, etc... modules early to be able to monkey-patch them to use lightweight greenlets instead of OS threads. This patching has to be done as early as possible in the lifetime of the process http://www.gevent.org/intro.html#monkey-patching However starting from caae48b (use import hooks to patch distutils/setuptools (pypa#1688)) threading became always preimported which, for example, rendered gpython[1] unusable: (py38.virtualenv) kirr@deco:~/tmp/trashme$ gpython Traceback (most recent call last): File "/home/kirr/tmp/trashme/py38.virtualenv/bin/gpython", line 8, in <module> sys.exit(main()) File "/home/kirr/tmp/trashme/py38.virtualenv/lib/python3.8/site-packages/gpython/__init__.py", line 151, in main raise RuntimeError('gpython: internal error: the following modules are pre-imported, but must be not:' RuntimeError: gpython: internal error: the following modules are pre-imported, but must be not: ['threading'] sys.modules: ['__future__', '__main__', '_abc', '_bootlocale', '_codecs', '_collections', '_collections_abc', '_frozen_importlib', '_frozen_importlib_external', '_functools', '_heapq', '_imp', '_io', '_locale', '_operator', '_signal', '_sitebuiltins', '_sre', '_stat', '_thread', '_virtualenv', '_warnings', '_weakref', '_weakrefset', 'abc', 'builtins', 'codecs', 'collections', 'contextlib', 'copyreg', 'encodings', 'encodings.aliases', 'encodings.latin_1', 'encodings.utf_8', 'enum', 'functools', 'genericpath', 'gpython', 'heapq', 'importlib', 'importlib._bootstrap', 'importlib._bootstrap_external', 'importlib.abc', 'importlib.machinery', 'importlib.util', 'io', 'itertools', 'keyword', 'marshal', 'operator', 'os', 'os.path', 'posix', 'posixpath', 're', 'reprlib', 'site', 'sitecustomize', 'sre_compile', 'sre_constants', 'sre_parse', 'stat', 'sys', 'threading', 'time', 'types', 'warnings', 'zipimport', 'zope'] Fix it by importing threading lazily. Fixes: pypa#1895 [1] https://pypi.org/project/pygolang/#gpython
Gevent-based applications need to be able to import threading, socket, ssl, etc... modules early to be able to monkey-patch them to use lightweight greenlets instead of OS threads. This patching has to be done as early as possible in the lifetime of the process http://www.gevent.org/intro.html#monkey-patching However starting from caae48b (use import hooks to patch distutils/setuptools (pypa#1688)) threading became always preimported which, for example, rendered gpython[1] unusable: (py38.virtualenv) kirr@deco:~/tmp/trashme$ gpython Traceback (most recent call last): File "/home/kirr/tmp/trashme/py38.virtualenv/bin/gpython", line 8, in <module> sys.exit(main()) File "/home/kirr/tmp/trashme/py38.virtualenv/lib/python3.8/site-packages/gpython/__init__.py", line 151, in main raise RuntimeError('gpython: internal error: the following modules are pre-imported, but must be not:' RuntimeError: gpython: internal error: the following modules are pre-imported, but must be not: ['threading'] sys.modules: ['__future__', '__main__', '_abc', '_bootlocale', '_codecs', '_collections', '_collections_abc', '_frozen_importlib', '_frozen_importlib_external', '_functools', '_heapq', '_imp', '_io', '_locale', '_operator', '_signal', '_sitebuiltins', '_sre', '_stat', '_thread', '_virtualenv', '_warnings', '_weakref', '_weakrefset', 'abc', 'builtins', 'codecs', 'collections', 'contextlib', 'copyreg', 'encodings', 'encodings.aliases', 'encodings.latin_1', 'encodings.utf_8', 'enum', 'functools', 'genericpath', 'gpython', 'heapq', 'importlib', 'importlib._bootstrap', 'importlib._bootstrap_external', 'importlib.abc', 'importlib.machinery', 'importlib.util', 'io', 'itertools', 'keyword', 'marshal', 'operator', 'os', 'os.path', 'posix', 'posixpath', 're', 'reprlib', 'site', 'sitecustomize', 'sre_compile', 'sre_constants', 'sre_parse', 'stat', 'sys', 'threading', 'time', 'types', 'warnings', 'zipimport', 'zope'] Fix it by importing threading lazily. Fixes: pypa#1895 [1] https://pypi.org/project/pygolang/#gpython
OK, I've filed #1897. |
Issue
Hello up there.
After vitualenv upgrade from 16.x to 20.x I've started to get failures with gpython which complains that
threading
module is already imported at python startup time. It used to work OK before virtualenv rewrite and it still works OK withpython -m venv
. Virtualenv runtime startup dependency on threading is here added by caae48b ("use import hooks to patch distutils/setuptools (#1688)").While it is possible to mechanically change
threading
to_thread.allocate_lock
to make gpython nominally happy, in practice it would be more correct to avoid importing and using boththreading
and low-level_thread
, because gevent-based applications depend on being able to patch those modules as early as possible (/cc @jamadden). Before caae48b distutils was patched without importing thread modules, so maybe it is possible to restore that property.Please find details below.
Thanks beforehand,
Kirill
(full output from virtualenv creation)
For the record here is how it works with older virtualenv and
python -m venv
:Environment
Provide at least:
Linux deco 5.7.0-1-amd64 #1 SMP Debian 5.7.6-1 (2020-06-24) x86_64 GNU/Linux
pip list
of the host python wherevirtualenv
is installed:The text was updated successfully, but these errors were encountered: