-
-
Notifications
You must be signed in to change notification settings - Fork 30.9k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Cross compiles try to load libraries for target python when the version + SOABI combo match the host and can crash #115382
Comments
@tiran Apologies in advance for tagging you, but would like your input as the developer who introduced the script as a dependency to the default build target. |
I forgot to actually run autoreconf after patching configure.ac so the updated PYTHONPATH didn't actually stick. Builds will definitely fail as part of Traceback (most recent call last):
File "/work/.cache/build/rpi3/python/3.12.2/python-3.12.2/Tools/build/check_extension_modules.py", line 484, in <module>
main()
File "/work/.cache/build/rpi3/python/3.12.2/python-3.12.2/Tools/build/check_extension_modules.py", line 466, in main
checker = ModuleChecker(
^^^^^^^^^^^^^^
File "/work/.cache/build/rpi3/python/3.12.2/python-3.12.2/Tools/build/check_extension_modules.py", line 143, in __init__
self.ext_suffix = sysconfig.get_config_var("EXT_SUFFIX")
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/work/.cache/build/rpi3/python/3.12.2/python-3.12.2/Lib/sysconfig.py", line 740, in get_config_var
return get_config_vars().get(name)
^^^^^^^^^^^^^^^^^
File "/work/.cache/build/rpi3/python/3.12.2/python-3.12.2/Lib/sysconfig.py", line 723, in get_config_vars
_init_config_vars()
File "/work/.cache/build/rpi3/python/3.12.2/python-3.12.2/Lib/sysconfig.py", line 670, in _init_config_vars
_init_posix(_CONFIG_VARS)
File "/work/.cache/build/rpi3/python/3.12.2/python-3.12.2/Lib/sysconfig.py", line 536, in _init_posix
_temp = __import__(name, globals(), locals(), ['build_time_vars'], 0)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
ModuleNotFoundError: No module named '_sysconfigdata__linux_aarch64-linux-gnu'
make: *** [Makefile:1144: checksharedmods] Error 1 So the challenge is we need the local path to find the foreign libraries to check they exist, but specifying the local path in PYTHONPATH causes scripts used during the build process to attempt to load external modules, like math, from the target instead of the host. Maybe one option is to not specify the path in PYTHONPATH, so that the initial module imports work and use the host's libraries, but then as part of |
I am seeing something very similar when cross compiling from illumos x86-64 to illumos aarch64.
The link worked, the |
That's pretty odd. It shouldn't be doing this for cross architecture builds, though my testing was limited to Debian and Buildroot based distros. Can you look at your config.log and find the PLATFORM_TRIPLET and MULTIARCH variables? Line 6752 in ae6c01d
Also cat the pybuilddir.txt file in the build directory. My guess is that something is not being correctly identified because it's an unknown libc or somehow the architecture isn't being set correctly and the import code thinks it can load the foreign architecture libs from the current build directory. |
PLATFORM_TRIPLET is none and MULTIARCH is empty,
I'll go and look more closely at |
No apologies necessary, I just hadn't seen this variant before. It could still be related (odd that it only fails at the very end when running this script) |
With a patched configure.ac, so that the triplet is now populated, I get this instead. I will look into it properly later today, and should probably get these local patches upstreamed too.
|
Just as an FYI, platform checking was reworked recently c163d7f For us, it's a bit of a bummer that uClibc is now explicitly disabled. We may try to get that support added in at some point, |
After correcting the host triple detection, everything's working fine for my cross compilation case. Thanks for the pointers @vfazio ! |
I was thinking about this last night and I think the options are:
In support of option 2 above... Setting PYTHONPATH feels like we're abusing the import mechanism a bit "because it works" in most cases. These build time scripts are special in that they need to run on the host and may want to leverage host provided libraries but perform parsing on files generated by the target build. The mechanism used in For this script in particular, we rely on Luckily, this does not (currently) get called by any dependency within the script, so we could reliably inject the build path into diff --git a/Tools/build/check_extension_modules.py b/Tools/build/check_extension_modules.py
index 59239c62e2..fe3423e4aa 100644
--- a/Tools/build/check_extension_modules.py
+++ b/Tools/build/check_extension_modules.py
@@ -140,9 +140,17 @@ class ModuleChecker:
def __init__(self, cross_compiling: bool = False, strict: bool = False):
self.cross_compiling = cross_compiling
self.strict_extensions_build = strict
+ self.builddir = self.get_builddir()
+
+ # Add path for cross modules prior to sysconfig parsing the makefile.
+ if self.cross_compiling:
+ if sysconfig._CONFIG_VARS_INITIALIZED:
+ sysconfig._CONFIG_VARS_INITIALIZED = False
+ sysconfig._CONFIG_VARS = None
+ sys.path.insert(1, os.path.join(sysconfig._PROJECT_BASE, self.builddir))
+
self.ext_suffix = sysconfig.get_config_var("EXT_SUFFIX")
self.platform = sysconfig.get_platform()
- self.builddir = self.get_builddir()
self.modules = self.get_modules()
self.builtin_ok = []
diff --git a/configure b/configure
index e962a6aed1..467962f75b 100755
--- a/configure
+++ b/configure
@@ -3686,7 +3686,7 @@ fi
fi
ac_cv_prog_PYTHON_FOR_REGEN=$with_build_python
PYTHON_FOR_FREEZE="$with_build_python"
- PYTHON_FOR_BUILD='_PYTHON_PROJECT_BASE=$(abs_builddir) _PYTHON_HOST_PLATFORM=$(_PYTHON_HOST_PLATFORM) PYTHONPATH=$(shell test -f pybuilddir.txt && echo $(abs_builddir)/`cat pybuilddir.txt`:)$(srcdir)/Lib _PYTHON_SYSCONFIGDATA_NAME=_sysconfigdata_$(ABIFLAGS)_$(MACHDEP)_$(MULTIARCH) '$with_build_python
+ PYTHON_FOR_BUILD='_PYTHON_PROJECT_BASE=$(abs_builddir) _PYTHON_HOST_PLATFORM=$(_PYTHON_HOST_PLATFORM) PYTHONPATH=$(srcdir)/Lib _PYTHON_SYSCONFIGDATA_NAME=_sysconfigdata_$(ABIFLAGS)_$(MACHDEP)_$(MULTIARCH) '$with_build_python
{ printf "%s\n" "$as_me:${as_lineno-$LINENO}: result: $with_build_python" >&5
printf "%s\n" "$with_build_python" >&6; }
diff --git a/configure.ac b/configure.ac
index 384718db1f..6b01083336 100644
--- a/configure.ac
+++ b/configure.ac
@@ -164,7 +164,7 @@ AC_ARG_WITH([build-python],
dnl Build Python interpreter is used for regeneration and freezing.
ac_cv_prog_PYTHON_FOR_REGEN=$with_build_python
PYTHON_FOR_FREEZE="$with_build_python"
- PYTHON_FOR_BUILD='_PYTHON_PROJECT_BASE=$(abs_builddir) _PYTHON_HOST_PLATFORM=$(_PYTHON_HOST_PLATFORM) PYTHONPATH=$(shell test -f pybuilddir.txt && echo $(abs_builddir)/`cat pybuilddir.txt`:)$(srcdir)/Lib _PYTHON_SYSCONFIGDATA_NAME=_sysconfigdata_$(ABIFLAGS)_$(MACHDEP)_$(MULTIARCH) '$with_build_python
+ PYTHON_FOR_BUILD='_PYTHON_PROJECT_BASE=$(abs_builddir) _PYTHON_HOST_PLATFORM=$(_PYTHON_HOST_PLATFORM) PYTHONPATH=$(srcdir)/Lib _PYTHON_SYSCONFIGDATA_NAME=_sysconfigdata_$(ABIFLAGS)_$(MACHDEP)_$(MULTIARCH) '$with_build_python
AC_MSG_RESULT([$with_build_python])
], [
AS_VAR_IF([cross_compiling], [yes], I tested this on a cross architecture and "same architecture but foreign libc" build and both builds completed though i haven't run the test suites. Changing PYTHONPATH obviously would affect more than this script, so all other callers would need to be evaluated to ensure they're still working as intended. This change is almost a decade old so there may be some ingrained assumptions issue 15484 commit: 9731330 This doesn't mean we can't do some combination of all options. Option 1 may be the easiest bandaid and has less chance of impacting other scripts until we've had time to vet them. |
Thanks for the patch, it fixes the problem
when cross-compiling for x86-64-v2/glibc-2.38:
|
A patch to convert diff --git a/Tools/build/check_extension_modules.py b/Tools/build/check_extension_modules.py
index a9fee4981e..ab6138750c 100644
--- a/Tools/build/check_extension_modules.py
+++ b/Tools/build/check_extension_modules.py
@@ -22,7 +22,6 @@
import enum
import logging
import os
-import pathlib
import re
import sys
import sysconfig
@@ -33,7 +32,7 @@
from importlib.util import spec_from_file_location, spec_from_loader
from typing import Iterable
-SRC_DIR = pathlib.Path(__file__).parent.parent.parent
+SRC_DIR = os.path.dirname(os.path.dirname(os.path.dirname(__file__)))
# core modules, hard-coded in Modules/config.h.in
CORE_MODULES = {
@@ -133,7 +132,7 @@ class ModuleChecker:
"Modules/Setup.local",
"Modules/Setup.stdlib",
"Modules/Setup.bootstrap",
- SRC_DIR / "Modules/Setup",
+ os.path.join(SRC_DIR, "Modules/Setup"),
)
def __init__(self, cross_compiling: bool = False, strict: bool = False):
@@ -263,14 +262,13 @@ def list_module_names(self, *, all: bool = False) -> set:
names.update(WINDOWS_MODULES)
return names
- def get_builddir(self) -> pathlib.Path:
+ def get_builddir(self) -> str:
try:
with open(self.pybuilddir_txt, encoding="utf-8") as f:
builddir = f.read()
except FileNotFoundError:
logger.error("%s must be run from the top build directory", __file__)
raise
- builddir = pathlib.Path(builddir)
logger.debug("%s: %s", self.pybuilddir_txt, builddir)
return builddir
@@ -338,7 +336,7 @@ def get_sysconfig_modules(self) -> Iterable[ModuleInfo]:
logger.debug("Found %s in Makefile", modinfo)
yield modinfo
- def parse_setup_file(self, setup_file: pathlib.Path) -> Iterable[ModuleInfo]:
+ def parse_setup_file(self, setup_file: str) -> Iterable[ModuleInfo]:
"""Parse a Modules/Setup file"""
assign_var = re.compile(r"^\w+=") # EGG_SPAM=foo
# default to static module
@@ -382,10 +380,10 @@ def get_spec(self, modinfo: ModuleInfo) -> ModuleSpec:
else:
raise ValueError(modinfo)
- def get_location(self, modinfo: ModuleInfo) -> pathlib.Path:
+ def get_location(self, modinfo: ModuleInfo) -> str:
"""Get shared library location in build directory"""
if modinfo.state == ModuleState.SHARED:
- return self.builddir / f"{modinfo.name}{self.ext_suffix}"
+ return os.path.join(self.builddir, f"{modinfo.name}{self.ext_suffix}")
else:
return None
@@ -430,23 +428,33 @@ def rename_module(self, modinfo: ModuleInfo) -> None:
failed_name = f"{modinfo.name}_failed{self.ext_suffix}"
builddir_path = self.get_location(modinfo)
- if builddir_path.is_symlink():
+ if os.path.islink(builddir_path):
symlink = builddir_path
- module_path = builddir_path.resolve().relative_to(os.getcwd())
- failed_path = module_path.parent / failed_name
+ real_path = os.path.realpath(builddir_path)
+ cwd = os.getcwd()
+ if not real_path.startswith(cwd):
+ raise ValueError(f"{real_path} is not in the subpath of {cwd}")
+ module_path = real_path.partition(cwd)[2]
+ failed_path = os.path.join(os.path.dirname(module_path), failed_name)
else:
symlink = None
module_path = builddir_path
- failed_path = self.builddir / failed_name
+ failed_path = os.path.join(self.builddir, failed_name)
# remove old failed file
- failed_path.unlink(missing_ok=True)
+ try:
+ os.unlink(failed_path)
+ except FileNotFoundError:
+ pass
# remove symlink
if symlink is not None:
- symlink.unlink(missing_ok=True)
+ try:
+ os.unlink(symlink)
+ except FileNotFoundError:
+ pass
# rename shared extension file
try:
- module_path.rename(failed_path)
+ os.rename(module_path, failed_path)
except FileNotFoundError:
logger.debug("Shared extension file '%s' does not exist.", module_path)
else: |
I'm not sure 3.13 is currently affected. 15de493 changed pathlib to delay import of However, the root problem is still there. If a library is compiled as a shared module and it's a dependency of a build script, there's a risk of the foreign module being imported. Build targets that use
I don't know anything about The reason Buildroot doesn't see more failures is because we specifically configure After review, I don't feel confident in dropping the build directory from PYTHONPATH for these other build targets, especially in a patch to 3.12.x Honestly, as much of a hack as it may look like, it may be easiest to delay the import of diff --git a/Lib/pathlib.py b/Lib/pathlib.py
index bd5a096f9e..544290d718 100644
--- a/Lib/pathlib.py
+++ b/Lib/pathlib.py
@@ -17,7 +17,6 @@
from _collections_abc import Sequence
from errno import ENOENT, ENOTDIR, EBADF, ELOOP
from stat import S_ISDIR, S_ISLNK, S_ISREG, S_ISSOCK, S_ISBLK, S_ISCHR, S_ISFIFO
-from urllib.parse import quote_from_bytes as urlquote_from_bytes
__all__ = [
@@ -479,7 +478,8 @@ def as_uri(self):
# It's a posix path => 'file:///etc/hosts'
prefix = 'file://'
path = str(self)
- return prefix + urlquote_from_bytes(os.fsencode(path))
+ from urllib.parse import quote_from_bytes
+ return prefix + quote_from_bytes(os.fsencode(path))
@property
def _str_normcase(self): |
@vfazio If it helps... I have used your patch in a Buildroot build with success! The build was done from Buildroot commit 8ab4a0a348 which is of course a commit above 36e635d2d5c0166476858aa239ccbe78e8f2af14 (package/python3: bump version to 3.12.1). Without the patch, I get the exact error than the one described here. With the patch, no error! |
@bendebled which patch? i've unfortunately proposed 3 of them and even I'm starting to lose track. |
Your latest one: diff --git a/Lib/pathlib.py b/Lib/pathlib.py
index bd5a096f9e..544290d718 100644
--- a/Lib/pathlib.py
+++ b/Lib/pathlib.py
@@ -17,7 +17,6 @@
from _collections_abc import Sequence
from errno import ENOENT, ENOTDIR, EBADF, ELOOP
from stat import S_ISDIR, S_ISLNK, S_ISREG, S_ISSOCK, S_ISBLK, S_ISCHR, S_ISFIFO
-from urllib.parse import quote_from_bytes as urlquote_from_bytes
__all__ = [
@@ -479,7 +478,8 @@ def as_uri(self):
# It's a posix path => 'file:///etc/hosts'
prefix = 'file://'
path = str(self)
- return prefix + urlquote_from_bytes(os.fsencode(path))
+ from urllib.parse import quote_from_bytes
+ return prefix + quote_from_bytes(os.fsencode(path))
@property
def _str_normcase(self): |
Looks like OE has a patch that reinjects the host libraries into sys.path: https://git.openembedded.org/openembedded-core/tree/meta/recipes-devtools/python/python3/crosspythonpath.patch They rely on a special CROSSPYTHONPATH variable exported in their recipe, but that could probably be emulated with something like (untested): diff --git a/Makefile.pre.in b/Makefile.pre.in
index dd5e69f7ab..c2b94c2cbf 100644
--- a/Makefile.pre.in
+++ b/Makefile.pre.in
@@ -786,6 +786,10 @@ $(BUILDPYTHON): Programs/python.o $(LINK_PYTHON_DEPS)
platform: $(PYTHON_FOR_BUILD_DEPS) pybuilddir.txt
$(RUNSHARED) $(PYTHON_FOR_BUILD) -c 'import sys ; from sysconfig import get_platform ; print("%s-%d.%d" % (get_platform(), *sys.version_info[:2]))' >platform
+# Must be generated before pybuilddir.txt otherwise sys.path includes foreign libraries
+crosspath.txt:
+ $(RUNSHARED) $(PYTHON_FOR_BUILD) -c 'import sys ; print(":".join([p for p in sys.path if p]))' >crosspath.txt
+
# Create build directory and generate the sysconfig build-time data there.
# pybuilddir.txt contains the name of the build dir and is used for
# sys.path fixup -- see Modules/getpath.c.
@@ -793,7 +797,7 @@ platform: $(PYTHON_FOR_BUILD_DEPS) pybuilddir.txt
# problems by creating a dummy pybuilddir.txt just to allow interpreter
# initialization to succeed. It will be overwritten by generate-posix-vars
# or removed in case of failure.
-pybuilddir.txt: $(PYTHON_FOR_BUILD_DEPS)
+pybuilddir.txt: $(PYTHON_FOR_BUILD_DEPS) crosspath.txt
@echo "none" > ./pybuilddir.txt
$(RUNSHARED) $(PYTHON_FOR_BUILD) -S -m sysconfig --generate-posix-vars ;\
if test $$? -ne 0 ; then \
diff --git a/configure b/configure
index e962a6aed1..dbce87ac68 100755
--- a/configure
+++ b/configure
@@ -3686,7 +3686,7 @@ fi
fi
ac_cv_prog_PYTHON_FOR_REGEN=$with_build_python
PYTHON_FOR_FREEZE="$with_build_python"
- PYTHON_FOR_BUILD='_PYTHON_PROJECT_BASE=$(abs_builddir) _PYTHON_HOST_PLATFORM=$(_PYTHON_HOST_PLATFORM) PYTHONPATH=$(shell test -f pybuilddir.txt && echo $(abs_builddir)/`cat pybuilddir.txt`:)$(srcdir)/Lib _PYTHON_SYSCONFIGDATA_NAME=_sysconfigdata_$(ABIFLAGS)_$(MACHDEP)_$(MULTIARCH) '$with_build_python
+ PYTHON_FOR_BUILD='_PYTHON_PROJECT_BASE=$(abs_builddir) _PYTHON_HOST_PLATFORM=$(_PYTHON_HOST_PLATFORM) PYTHONPATH=$(shell test -f crosspath.txt && echo $(abs_builddir)/`cat crosspath.txt`:)$(shell test -f pybuilddir.txt && echo $(abs_builddir)/`cat pybuilddir.txt`:)$(srcdir)/Lib _PYTHON_SYSCONFIGDATA_NAME=_sysconfigdata_$(ABIFLAGS)_$(MACHDEP)_$(MULTIARCH) '$with_build_python
{ printf "%s\n" "$as_me:${as_lineno-$LINENO}: result: $with_build_python" >&5
printf "%s\n" "$with_build_python" >&6; }
diff --git a/configure.ac b/configure.ac
index 384718db1f..69db53b1bc 100644
--- a/configure.ac
+++ b/configure.ac
@@ -164,7 +164,7 @@ AC_ARG_WITH([build-python],
dnl Build Python interpreter is used for regeneration and freezing.
ac_cv_prog_PYTHON_FOR_REGEN=$with_build_python
PYTHON_FOR_FREEZE="$with_build_python"
- PYTHON_FOR_BUILD='_PYTHON_PROJECT_BASE=$(abs_builddir) _PYTHON_HOST_PLATFORM=$(_PYTHON_HOST_PLATFORM) PYTHONPATH=$(shell test -f pybuilddir.txt && echo $(abs_builddir)/`cat pybuilddir.txt`:)$(srcdir)/Lib _PYTHON_SYSCONFIGDATA_NAME=_sysconfigdata_$(ABIFLAGS)_$(MACHDEP)_$(MULTIARCH) '$with_build_python
+ PYTHON_FOR_BUILD='_PYTHON_PROJECT_BASE=$(abs_builddir) _PYTHON_HOST_PLATFORM=$(_PYTHON_HOST_PLATFORM) PYTHONPATH=$(shell test -f crosspath.txt && echo $(abs_builddir)/`cat crosspath.txt`:)$(shell test -f pybuilddir.txt && echo $(abs_builddir)/`cat pybuilddir.txt`:)$(srcdir)/Lib _PYTHON_SYSCONFIGDATA_NAME=_sysconfigdata_$(ABIFLAGS)_$(MACHDEP)_$(MULTIARCH) '$with_build_python
AC_MSG_RESULT([$with_build_python])
], [
AS_VAR_IF([cross_compiling], [yes], However, I'm not a big fan of trying to reinsert the host libraries to the top, it just proves that PYTHONPATH should not have the foreign libraries in the path and that build scripts need to accept some flag that a cross compile is happening and can accept a path for the local build directory if they need to amend sys.path or perform some operation relative to the build directory |
As we discussed on IRC, this won't fix all targets
|
Tested using buildroot and it also fixes the "libm.so.6: version `GLIBC_2.38' not found" cross-compile error, thanks! |
I expect Buildroot will need to carry a unique patch to address the build issue there. As mentioned in #115382 (comment), BR disables ensurepip and does not compile target pyc files via the Makefile so the other problematic paths are obviated. I think we can either go the The general issue still stands, however, and I haven't had time to think about the best way to approach resolving it. The whole cross compile and build script situation is predicated on a series of environment variables and assumptions. Things that need to be taken into consideration:
Now, with all of that said, I think we could do something akin to: diff --git a/Tools/build/check_extension_modules.py b/Tools/build/check_extension_modules.py
index 59239c62e2..45fdd7fe09 100644
--- a/Tools/build/check_extension_modules.py
+++ b/Tools/build/check_extension_modules.py
@@ -125,6 +125,23 @@ def __bool__(self):
ModuleInfo = collections.namedtuple("ModuleInfo", "name state")
+class SysConfigShim:
+
+ def __init__(self, path: pathlib.Path):
+ data_file = path / (os.environ.get("_PYTHON_SYSCONFIGDATA_NAME") + ".py")
+ self.data: dict[str, str] = {}
+ exec(data_file.read_text(), globals(), self.data)
+
+ def get_config_var(self, name: str):
+ return self.get_config_vars().get(name)
+
+ def get_config_vars(self, *args):
+ if args:
+ vals = []
+ for name in args:
+ vals.append(self.data['build_time_vars'].get(name))
+ return vals
+ return self.data['build_time_vars']
class ModuleChecker:
pybuilddir_txt = "pybuilddir.txt"
@@ -139,10 +159,15 @@ class ModuleChecker:
def __init__(self, cross_compiling: bool = False, strict: bool = False):
self.cross_compiling = cross_compiling
+ self.builddir = self.get_builddir()
+ if self.cross_compiling:
+ shim = SysConfigShim(self.builddir)
+ sysconfig.get_config_var = shim.get_config_var
+ sysconfig.get_config_vars = shim.get_config_vars
+
self.strict_extensions_build = strict
self.ext_suffix = sysconfig.get_config_var("EXT_SUFFIX")
self.platform = sysconfig.get_platform()
- self.builddir = self.get_builddir()
self.modules = self.get_modules()
self.builtin_ok = [] The goal here is to replace the problematic
if '_PYTHON_HOST_PLATFORM' in os.environ:
build_dir_base = os.environ.get("_PYTHON_PROJECT_BASE")
pybuilddir = os.path.join(build_dir_base, "pybuilddir.txt")
with open(pybuilddir, encoding="utf-8") as f:
builddir = f.read()
sysconfig_path = os.path.join(build_dir_base, builddir, os.environ.get("_PYTHON_SYSCONFIGDATA_NAME") + ".py")
with open(sysconfig_path) as f:
data = f.read()
loc = {}
exec(data, globals(), loc)
_WHEEL_PKG_DIR = loc['build_time_vars'].get('WHEEL_PKG_DIR')
else:
_WHEEL_PKG_DIR = sysconfig.get_config_var('WHEEL_PKG_DIR') This may allow us to drop the target's path from PYTHONPATH in configure. From what i can tell, compileall and the wasm_assets likely won't be impacted by any changes to PYTHONPATH |
Previously, when a build was configured to use a host interpreter via --with-build-python, the PYTHON_FOR_BUILD config value included a path in PYTHONPATH that pointed to the target's built external modules. For "normal" foreign architecture cross compiles, when loading compiled external libraries, the target libraries were processed first due to their precedence in sys.path. These libraries are then ruled out because of a mismatch in the SOABI so the import mechanism continues searching in sys.path for modules until it finds the host's native modules. However, if the host interpreter and the target python are on the same version + SOABI combination, the host interpreter would attempt to load the target's external modules due precedence in sys.path. Despite the "match", the target build may be linked against a different libc or may include instructions that are not supported on the host, so loading/executing the target's external modules can lead to crashes. Now, the path to the target's external modules is no longer in PYTHONPATH to prevent accidentally loading these foreign modules. Some build scripts need to interrogate sysconfig via `get_config_var{s}` to determine what target modules were built as well as other target specific config values. This was previously done by specifying _PYTHON_SYSCONFIGDATA_NAME in the environment and leveraging the target's module path in PYTHONPATH so it could be imported in sysconfig._init_posix. These build scripts now check if the environment is configured to use a host interpreter and will now load the target's sysconfigdata module based on the information in the environment and query it as necessary. Signed-off-by: Vincent Fazio <vfazio@gmail.com>
Previously, when a build was configured to use a host interpreter via --with-build-python, the PYTHON_FOR_BUILD config value included a path in PYTHONPATH that pointed to the target's built external modules. For "normal" foreign architecture cross compiles, when loading compiled external libraries, the target libraries were processed first due to their precedence in sys.path. These libraries are then ruled out because of a mismatch in the SOABI so the import mechanism continues searching in sys.path for modules until it finds the host's native modules. However, if the host interpreter and the target python are on the same version + SOABI combination, the host interpreter would attempt to load the target's external modules due precedence in sys.path. Despite the "match", the target build may be linked against a different libc or may include instructions that are not supported on the host, so loading/executing the target's external modules can lead to crashes. Now, the path to the target's external modules is no longer in PYTHONPATH to prevent accidentally loading these foreign modules. Some build scripts need to interrogate sysconfig via `get_config_var{s}` to determine what target modules were built as well as other target specific config values. This was previously done by specifying _PYTHON_SYSCONFIGDATA_NAME in the environment and leveraging the target's module path in PYTHONPATH so it could be imported in sysconfig._init_posix. These build scripts now check if the environment is configured to use a host interpreter and will now load the target's sysconfigdata module based on the information in the environment and query it as necessary. Signed-off-by: Vincent Fazio <vfazio@gmail.com>
Previously, when a build was configured to use a host interpreter via --with-build-python, the PYTHON_FOR_BUILD config value included a path in PYTHONPATH that pointed to the target's built external modules. For "normal" foreign architecture cross compiles, when loading compiled external libraries, the target libraries were processed first due to their precedence in sys.path. These libraries are then ruled out because of a mismatch in the SOABI so the import mechanism continues searching in sys.path for modules until it finds the host's native modules. However, if the host interpreter and the target python are on the same version + SOABI combination, the host interpreter would attempt to load the target's external modules due precedence in sys.path. Despite the "match", the target build may be linked against a different libc or may include instructions that are not supported on the host, so loading/executing the target's external modules can lead to crashes. Now, the path to the target's external modules is no longer in PYTHONPATH to prevent accidentally loading these foreign modules. Some build scripts need to interrogate sysconfig via `get_config_var{s}` to determine what target modules were built as well as other target specific config values. This was previously done by specifying _PYTHON_SYSCONFIGDATA_NAME in the environment and leveraging the target's module path in PYTHONPATH so it could be imported in sysconfig._init_posix. These build scripts now check if the environment is configured to use a host interpreter and will now load the target's sysconfigdata module based on the information in the environment and query it as necessary. Signed-off-by: Vincent Fazio <vfazio@gmail.com>
I have two branches that I'm going to try to test out. Anyone else is welcome to try the patch and provide feedback. /~https://github.com/vfazio/cpython/tree/vfazio-fix-cross-compile-main /~https://github.com/vfazio/cpython/tree/vfazio-fix-cross-compile-3.12 |
A buildroot build is fixed by the patch vfazio@1ee8231 from this branch, thanks! |
@bkuhls thanks for confirming. I just confirmed it doesn't break anything for an actual foreign (target arch != host arch) cross compile. My cross compile is without any other patches, so the pyc compile stage does run and it did so successfully. I haven't run the actual test suite on the generated installation yet, so will do that soon. I do need to test the ensurepip stuff is working still |
This wont work either... ensurepip will fail when it tries to bootstrap pip into DESTDIR. I could hack it and append the target's sysconfigdata path to the end of
However, what this means is:
Putting the path at the beginning for host == target arch replicates the issue we're trying to fix. For this to work, the checks for a cross build environment would have to extend into pip which sounds absolutely arduous. I'm starting to run dry of ideas on how to cleanly handle this case... My last bad idea is to shift the path to pybuilddir.txt from PYTHONPATH to In testing, this works for a foreign cross build with ensurepip, so maybe it solves all of our problems. |
Previously, when a build was configured to use a host interpreter via --with-build-python, the PYTHON_FOR_BUILD config value included a path in PYTHONPATH that pointed to the target's built external modules. For "normal" foreign architecture cross compiles, when loading compiled external libraries, the target libraries were processed first due to their precedence in sys.path. These libraries were then ruled out due to a mismatch in the SOABI so the import mechanism continues searching for modules until it found the host's native modules. However, if the host interpreter and the target python are on the same version + SOABI combination, the host interpreter would attempt to load the target's external modules due to their precedence in sys.path. Despite the "match", the target build may be linked against a different libc or may include instructions that are not supported on the host, so loading/executing the target's external modules can lead to crashes. Now, the path to the target's external modules is no longer defined in PYTHONPATH to prevent accidentally loading these foreign modules. One caveat is that during certain build stages, the target's sysconfig module requires higher precedence than the host's version in order to accurately query the target build's configuration. This worked previously due to the target's sysconfig data module having precedence over the host's (see above). In order to keep this desired behavior, a new environment variable, _PYTHON_SYSCONFIGDATA_PATH, has been defined so sysconfig can search this directory for the target's sysconfig data. Signed-off-by: Vincent Fazio <vfazio@gmail.com>
Previously, when a build was configured to use a host interpreter via --with-build-python, the PYTHON_FOR_BUILD config value included a path in PYTHONPATH that pointed to the target's built external modules. For "normal" foreign architecture cross compiles, when loading compiled external libraries, the target libraries were processed first due to their precedence in sys.path. These libraries were then ruled out due to a mismatch in the SOABI so the import mechanism continues searching for modules until it found the host's native modules. However, if the host interpreter and the target python are on the same version + SOABI combination, the host interpreter would attempt to load the target's external modules due to their precedence in sys.path. Despite the "match", the target build may be linked against a different libc or may include instructions that are not supported on the host, so loading/executing the target's external modules can lead to crashes. Now, the path to the target's external modules is no longer defined in PYTHONPATH to prevent accidentally loading these foreign modules. One caveat is that during certain build stages, the target's sysconfig module requires higher precedence than the host's version in order to accurately query the target build's configuration. This worked previously due to the target's sysconfig data module having precedence over the host's (see above). In order to keep this desired behavior, a new environment variable, _PYTHON_SYSCONFIGDATA_PATH, has been defined so sysconfig can search this directory for the target's sysconfig data. Signed-off-by: Vincent Fazio <vfazio@gmail.com>
Previously, when a build was configured to use a host interpreter via --with-build-python, the PYTHON_FOR_BUILD config value included a path in PYTHONPATH that pointed to the target's built external modules. For "normal" foreign architecture cross compiles, when loading compiled external libraries, the target libraries were processed first due to their precedence in sys.path. These libraries were then ruled out due to a mismatch in the SOABI so the import mechanism continues searching for modules until it found the host's native modules. However, if the host interpreter and the target python are on the same version + SOABI combination, the host interpreter would attempt to load the target's external modules due to their precedence in sys.path. Despite the "match", the target build may be linked against a different libc or may include instructions that are not supported on the host, so loading/executing the target's external modules can lead to crashes. Now, the path to the target's external modules is no longer defined in PYTHONPATH to prevent accidentally loading these foreign modules. One caveat is that during certain build stages, the target's sysconfig module requires higher precedence than the host's version in order to accurately query the target build's configuration. This worked previously due to the target's sysconfig data module having precedence over the host's (see above). In order to keep this desired behavior, a new environment variable, _PYTHON_SYSCONFIGDATA_PATH, has been defined so sysconfig can search this directory for the target's sysconfig data. Signed-off-by: Vincent Fazio <vfazio@gmail.com>
Previously, when a build was configured to use a host interpreter via --with-build-python, the PYTHON_FOR_BUILD config value included a path in PYTHONPATH that pointed to the target's built external modules. For "normal" foreign architecture cross compiles, when loading compiled external libraries, the target libraries were processed first due to their precedence in sys.path. These libraries were then ruled out due to a mismatch in the SOABI so the import mechanism continues searching for modules until it found the host's native modules. However, if the host interpreter and the target python are on the same version + SOABI combination, the host interpreter would attempt to load the target's external modules due to their precedence in sys.path. Despite the "match", the target build may be linked against a different libc or may include instructions that are not supported on the host, so loading/executing the target's external modules can lead to crashes. Now, the path to the target's external modules is no longer defined in PYTHONPATH to prevent accidentally loading these foreign modules. One caveat is that during certain build stages, the target's sysconfig module requires higher precedence than the host's version in order to accurately query the target build's configuration. This worked previously due to the target's sysconfig data module having precedence over the host's (see above). In order to keep this desired behavior, a new environment variable, _PYTHON_SYSCONFIGDATA_PATH, has been defined so sysconfig can search this directory for the target's sysconfig data. Signed-off-by: Vincent Fazio <vfazio@gmail.com>
Previously, when a build was configured to use a host interpreter via --with-build-python, the PYTHON_FOR_BUILD config value included a path in PYTHONPATH that pointed to the target's built external modules. For "normal" foreign architecture cross compiles, when loading compiled external libraries, the target libraries were processed first due to their precedence in sys.path. These libraries were then ruled out due to a mismatch in the SOABI so the import mechanism continues searching for modules until it found the host's native modules. However, if the host interpreter and the target python are on the same version + SOABI combination, the host interpreter would attempt to load the target's external modules due to their precedence in sys.path. Despite the "match", the target build may be linked against a different libc or may include instructions that are not supported on the host, so loading/executing the target's external modules can lead to crashes. Now, the path to the target's external modules is no longer defined in PYTHONPATH to prevent accidentally loading these foreign modules. One caveat is that during certain build stages, the target's sysconfig module requires higher precedence than the host's version in order to accurately query the target build's configuration. This worked previously due to the target's sysconfig data module having precedence over the host's (see above). In order to keep this desired behavior, a new environment variable, _PYTHON_SYSCONFIGDATA_PATH, has been defined so sysconfig can search this directory for the target's sysconfig data. Signed-off-by: Vincent Fazio <vfazio@gmail.com>
Previously, when a build was configured to use a host interpreter via --with-build-python, the PYTHON_FOR_BUILD config value included a path in PYTHONPATH that pointed to the target's built external modules. For "normal" foreign architecture cross compiles, when loading compiled external libraries, the target libraries were processed first due to their precedence in sys.path. These libraries were then ruled out due to a mismatch in the SOABI so the import mechanism continues searching for modules until it found the host's native modules. However, if the host interpreter and the target python are on the same version + SOABI combination, the host interpreter would attempt to load the target's external modules due to their precedence in sys.path. Despite the "match", the target build may be linked against a different libc or may include instructions that are not supported on the host, so loading/executing the target's external modules can lead to crashes. Now, the path to the target's external modules is no longer defined in PYTHONPATH to prevent accidentally loading these foreign modules. One caveat is that during certain build stages, the target's sysconfig module requires higher precedence than the host's version in order to accurately query the target build's configuration. This worked previously due to the target's sysconfig data module having precedence over the host's (see above). In order to keep this desired behavior, a new environment variable, _PYTHON_SYSCONFIGDATA_PATH, has been defined so sysconfig can search this directory for the target's sysconfig data. Signed-off-by: Vincent Fazio <vfazio@gmail.com>
Previously, when a build was configured to use a host interpreter via --with-build-python, the PYTHON_FOR_BUILD config value included a path in PYTHONPATH that pointed to the target's built external modules. For "normal" foreign architecture cross compiles, when loading compiled external libraries, the target libraries were processed first due to their precedence in sys.path. These libraries were then ruled out due to a mismatch in the SOABI so the import mechanism continued searching until it found the host's native modules. However, if the host interpreter and the target python were on the same version + SOABI combination, the host interpreter would attempt to load the target's external modules due to their precedence in sys.path. Despite the "match", the target build may have been linked against a different libc or may include unsupported instructions so loading or executing the target's external modules can lead to crashes. Now, the path to the target's external modules is no longer defined in PYTHONPATH to prevent accidentally loading these foreign modules. One caveat is that during certain build stages, the target's sysconfig module requires higher precedence than the host's version in order to accurately query the target build's configuration. This worked previously due to the target's sysconfig data module having precedence over the host's (see above). In order to keep this desired behavior, a new environment variable, _PYTHON_SYSCONFIGDATA_PATH, has been defined so sysconfig can search this directory for the target's sysconfig data. Signed-off-by: Vincent Fazio <vfazio@gmail.com>
Previously, when a build was configured to use a host interpreter via --with-build-python, the PYTHON_FOR_BUILD config value included a path in PYTHONPATH that pointed to the target's built external modules. For "normal" foreign architecture cross compiles, when loading compiled external libraries, the target libraries were processed first due to their precedence in sys.path. These libraries were then ruled out due to a mismatch in the SOABI so the import mechanism continued searching until it found the host's native modules. However, if the host interpreter and the target python were on the same version + SOABI combination, the host interpreter would attempt to load the target's external modules due to their precedence in sys.path. Despite the "match", the target build may have been linked against a different libc or may include unsupported instructions so loading or executing the target's external modules can lead to crashes. Now, the path to the target's external modules is no longer defined in PYTHONPATH to prevent accidentally loading these foreign modules. One caveat is that during certain build stages, the target's sysconfig module requires higher precedence than the host's version in order to accurately query the target build's configuration. This worked previously due to the target's sysconfig data module having precedence over the host's (see above). In order to keep this desired behavior, a new environment variable, _PYTHON_SYSCONFIGDATA_PATH, has been defined so sysconfig can search this directory for the target's sysconfig data. Signed-off-by: Vincent Fazio <vfazio@gmail.com>
Previously, when a build was configured to use a host interpreter via --with-build-python, the PYTHON_FOR_BUILD config value included a path in PYTHONPATH that pointed to the target's built external modules. For "normal" foreign architecture cross compiles, when loading compiled external libraries, the target libraries were processed first due to their precedence in sys.path. These libraries were then ruled out due to a mismatch in the SOABI so the import mechanism continued searching until it found the host's native modules. However, if the host interpreter and the target python were on the same version + SOABI combination, the host interpreter would attempt to load the target's external modules due to their precedence in sys.path. Despite the "match", the target build may have been linked against a different libc or may include unsupported instructions so loading or executing the target's external modules can lead to crashes. Now, the path to the target's external modules is no longer defined in PYTHONPATH to prevent accidentally loading these foreign modules. One caveat is that during certain build stages, the target's sysconfig module requires higher precedence than the host's version in order to accurately query the target build's configuration. This worked previously due to the target's sysconfig data module having precedence over the host's (see above). In order to keep this desired behavior, a new environment variable, _PYTHON_SYSCONFIGDATA_PATH, has been defined so sysconfig can search this directory for the target's sysconfig data. Signed-off-by: Vincent Fazio <vfazio@gmail.com>
So, I've pushed an MR that I believe should address this issue. The commit message sort of speaks for itself, but I'll detail what's going on just as a summary post. When When a host interpreter is used, the command line to invoke that interpreter includes PYTHONPATH which points to the target build. This puts the target's paths higher in the sys.path import search order so the target's libraries are prioritized over the host's versions. For foreign architecture builds (think build=x86_64, host=arm64), when imports are searched and the module is a compiled module, the const char *_PyImport_DynLoadFiletab[] = {
#ifdef __CYGWIN__
".dll",
#else /* !__CYGWIN__ */
"." SOABI ".so",
#ifdef ALT_SOABI
"." ALT_SOABI ".so",
#endif
".abi" PYTHON_ABI_STRING ".so",
".so",
#endif /* __CYGWIN__ */
NULL,
}; This means that the host interpreter will never load the target compiled libraries so long as they disagree on SOABI. However, when the host interpreter and the target have the same SOABI (so host=x86_64, build=x86_64), there's a risk that due to their precedence in sys.path that the target's libraries will be loaded and/or executed. As seen in this issue, this is problematic because the target may have been built and linked against a different libc implementation which the host cannot load. Even if the libc matched, there's a chance that the instruction set for the generated binaries may not be executable on the host (x86 psABI differences). The only time it's "OK" to run the target libraries is if the imports and their dependencies are pure Python. From what I can naively tell, the only real reason it's necessary to include the target's path in sys.path via PYTHONPATH is so the target's generated sysconfigdata module can be interrogated to check for built modules, compile options, etc. If this is indeed the case, it should be sufficient to drop the path from PYTHONPATH and build an override mechanism for its path similar to how the name is already handled via In build testing, this works fine. The Python CI pipelines also seem to be ok with this. |
Previously, when a build was configured to use a host interpreter via --with-build-python, the PYTHON_FOR_BUILD config value included a path in PYTHONPATH that pointed to the target's built external modules. For "normal" foreign architecture cross compiles, when loading compiled external libraries, the target libraries were processed first due to their precedence in sys.path. These libraries were then ruled out due to a mismatch in the SOABI so the import mechanism continued searching until it found the host's native modules. However, if the host interpreter and the target python were on the same version + SOABI combination, the host interpreter would attempt to load the target's external modules due to their precedence in sys.path. Despite the "match", the target build may have been linked against a different libc or may include unsupported instructions so loading or executing the target's external modules can lead to crashes. Now, the path to the target's external modules is no longer defined in PYTHONPATH to prevent accidentally loading these foreign modules. One caveat is that during certain build stages, the target's sysconfig module requires higher precedence than the host's version in order to accurately query the target build's configuration. This worked previously due to the target's sysconfig data module having precedence over the host's (see above). In order to keep this desired behavior, a new environment variable, _PYTHON_SYSCONFIGDATA_PATH, has been defined so sysconfig can search this directory for the target's sysconfig data. Signed-off-by: Vincent Fazio <vfazio@gmail.com>
Note that CPython 3.11 also fails when running An example build:
I tested the patch in #116294 and adapted it for 3.11 and it seemed to resolve the issue. |
Co-authored-by: Erlend E. Aasland <erlend@python.org>
(cherry picked from commit aecbc2e) Co-authored-by: Vincent Fazio <vfazio@gmail.com> Co-authored-by: Erlend E. Aasland <erlend@python.org>
Co-authored-by: Vincent Fazio <vfazio@gmail.com> Co-authored-by: Erlend E. Aasland <erlend@python.org>
Closing this issue as my PR has landed in main and been ported to 3.13. |
Co-authored-by: Erlend E. Aasland <erlend@python.org>
Bug report
Bug description:
Note: I'm using 3.11.6 to showcase the behavior because that's easiest, but the build problems exist in 3.12+
When cross compiling Python, typically the foreign build is targeting a different architecture, but this is not always the case.
It's possible that an x86-64 host may be building a Python for a "foreign" x86-64 machine. This typically means that there may be some difference in libc version or CPU instruction support.
When performing a cross compile for the same architecture (by this I mean the combination of Python version + SOABI), Python 3.12+ will attempt to load foreign libraries as part of some of the target dependencies for the
build_all
make target and will potentially fail.When cross compiling, builds specify
--with-build-python
during configure which specifies a host-safe version of python to use to perform python based tasks on behalf of the foreign python build. When configured,PYTHON_FOR_BUILD
will be set to a rather complex command that generally evaluates to something like:This is currently a problem for
checksharedmods
:When run, it tries to run
check_extension_modules
viaPYTHON_FOR_BUILD
, however this script has nested in its import dependencies a dependency on externally built modules which poses a problem.Note, this was introduced as part of 3.12:
7bd67d1
81dca70d704
This also can show up with glibc like so
When the
PYTHON_FOR_BUILD
command is updated to not include the local build directory in PYTHONPATH, everything works fine.This makes sense because it will use the host's libraries instead of the target's libraries... Inserting that path into PYTHONPATH alters the search order for extensions, and because the python versions match (this is a requirement for cross builds) and SOABI matches (which is due to the triplets being the same), it will try to use the libraries from the target path:
So, the "easy solution" is to remove:
However, I assume that was added for a reason and is maybe necessary for other steps (though in testing, it didn't appear to be necessary for the compile to complete).
I'm not sure it ever makes sense to load external modules from the foreign target python when cross compiling, This only works currently because for foreign architecture builds, the triplet will mismatch and cause it to fall back to the host's versions of the libraries. It'd only be safe to do this if the modules were completely source based.
If
PYTHONPATH
has to be set, then there's maybe an argument that python version and SOABI do not provide enough differentiation and I'm not sure of what the best solution is there.Alternatively, the
check_extension_modules
script could be rewritten to reduce the use of external modules to perform its parsing and temporarily work around this issue, but a similar issue could be reintroduced in a subsequent script without policies on what can and can't be used in scripts called during build.CPython versions tested on:
3.12
3.11
Operating systems tested on:
Linux
Linked PRs
The text was updated successfully, but these errors were encountered: