Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

gh-94673: Add Per-Interpreter Storage for Static Builtin Types #94995

Closed

Conversation

ericsnowcurrently
Copy link
Member

@ericsnowcurrently ericsnowcurrently commented Jul 19, 2022

This is a precursor to storing tp_subclasses (and tp_weaklist) on the interpreter state for static builtin types. At a high level, we add the following:

  • _PyStaticType_InitBuiltin()
  • PyInterpreterState.types.builtins

We also shuffle some code around to be able to use _PyStaticType_InitBuiltin(), especially in Objects/structseq.c.

One thing to note is that we add a new "tp_" field: PyTypeObject.tp_static_builtin_index. If adding another field to PyTypeObject is too costly then we could conditionally re-purpose some other field (e.g. tp_subclasses once we don't use it for static types).

@ericsnowcurrently
Copy link
Member Author

For context, I have a branch on top of this one that takes care of tp_subclasses: main...ericsnowcurrently:shareable-static-types-tp_subclasses.

@kumaraditya303
Copy link
Contributor

Thanks for working on this!

Mostly LGTM, a few changes worth looking into:

  • The index should be reset after each finalization
  • Maybe avoid embedding the type cache in type state struct as it may negatively impact performance?
  • Maybe prefix tp_static_builtin_index with an underscore?

@ericsnowcurrently
Copy link
Member Author

Maybe avoid embedding the type cache in type state struct as it may negatively impact performance?

There should be no performance impact since it's still a single static offset from the pointer we were already dereferencing.

Maybe prefix tp_static_builtin_index with an underscore?

None of the other fields have a leading underscore, even the ones that are documented as internal.

That does remind me, I should also add the new field to the docs. (Or maybe I'll ditch the new field and not worry about it. 🙂)

@@ -65,6 +65,7 @@ lookup_maybe_method(PyObject *self, PyObject *attr, int *unbound);
static int
slot_tp_setattro(PyObject *self, PyObject *name, PyObject *value);


Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change

@ericsnowcurrently
Copy link
Member Author

The test_embed failure is because _PyStaticType_Dealloc() bails out if the type still has subclasses (it checks tp_subclasses). This implies that we're still leaking objects out of Py_Finalize(). I'll see what can be done.

@ericsnowcurrently
Copy link
Member Author

FYI, I tried printing some diagnostic info in _PyStaticType_Dealloc() for types with non-empty tp_subclasses.

minimal:

leaked dict: 1
    13 collections.defaultdict
leaked object: 26
    37 dict
     8 itertools.accumulate
    10 itertools.chain
     9 itertools.combinations
     9 itertools.combinations_with_replacement
     7 itertools.compress
     8 itertools.count
     8 itertools.cycle
     8 itertools.dropwhile
     7 itertools.filterfalse
     8 itertools.groupby
     8 itertools.islice
     6 itertools.pairwise
     9 itertools.permutations
     9 itertools.product
     9 itertools.repeat
     7 itertools.starmap
     8 itertools.takewhile
     8 itertools.zip_longest
     7 itertools._grouper
     8 itertools._tee
     5 itertools._tee_dataobject
    43 collections.deque
     8 _collections._deque_iterator
     8 _collections._deque_reverse_iterator
     9 _collections._tuplegetter

This boils down to static type subclasses in extension modules, since they don't get finalized (ever). Maybe we should ignore static types in tp_subclasses...

a more complex case

./python -E -c 'import sys ; from sysconfig import get_platform ; print("%s-%d.%d" % (get_platform(), *sys.version_info[:2])) (from make -j8):

leaked ZeroDivisionError: 2
     3 DivisionByZero
      (refcount: 2)
     2 DivisionUndefined
      (refcount: 2)
leaked TypeError: 1
     3 FloatOperation
      (refcount: 2)
leaked RuntimeError: 2
     3 RunFailedError
    15 ChannelError
leaked OSError: 2
     3 herror
     3 gaierror
leaked ArithmeticError: 2
     8 ZeroDivisionError
    30 DecimalException
leaked Exception: 12
    22 ArithmeticError
    17 OSError
    14 RuntimeError
     6 TypeError
     4 error
     3 InvalidStateError
     3 ArgumentError
     3 error
     3 RecursingInfinitelyError
     3 error
     3 OSSAudioError
     3 error
leaked BaseException: 2
    58 Exception
     3 CancelledError
leaked tuple: 2
     8 datetime.IsoCalendarDate
     2 DecimalTuple
leaked staticmethod: 1
     4 abstractstaticmethod
leaked property: 1
     4 abstractproperty
leaked list: 2
     3 MyList
     8 xxsubtype.spamlist
leaked dict: 3
    13 collections.defaultdict
     4 StgDict
     6 xxsubtype.spamdict
leaked classmethod: 1
     4 abstractclassmethod
leaked type: 7
    17 ABCMeta
    12 _ctypes.PyCStructType
    12 _ctypes.UnionType
    11 _ctypes.PyCPointerType
    10 _ctypes.PyCArrayType
    10 _ctypes.PyCSimpleType
    10 _ctypes.PyCFuncPtrType
leaked object: 91
    49 type
    13 classmethod
    42 dict
    41 list
    20 property
    14 staticmethod
    29 tuple
    59 BaseException
     9 ModuleSpec
     5 BuiltinImporter
     9 _LoaderBasics
    10 FileLoader
    16 _abc._abc_data
     2 ABC
     6 Iterable
     7 Sized
     6 Container
     8 itertools.accumulate
     9 itertools.combinations
     9 itertools.combinations_with_replacement
     8 itertools.cycle
     8 itertools.dropwhile
     8 itertools.takewhile
     8 itertools.islice
     7 itertools.starmap
    10 itertools.chain
     7 itertools.compress
     7 itertools.filterfalse
     8 itertools.count
     8 itertools.zip_longest
     6 itertools.pairwise
     9 itertools.permutations
     9 itertools.product
     9 itertools.repeat
     8 itertools.groupby
     7 itertools._grouper
     8 itertools._tee
     5 itertools._tee_dataobject
    43 collections.deque
     8 _collections._deque_iterator
     8 _collections._deque_reverse_iterator
     9 _collections._tuplegetter
     4 _IterationGuard
     5 WeakSet
    39 _socket.socket
    17 _struct.Struct
     7 _struct.unpack_iterator
     8 _asyncio.FutureIter
     5 TaskStepMethWrapper
     3 _RunningLoopHolder
    31 _asyncio.Future
     4 CArgObject
     2 _ctypes.CThunkObject
    21 _ctypes._CData
     9 _ctypes.CField
     4 _ctypes.DictRemover
     2 _ctypes.StructParam_Type
    76 _curses.window
    38 datetime.date
    29 datetime.time
    36 datetime.timedelta
    13 datetime.tzinfo
   100 decimal.Decimal
    85 decimal.Context
    21 decimal.SignalDictMixin
     5 decimal.ContextManager
     8 Number
     4 _elementtree._element_iterator
    10 xml.etree.ElementTree.TreeBuilder
    35 xml.etree.ElementTree.Element
    11 xml.etree.ElementTree.XMLParser
    17 _multiprocessing.SemLock
     2 _pickle.Pdata
     8 _pickle.PicklerMemoProxy
     8 _pickle.UnpicklerMemoProxy
    12 _pickle.Pickler
     9 _pickle.Unpickler
     9 matmulType
     4 ipowType
     7 awaitType
     3 GenericAlias
     3 Generic
     9 MethInstance
     9 MethClass
     9 MethStatic
     4 _testcapi.HeapCType
     4 _testcapi.ContainerNoGC
    11 MethodDescriptorBase
    18 _xxsubinterpreters.ChannelID
    26 ossaudiodev.oss_audio_device
    13 ossaudiodev.oss_mixer_device

@ericsnowcurrently
Copy link
Member Author

To solve the test_embed failure, I had to stop ignoring types in _PyStaticType_Dealloc() when tp_subclasses was not cleared. This means the subclasses (from non-PEP 489 extension modules) are leaking. However, they were already leaking.

@ericsnowcurrently
Copy link
Member Author

I'm going to split this PR up.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants