Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

PyType_IsSubtype is doing excessive work in the common case #89860

Closed
itamaro mannequin opened this issue Nov 3, 2021 · 11 comments
Closed

PyType_IsSubtype is doing excessive work in the common case #89860

itamaro mannequin opened this issue Nov 3, 2021 · 11 comments
Labels
3.11 only security fixes interpreter-core (Objects, Python, Grammar, and Parser dirs) performance Performance or resource usage

Comments

@itamaro
Copy link
Mannequin

itamaro mannequin commented Nov 3, 2021

BPO 45697
Nosy @serhiy-storchaka, @sweeneyde, @Fidget-Spinner, @itamaro, @arhadthedev
PRs
  • bpo-45697: PyType_IsSubtype: Return early for identical type objects #29380
  • bpo-45697: Use PyObject_TypeCheck in type_call and binary_op1 #29392
  • Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.

    Show more details

    GitHub fields:

    assignee = None
    closed_at = <Date 2021-11-04.10:41:48.112>
    created_at = <Date 2021-11-03.01:12:41.206>
    labels = ['interpreter-core', '3.11', 'performance']
    title = 'PyType_IsSubtype is doing excessive work in the common case'
    updated_at = <Date 2021-11-04.10:41:48.111>
    user = '/~https://github.com/itamaro'

    bugs.python.org fields:

    activity = <Date 2021-11-04.10:41:48.111>
    actor = 'serhiy.storchaka'
    assignee = 'none'
    closed = True
    closed_date = <Date 2021-11-04.10:41:48.112>
    closer = 'serhiy.storchaka'
    components = ['Interpreter Core']
    creation = <Date 2021-11-03.01:12:41.206>
    creator = 'itamaro'
    dependencies = []
    files = []
    hgrepos = []
    issue_num = 45697
    keywords = ['patch']
    message_count = 11.0
    messages = ['405575', '405577', '405581', '405586', '405589', '405596', '405633', '405642', '405654', '405674', '405675']
    nosy_count = 5.0
    nosy_names = ['serhiy.storchaka', 'Dennis Sweeney', 'kj', 'itamaro', 'arhadthedev']
    pr_nums = ['29380', '29392']
    priority = 'normal'
    resolution = 'fixed'
    stage = 'resolved'
    status = 'closed'
    superseder = None
    type = 'performance'
    url = 'https://bugs.python.org/issue45697'
    versions = ['Python 3.11']

    @itamaro
    Copy link
    Mannequin Author

    itamaro mannequin commented Nov 3, 2021

    Based on real world profiling data we collected, a vast amount of PyType_IsSubtype calls are coming from type_call, when it decides whether __init__ should run or not.

    In the common case, the arguments to this call are identical, but the implementation still walks the MRO.

    By returning early for identical types, the common case can be optimized with a non-trivial performance gain.

    @itamaro itamaro mannequin added 3.11 only security fixes interpreter-core (Objects, Python, Grammar, and Parser dirs) performance Performance or resource usage labels Nov 3, 2021
    @sweeneyde
    Copy link
    Member

    Interesting. It seems like several call sites already check the equality case:

    ---- setobject.h: ----

    #define PyFrozenSet_Check(ob) \
        (Py_IS_TYPE(ob, &PyFrozenSet_Type) || \
          PyType_IsSubtype(Py_TYPE(ob), &PyFrozenSet_Type))

    ---- object.h: ----

    static inline int _PyObject_TypeCheck(PyObject *ob, PyTypeObject *type) {
        return Py_IS_TYPE(ob, type) || PyType_IsSubtype(Py_TYPE(ob), type);
    }
    #define PyObject_TypeCheck(ob, type) _PyObject_TypeCheck(_PyObject_CAST(ob), type)

    Perhaps it would be better to use PyObject_TypeCheck where applicable (such as in the type_call function you mention, and in abstract.c's binary_op1()).

    @itamaro
    Copy link
    Mannequin Author

    itamaro mannequin commented Nov 3, 2021

    Dennis, you mean something like this? itamaro@92d46b2

    sure, that would preempt the PyType_IsSubtype calls coming from these specific callsites, but the early return in PyType_IsSubtype would cover those as well as any other path, wouldn't it?

    @arhadthedev
    Copy link
    Member

    Dennis, can PyFrozenSet_Check and _PyObject_TypeCheck get rid of Py_IS_TYPE invocation then, so PyType_IsSubtype becomes the only source of truth here?

    @serhiy-storchaka
    Copy link
    Member

    The PyObject_TypeCheck() macro is used in performance critical cases. If it is not the case the PyType_IsSubtype() function can be used. Adding yet check in PyType_IsSubtype() will slow down more performance sensitive cases in which PyObject_TypeCheck() is used and speed up less performance sensitive cases. So there is no reason to add it.

    @Fidget-Spinner
    Copy link
    Member

    Dennis, you mean something like this? itamaro@92d46b2

    Not Dennis, but these changes looks good. As Serhiy mentioned, we can replace the hot callsites with PyObject_TypeCheck without slowing down PyType_IsSubtype.

    @itamaro
    Copy link
    Mannequin Author

    itamaro mannequin commented Nov 3, 2021

    thanks for the feedback & discussion, I submitted a new PR

    @serhiy-storchaka
    Copy link
    Member

    Creating a new type takes microseconds, and using PyObject_TypeCheck() instead of PyType_IsSubtype() can save nanoseconds. So, while this replacement looks harmless, its effect can hardly be measured even in microbenchmarks.

    @itamaro
    Copy link
    Mannequin Author

    itamaro mannequin commented Nov 3, 2021

    thanks for the feedback Serhiy!

    repeating my response from the PR here as well (not sure what's the proper etiquette.. :-) )

    note that this code path is not for creating new types (which is slow as you say), but for creating new instances (which is quite fast).

    I ran some synthetic benchmarks with this change and without it (on an opt build with PGO and LTO), and the gain is measurable:

    build from main (commit acc89db which is the base commit for this PR):

    >for _ in {1..10}; do ./main-opt/python -m timeit -n 10000 -s 'class Spam: pass' -- 'ten_k_spams = [Spam() for _ in range(10000)]'; done
    10000 loops, best of 5: 896 usec per loop
    10000 loops, best of 5: 887 usec per loop
    10000 loops, best of 5: 857 usec per loop
    10000 loops, best of 5: 838 usec per loop
    10000 loops, best of 5: 847 usec per loop
    10000 loops, best of 5: 863 usec per loop
    10000 loops, best of 5: 845 usec per loop
    10000 loops, best of 5: 902 usec per loop
    10000 loops, best of 5: 890 usec per loop
    10000 loops, best of 5: 875 usec per loop
    

    build with this change applied (commit 2362bf6):

    >for _ in {1..10}; do ./test-opt/python -m timeit -n 10000 -s 'class Spam: pass' -- 'ten_k_spams = [Spam() for _ in range(10000)]'; done
    10000 loops, best of 5: 833 usec per loop
    10000 loops, best of 5: 885 usec per loop
    10000 loops, best of 5: 845 usec per loop
    10000 loops, best of 5: 838 usec per loop
    10000 loops, best of 5: 833 usec per loop
    10000 loops, best of 5: 827 usec per loop
    10000 loops, best of 5: 858 usec per loop
    10000 loops, best of 5: 811 usec per loop
    10000 loops, best of 5: 843 usec per loop
    10000 loops, best of 5: 845 usec per loop
    

    also worth noting that the previous approach (GH-29380) was implemented on the Instagram Server production workload in Meta/Facebook/Instagram (choose your favorite) and resulted measurable perf gain of around 0.2%

    @serhiy-storchaka
    Copy link
    Member

    New changeset 2c045bd by Itamar Ostricher in branch 'main':
    bpo-45697: Use PyObject_TypeCheck in type_call (GH-29392)
    2c045bd

    @serhiy-storchaka
    Copy link
    Member

    Thank you for your contribution Itamar.

    @ezio-melotti ezio-melotti transferred this issue from another repository Apr 10, 2022
    facebook-github-bot pushed a commit to facebookincubator/cinder that referenced this issue Dec 27, 2022
    Summary:
    Upstream issue: python/cpython#89860
    Upstream PR: python/cpython#29392
    
    This also ports D14351113 + D32175230 (7b5bc58) from 3.8 to 3.10
    
    Reviewed By: alexmalyshev
    
    Differential Revision: D42229775
    
    fbshipit-source-id: 0bd735a
    Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
    Labels
    3.11 only security fixes interpreter-core (Objects, Python, Grammar, and Parser dirs) performance Performance or resource usage
    Projects
    None yet
    Development

    No branches or pull requests

    4 participants