GH-98686: Get rid of BINARY_OP_GENERIC and COMPARE_OP_GENERIC #99399

Merged: 3 commits, Nov 17, 2022


brandtbucher
Member

@brandtbucher brandtbucher commented Nov 11, 2022

This optimization made more sense before we had exponential backoff, back when the unquickened versions of these instructions were just lying around waiting to be used. Nowadays it just complicates things, and it doesn't actually seem to improve performance:

All benchmarks:
===============

Slower (16):
- regex_effbot: 3.34 ms +- 0.02 ms -> 3.69 ms +- 0.01 ms: 1.10x slower
- regex_v8: 21.2 ms +- 0.1 ms -> 22.7 ms +- 0.2 ms: 1.07x slower
- regex_dna: 202 ms +- 1 ms -> 210 ms +- 1 ms: 1.04x slower
- telco: 6.25 ms +- 0.22 ms -> 6.38 ms +- 0.16 ms: 1.02x slower
- genshi_xml: 46.9 ms +- 0.9 ms -> 47.7 ms +- 0.5 ms: 1.02x slower
- mdp: 2.51 sec +- 0.02 sec -> 2.56 sec +- 0.01 sec: 1.02x slower
- pathlib: 17.3 ms +- 0.2 ms -> 17.5 ms +- 0.4 ms: 1.01x slower
- hexiom: 6.06 ms +- 0.02 ms -> 6.14 ms +- 0.04 ms: 1.01x slower
- scimark_fft: 311 ms +- 3 ms -> 314 ms +- 3 ms: 1.01x slower
- scimark_sparse_mat_mult: 4.07 ms +- 0.07 ms -> 4.11 ms +- 0.11 ms: 1.01x slower
- mako: 9.52 ms +- 0.06 ms -> 9.59 ms +- 0.08 ms: 1.01x slower
- scimark_sor: 103 ms +- 1 ms -> 104 ms +- 1 ms: 1.01x slower
- logging_silent: 91.6 ns +- 1.1 ns -> 92.1 ns +- 1.4 ns: 1.01x slower
- nbody: 94.0 ms +- 1.1 ms -> 94.5 ms +- 1.1 ms: 1.01x slower
- gunicorn: 1.07 ms +- 0.01 ms -> 1.08 ms +- 0.01 ms: 1.00x slower
- regex_compile: 128 ms +- 2 ms -> 128 ms +- 1 ms: 1.00x slower

Faster (35):
- unpack_sequence: 46.1 ns +- 4.1 ns -> 43.3 ns +- 0.6 ns: 1.06x faster
- genshi_text: 21.1 ms +- 0.2 ms -> 20.5 ms +- 0.3 ms: 1.03x faster
- pidigits: 199 ms +- 0 ms -> 193 ms +- 0 ms: 1.03x faster
- deepcopy_reduce: 2.94 us +- 0.04 us -> 2.86 us +- 0.04 us: 1.03x faster
- spectral_norm: 95.9 ms +- 3.8 ms -> 93.8 ms +- 2.1 ms: 1.02x faster
- unpickle_pure_python: 204 us +- 2 us -> 200 us +- 2 us: 1.02x faster
- sqlglot_parse: 1.34 ms +- 0.01 ms -> 1.31 ms +- 0.01 ms: 1.02x faster
- pickle_list: 4.03 us +- 0.03 us -> 3.96 us +- 0.05 us: 1.02x faster
- generators: 78.7 ms +- 0.5 ms -> 77.4 ms +- 0.5 ms: 1.02x faster
- json_loads: 24.2 us +- 0.4 us -> 23.9 us +- 0.2 us: 1.02x faster
- chameleon: 6.50 ms +- 0.08 ms -> 6.41 ms +- 0.09 ms: 1.01x faster
- sqlglot_transpile: 1.63 ms +- 0.02 ms -> 1.60 ms +- 0.02 ms: 1.01x faster
- deepcopy_memo: 34.1 us +- 0.5 us -> 33.6 us +- 0.9 us: 1.01x faster
- raytrace: 281 ms +- 5 ms -> 278 ms +- 2 ms: 1.01x faster
- coverage: 98.1 ms +- 1.2 ms -> 96.8 ms +- 1.1 ms: 1.01x faster
- xml_etree_iterparse: 104 ms +- 2 ms -> 103 ms +- 2 ms: 1.01x faster
- deepcopy: 328 us +- 3 us -> 324 us +- 3 us: 1.01x faster
- xml_etree_generate: 76.9 ms +- 0.7 ms -> 76.1 ms +- 0.6 ms: 1.01x faster
- xml_etree_process: 53.2 ms +- 0.7 ms -> 52.6 ms +- 0.6 ms: 1.01x faster
- sympy_str: 283 ms +- 4 ms -> 280 ms +- 4 ms: 1.01x faster
- crypto_pyaes: 74.1 ms +- 0.7 ms -> 73.4 ms +- 0.8 ms: 1.01x faster
- sympy_sum: 163 ms +- 1 ms -> 162 ms +- 2 ms: 1.01x faster
- pyflate: 394 ms +- 4 ms -> 391 ms +- 4 ms: 1.01x faster
- sqlglot_normalize: 106 ms +- 1 ms -> 105 ms +- 1 ms: 1.01x faster
- sympy_expand: 457 ms +- 3 ms -> 453 ms +- 5 ms: 1.01x faster
- django_template: 32.6 ms +- 0.5 ms -> 32.4 ms +- 0.5 ms: 1.01x faster
- sqlglot_optimize: 50.9 ms +- 0.2 ms -> 50.6 ms +- 0.3 ms: 1.01x faster
- float: 71.5 ms +- 0.8 ms -> 71.1 ms +- 0.9 ms: 1.01x faster
- sympy_integrate: 20.3 ms +- 0.1 ms -> 20.2 ms +- 0.1 ms: 1.01x faster
- chaos: 67.7 ms +- 1.0 ms -> 67.3 ms +- 0.7 ms: 1.01x faster
- pickle_pure_python: 282 us +- 2 us -> 281 us +- 3 us: 1.00x faster
- pickle_dict: 31.3 us +- 0.1 us -> 31.3 us +- 0.1 us: 1.00x faster
- python_startup: 8.60 ms +- 0.01 ms -> 8.58 ms +- 0.01 ms: 1.00x faster
- python_startup_no_site: 6.27 ms +- 0.01 ms -> 6.26 ms +- 0.01 ms: 1.00x faster
- 2to3: 246 ms +- 1 ms -> 246 ms +- 1 ms: 1.00x faster

Benchmark hidden because not significant (31): aiohttp, async_tree_none, async_tree_cpu_io_mixed, async_tree_io, async_tree_memoization, coroutines, deltablue, dulwich_log, fannkuch, go, html5lib, json, json_dumps, logging_format, logging_simple, meteor_contest, mypy, nqueens, pickle, pprint_safe_repr, pprint_pformat, pycparser, richards, scimark_lu, scimark_monte_carlo, sqlite_synth, thrift, tornado_http, unpickle, unpickle_list, xml_etree_parse

Geometric mean: 1.00x faster

@brandtbucher brandtbucher added the interpreter-core (Objects, Python, Grammar, and Parser dirs) label Nov 11, 2022
@brandtbucher brandtbucher self-assigned this Nov 11, 2022
Member

@gvanrossum gvanrossum left a comment

Thanks for the merge conflict. :-/

Don't you need to change the magic number? All the opcodes are different.

@brandtbucher
Member Author

> Thanks for the merge conflict. :-/

I'll wait until after #99313 is merged to land this. Anything else in flight that I should wait for?

> Don't you need to change the magic number? All the opcodes are different.

We don't need to bump the magic number when adding/removing/changing specialized instructions or superinstructions, since those never get included in pycs! It's a pretty nice design.
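A quick way to see this from Python, as a minimal sketch: the assumption here is that on CPython 3.11 and later `co_code` reports the de-specialized bytecode, which is also what gets written to a pyc. The `add` function and the `bytecode_is_stable` helper are hypothetical names used only for illustration.

```python
def add(a, b):
    # A binary op that the interpreter may specialize once it is hot.
    return a + b


def bytecode_is_stable(func, warmup=1000):
    """Return True if func's reported bytecode is the same before and
    after running it enough times for specialization to kick in."""
    before = func.__code__.co_code
    for _ in range(warmup):
        func(1, 2)
    after = func.__code__.co_code
    return before == after


print(bytecode_is_stable(add))
```

If the reported bytecode never changes, then the specialized opcodes stay an in-memory detail of the running interpreter, which is why renumbering them does not invalidate existing pycs.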

@gvanrossum
Member

In that case I welcome your review of #99313.

@gvanrossum
Member

You have the baton. Have fun fixing the merge conflicts. :-)

Member

@gvanrossum gvanrossum left a comment

IIUC this means that un-specializable binary ops and comparisons will now be a teensy bit slower, because they always have to check the counter. But it seems clear that this happens rarely enough.
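To make the trade-off concrete, here is a toy model of an adaptive instruction with exponential backoff. It is a sketch, not CPython's implementation: the `AdaptiveOp` class, the `MAX_BACKOFF` cap, and the exact counter arithmetic are assumptions chosen for illustration.

```python
class AdaptiveOp:
    """Toy model of an adaptive instruction with a backoff counter."""

    MAX_BACKOFF = 12  # hypothetical cap, chosen for illustration only

    def __init__(self):
        self.backoff = 0   # exponent controlling the next wait
        self.counter = 0   # executions left before the next attempt
        self.attempts = 0  # specialization attempts made so far

    def execute(self, can_specialize):
        # This check is what every execution pays for, specializable or not.
        if self.counter == 0:
            self.attempts += 1
            if can_specialize:
                return "specialized"
            # Failure: wait roughly twice as long before trying again.
            self.backoff = min(self.backoff + 1, self.MAX_BACKOFF)
            self.counter = (1 << self.backoff) - 1
        else:
            self.counter -= 1
        return "generic"


def count_specialization_attempts(executions):
    """Run an op that can never be specialized and count the attempts."""
    op = AdaptiveOp()
    for _ in range(executions):
        op.execute(can_specialize=False)
    return op.attempts


print(count_specialization_attempts(1000))
```

In this model, 1000 executions of an operation that never specializes trigger only 9 specialization attempts; every other execution pays just the counter check and decrement before taking the generic path.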

@brandtbucher brandtbucher merged commit 8555dee into python:main Nov 17, 2022
@gvanrossum gvanrossum deleted the generic-ops branch November 17, 2022 22:56
Labels
interpreter-core (Objects, Python, Grammar, and Parser dirs)
3 participants