Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

perf: Cache module dirs #566

Open
wants to merge 1 commit into
base: master
Choose a base branch
from

Conversation

Zirak
Copy link

@Zirak Zirak commented Jan 4, 2025

Calling dir on modules is expensive - in a benchmark, it took up 2% of the time when calling freeze_time, when _get_cached_module_attributes accounted for 4% of the time.

This commit hashes dir calls per module id. This drastically speeds it up. Running with pytest-benchmark:

import time
from datetime import datetime, timezone
from unittest.mock import patch

from freezegun import freeze_time

def setup_with_freezegun() -> None:
    current = datetime(2025, 1, 1, tzinfo=timezone.utc)
    with freeze_time(current):
        time.time()

def setup_with_patch() -> None:
    current = datetime(2025, 1, 1, tzinfo=timezone.utc)
    current_time = current.timestamp()
    with patch("time.time", return_value=current_time):
        time.time()

def test_benchmark_freezegun(benchmark) -> None:
    benchmark(setup_with_freezegun)

def test_benchmark_patch(benchmark) -> None:
    benchmark(setup_with_patch)

We get the following:

----------------------------------------------------------------------------------------------------- benchmark: 4 tests -----------------------------------------------------------------------------------------------------
Name (time in us)                                  Min                   Max                  Mean              StdDev                Median                IQR            Outliers          OPS            Rounds  Iterations
------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
test_benchmark_patch (0001_5f171db)            62.3360 (1.0)      8,936.2020 (14.71)       84.5774 (1.0)      175.7804 (7.06)        70.2175 (1.0)      13.8290 (1.0)         5;839  11,823.4918 (1.0)        4730           1
test_benchmark_patch (0002_5f171db)            62.3370 (1.00)     9,183.8990 (15.11)       86.2260 (1.02)     192.4761 (7.73)        72.3760 (1.03)     14.2655 (1.03)        3;676  11,597.4350 (0.98)       3949           1
test_benchmark_freezegun (0002_5f171db)       376.8070 (6.04)       607.6380 (1.0)        405.6626 (4.80)      24.8866 (1.0)        400.5450 (5.70)     19.8857 (1.44)         10;4   2,465.1026 (0.21)        121           1
test_benchmark_freezegun (0001_5f171db)     1,983.5520 (31.82)    2,392.4130 (3.94)     2,103.8566 (24.87)     74.1705 (2.98)     2,083.2020 (29.67)    69.4740 (5.02)         14;5     475.3176 (0.04)         82           1
------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------

Where 0001 is before this commit, and 0002 is after: Mean time reduced 2,103 ms to 405 ms.

This comes with a tradeoff, reflected in the added tests: Dynamically added attributes will not be picked up. For example:

time_after_start = None

def add_after_start() -> None:
    import time
    import sys

    setattr(sys.modules[__name__], 'dynamic_time_func', time.time)

Because the result of dir is cached, the dynamic_time_func is not picked up.

I leave it up to the maintainers to decide if this tradeoff is worthwhile, considering that there are some other existing blindspots (also shown in added tests).

This commit does not include the benchmark itself - if desired, it can trivially be added.

Calling `dir` on modules is expensive - in a benchmark, it took up 2% of the
time when calling `freeze_time`, when `_get_cached_module_attributes` accounted
for 4% of the time.

This commit hashes `dir` calls per module id. This drastically speeds it
up. Running with `pytest-benchmark`:

```python
import time
from datetime import datetime, timezone
from unittest.mock import patch

from freezegun import freeze_time

def setup_with_freezegun() -> None:
    current = datetime(2025, 1, 1, tzinfo=timezone.utc)
    with freeze_time(current):
        time.time()

def setup_with_patch() -> None:
    current = datetime(2025, 1, 1, tzinfo=timezone.utc)
    current_time = current.timestamp()
    with patch("time.time", return_value=current_time):
        time.time()

def test_benchmark_freezegun(benchmark) -> None:
    benchmark(setup_with_freezegun)

def test_benchmark_patch(benchmark) -> None:
    benchmark(setup_with_patch)
```

We get the following:
```
----------------------------------------------------------------------------------------------------- benchmark: 4 tests -----------------------------------------------------------------------------------------------------
Name (time in us)                                  Min                   Max                  Mean              StdDev                Median                IQR            Outliers          OPS            Rounds  Iterations
------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
test_benchmark_patch (0001_5f171db)            62.3360 (1.0)      8,936.2020 (14.71)       84.5774 (1.0)      175.7804 (7.06)        70.2175 (1.0)      13.8290 (1.0)         5;839  11,823.4918 (1.0)        4730           1
test_benchmark_patch (0002_5f171db)            62.3370 (1.00)     9,183.8990 (15.11)       86.2260 (1.02)     192.4761 (7.73)        72.3760 (1.03)     14.2655 (1.03)        3;676  11,597.4350 (0.98)       3949           1
test_benchmark_freezegun (0002_5f171db)       376.8070 (6.04)       607.6380 (1.0)        405.6626 (4.80)      24.8866 (1.0)        400.5450 (5.70)     19.8857 (1.44)         10;4   2,465.1026 (0.21)        121           1
test_benchmark_freezegun (0001_5f171db)     1,983.5520 (31.82)    2,392.4130 (3.94)     2,103.8566 (24.87)     74.1705 (2.98)     2,083.2020 (29.67)    69.4740 (5.02)         14;5     475.3176 (0.04)         82           1
------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
```

Where `0001` is before this commit, and `0002` is after: Mean time reduced 2,103
ms to 405 ms.

This comes with a tradeoff, reflected in the added tests: Dynamically added
attributes *will not be picked up*. For example:

```python
time_after_start = None

def add_after_start() -> None:
    import time
    import sys

    setattr(sys.modules[__name__], 'dynamic_time_func', time.time)
```

Because the result of `dir` is cached, the `dynamic_time_func` is not picked
up.

I leave it up to the maintainers to decide if this tradeoff is worthwhile,
considering that there are some other existing blindspots (also shown in added
tests).

This commit does not include the benchmark itself - if desired, it can trivially
be added.
@Zirak
Copy link
Author

Zirak commented Jan 4, 2025

Here's the result of pytest-benchmark compare --histogram:
benchmark_20250104_171648

Here's the icicle chart with snakeviz from before the commit:
image

And after:
image

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

1 participant