Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fast seek through file #4736

Merged
merged 7 commits into from
Nov 27, 2023
Merged

Fast seek through file #4736

merged 7 commits into from
Nov 27, 2023

Conversation

cphyc
Copy link
Member

@cphyc cphyc commented Nov 7, 2023

This provides additional speedups on top of #4720. We should probably wait for the former to be merged before reviewing this one.
The two improvements are:

  1. Use a seek-based approach (rather than skipping the Fortran record) to read the AMR file
  2. When reading the hydro, entirely skip those levels for which we aren't reading any cell. This adds a small overhead which appears to be negligible, and even beneficial in some cases.

Timings

Using the same script as in #4720

Timing on main (s) Timing on #4720 (s) This PR (s)
AMR 4.2 4.2 2.5
hydro (all) 62 23 21
hydro (::2) 43 16 15
hydro (::4) 39 13 12
hydro (:4) 35 9 9

Using the same script as in #4720, but centring on the 10 kpc around the largest overdensity (located at [0.50303078, 0.49916649, 0.49960136]). This still includes ~800,000 cells.
In this region, most of the cells are at the highest resolution, which means the low-resolution regions can be skipped when reading the hydro files.

Timing on main (s) Timing on #4720 (s) This PR (s) This PR + #4734 (s)
AMR 4.2 4.2 2.5 0.8
hydro (all) 7.5 2.7 2.3 3.8
hydro (::2) 6.7 2.1 1.8 3.3
hydro (::4) 6.2 1.6 1.5 3.1
hydro (:4) 5.8 1.3 1.2 2.7

If we sum the AMR + hydro, we get the following:

This PR (s) This PR + #4734 (s)
AMR + hydro(all) 5.1 4.7
AMR + hydro(::2) 4.6 4.4
AMR + hydro(::4) 4.2 4.3
AMR + hydro(:4) 3.9 3.9

Note that, when using #4734, we only get a 2s penalty for dropping the bbox argument from yt.load (which, in our case, reduces the number of domains to check intersection with from 4096 to 1079).
On the cluster I've tested, the throughput is roughly 5µs/cell (200,000 cells/s) which is on the order of 1.5Mib/s. Note that this takes into account reading (a subset of) the AMR files, seeking through the hydro files and concat the resulting arrays with no a priori knowledge of the dataset (i.e. all indexing is coming from reading the files themselves).

@cphyc cphyc force-pushed the ramses-speedup-hydro branch from a50d5dd to 0bbac11 Compare November 7, 2023 21:55
@cphyc cphyc added enhancement Making something better code frontends Things related to specific frontends labels Nov 7, 2023
@cphyc cphyc force-pushed the ramses-speedup-hydro branch from 8a4212b to 921d84f Compare November 8, 2023 17:56
@cphyc cphyc marked this pull request as ready for review November 8, 2023 17:57
@cphyc
Copy link
Member Author

cphyc commented Nov 8, 2023

@yt-fido test this please

matthewturk
matthewturk previously approved these changes Nov 9, 2023
Copy link
Member

@matthewturk matthewturk left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just great work! Thanks for such a detailed PR description -- and great changes.

yt/frontends/ramses/io_utils.pyx Outdated Show resolved Hide resolved
@cphyc
Copy link
Member Author

cphyc commented Nov 10, 2023

@yt-fido test this please

Copy link
Member

@neutrinoceros neutrinoceros left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I can't say in all honesty that I understand all of it, but I also cannot say that about the code this is replacing. I'm alright with this going in as long as tests pass, given how clear the gain is.

cpu_list = [self.domain_id - 1]
fill_hydro(
fd,
file_handler.offset,
file_handler.level_count,
cpu_list,
levels,
level_inds,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can you explain why you renamed this variable (as well as domain_inds) ?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If you read the line just below, you'll see cell_inds and file_inds. Before, I got confused by the different naming conventions and I had to read the code to actually understand all those arrays have the same shape. To make it more consistent, I renamed all the arrays to be whatever_inds.

@neutrinoceros neutrinoceros added this to the 4.4.0 milestone Nov 25, 2023
@cphyc cphyc merged commit f40a712 into yt-project:main Nov 27, 2023
11 checks passed
@cphyc cphyc deleted the ramses-speedup-hydro branch November 27, 2023 16:17
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
code frontends Things related to specific frontends enhancement Making something better
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants