Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

pageserver: handle in-memory layer overlaps with persistent layers #11000

Open
wants to merge 8 commits into
base: main
Choose a base branch
from

Conversation

VladLazar
Copy link
Contributor

@VladLazar VladLazar commented Feb 26, 2025

Problem

Image layers may be nested inside in-memory layers as diagnosed here. The read path doesn't support this and may skip over the image layer,
resulting in a failure to reconstruct the page.

Summary of changes

We already support nesting of image layers inside delta layers. The logic lives in LayerMap::select_layer.
The main goal of this PR is to propagate the candidate in-memory layer down to that point and update
the selection logic.

Important changes are:

  1. Support partial reads for the in-memory layer. Previously, we could only specify the start LSN of the read.
    We need to control the end LSN too.
  2. LayerMap::ranged_search considers in-memory layers too. Previously, the search for in-memory layers
    was done explicitly in Timeline::get_reconstruct_data_timeline. Note that LayerMap::ranged_search now returns
    a weak readable layer which the LayerManager can upgrade. This dance is such that we can unit test the layer selection logic.
  3. Update LayerMap::select_layer to consider the candidate in-memory layer too

Loosely related drive bys:

  1. Remove the "keys not found" tracking in the ranged search. This wasn't used anywhere and it just complicates things.
  2. Remove the difficulty map stuff from the layer map. Again, not used anywhere.

Closes #9185
Closes #10720

@VladLazar VladLazar force-pushed the vlad/handle-img-and-inmem-layer-overlaps branch from 3172c6c to e49730e Compare February 26, 2025 18:43
Copy link

github-actions bot commented Feb 26, 2025

7744 tests run: 7364 passed, 0 failed, 380 skipped (full report)


Flaky tests (1)

Postgres 14

Code coverage* (full report)

  • functions: 32.9% (8675 of 26388 functions)
  • lines: 48.9% (73864 of 151056 lines)

* collected from Rust tests only


The comment gets automatically updated with the latest test results
9adff69 at 2025-02-27T17:47:08.430Z :recycle:

@VladLazar VladLazar force-pushed the vlad/handle-img-and-inmem-layer-overlaps branch from e49730e to 0cd392b Compare February 27, 2025 16:01
@VladLazar VladLazar force-pushed the vlad/handle-img-and-inmem-layer-overlaps branch from 0cd392b to 9adff69 Compare February 27, 2025 16:28
@VladLazar VladLazar changed the title [WIP] pageserver: handle in-memory layer overlaps with persistent layers pageserver: handle in-memory layer overlaps with persistent layers Feb 27, 2025
@VladLazar VladLazar requested a review from skyzh February 27, 2025 16:38
@VladLazar VladLazar marked this pull request as ready for review February 27, 2025 16:38
@VladLazar VladLazar requested a review from a team as a code owner February 27, 2025 16:38
Copy link
Member

@skyzh skyzh left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Did a first pass of the review and I think all the logic makes sense to me if we don't have any overlaps of delta/in-memory layers. I haven't looked at the test cases and I'll do it tomorrow.

I'm concerned about how do we handle the case of delta/in-memory layer overlap. In theory, this would not happen, because the flush layer operation creates a delta layer with the same LSN as the in-memory layer.

If it did happen, what kind of situations do we want to consider in this patch? i.e., delta layer key/LSN == in_memory layer? Or some logic errors creating overlapping but not exactly same layers? What about having a delta layer above the in-memory layer?

While the code looks correct to handle the general case where the system operates correctly, I think I'll need to understand which kind of anomalies we want to consider here and re-read the code to ensure that all such cases get handled in an expected way.

});
}
}
for range in unmapped_keyspace.ranges.iter() {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thinking aloud: if we flush an in-memory layer to the disk, between two layer_map lock acquisition in the loop, correctness of this code path requires that the flushed delta layers have exactly the same content as the in-memory layer, which is true for now.

///
/// Layer types have an in implicit priority (image > delta > in-memory). For instance,
/// if we have the option of reading an LSN range from both an image and a delta, we
/// should read from the image.
Copy link
Member

@skyzh skyzh Feb 27, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In the future, we could probably refactor this function into

select_layer(delta, image, in_memory, end_lsn, image_processed)

(when we want to add partial image layer support -- for now we always stop when reading the image layer)

The LSN we use on the read path is actually a tuple of (LSN, bool image). We can think of the bool image_processed as a X.5 LSN. It might make the read path more clear (?)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
2 participants