Store prefetch results in LFC cache once as soon as they are received #10442
Conversation
7590 tests run: 7213 passed, 0 failed, 377 skipped (full report)
Flaky tests (2): Postgres 17, Postgres 15
Code coverage* (full report)
* collected from Rust tests only
The comment gets automatically updated with the latest test results.
a7ed5d3 at 2025-02-21T14:42:27.072Z
Thanks! I've only done a high-level review, but the approach makes sense to me. I expect the synchronization overhead would be negligible since this is on an IO path, but would be interested to see how benchmarks are affected.
Force-pushed from 4c3b6ca to 071142f
I'm missing the documentation as to why it's OK to side-load the data here without holding a buffer lock.
Could you add that info?
Force-pushed from 071142f to 582953e
Below are some results illustrating the importance of storing prefetch results in the LFC. Configuration:
Database
Query (range-query.sql):
Benchmark
Results:
Force-pushed from 15e88f7 to bccf077
Force-pushed from d40a973 to 81c5642
One last comment: Please coordinate with the release plans for 17.3 this week before merging, so that we have an appropriate amount of time for this to stew on staging before this is released to all customers.
It looks like the added test would be flaky from the beginning. I tried to run it in parallel and got:
All instances failed like this:
What do you think of increasing statement_timeout for that insert?
Moreover, I could also get the following error:
Or with the complete plan:
Wouldn't it make sense to debug the new test before merging this PR?
+1, thanks for looking at it. @knizhnik can you please debug the test failures?
@alexanderlaw - I wonder how you are running this test in parallel? Are there N independent Neon instances?
Running this test in a loop with 5 instances in parallel, I can see the following numbers for the first 4 queries (with grep "Unused prefetches:") when it passes:
But when it fails, the numbers are:
I prefer to use something like:
So I have N independent pytest and neon instances. But the same effect wrt this test can also be achieved with just:
The above recipe produces for me:
Force-pushed from 66fa88a to c0af808
Commit c0af808 makes the test stable:
Thank you!
I think I've discovered another anomaly related to this PR; investigating...
Please try the following modification for test_pg_regress. It makes test_runner/regress/test_pg_regress.py::test_pg_regress[release-pg17-None] or [debug-...] fail roughly every second run with miscellaneous errors like:
and so on...
It fails for me too, but with a more serious error:
I'm testing commit db32086.
One more fix... |
Still no luck:
I wonder if that test survives, say, 10 iterations in your environment? |
Will try. |
Sorry, found one more bug:( |
As I understand, changes since 2025-02-13, roughly:
were not reviewed explicitly; wouldn't it make sense to review them now?
The patch seems to regress several |
The failed assertion:
which means that the compute is prefetching more pages than before? i.e., the pageserver receives more requests than the compute's getpage count
## Problem

PR #10442 added the `prefetch_lookup` function, which changed the handling of getpage requests in compute.

Before:

1. Lookup in LFC (return if found)
2. Register prefetch buffer
3. Wait for prefetch result (increment getpage_hist)

Now:

1. Lookup prefetch ring (return if the prefetch request has already completed)
2. Lookup in LFC (return if found)
3. Register prefetch buffer
4. Wait for prefetch result (increment getpage_hist)

So if the prefetch result is already available, the getpage histogram is not incremented. This caused failures of some test_throughput benchmarks: https://neondb.slack.com/archives/C033RQ5SPDH/p1740425527249499

## Summary of changes

Increment the getpage histogram in `prefetch_lookup`.

Co-authored-by: Konstantin Knizhnik <knizhnik@neon.tech>
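The lookup order and the fix described above can be modeled with a short Python sketch. This is not Neon's actual code (the real implementation is C in the compute); `prefetch_ring`, `lfc`, `getpage_hist`, and `wait_prefetch` are illustrative names:

```python
# Hypothetical model of the getpage lookup order after PR #10442,
# including the fix: the histogram is also incremented when a page
# is served from the completed-prefetch ring.

class Compute:
    def __init__(self):
        self.prefetch_ring = {}   # completed prefetch results, keyed by page tag
        self.lfc = {}             # local file cache contents
        self.getpage_hist = 0     # getpage requests answered by the pageserver

    def wait_prefetch(self, tag):
        # Stand-in for registering a prefetch buffer and waiting for
        # the pageserver response.
        return f"page-{tag}"

    def getpage(self, tag):
        # 1. Lookup the prefetch ring first (new in PR #10442).
        if tag in self.prefetch_ring:
            self.getpage_hist += 1        # the fix: count this path too
            return self.prefetch_ring.pop(tag)
        # 2. Lookup in LFC; a hit involves no pageserver request.
        if tag in self.lfc:
            return self.lfc[tag]
        # 3. Register a prefetch buffer and wait for the result.
        page = self.wait_prefetch(tag)
        self.getpage_hist += 1
        return page
```

Without the increment on the first path, a request satisfied by an already-completed prefetch would be invisible to the getpage histogram, which is what broke the benchmark assertions.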
Problem

Prefetch is performed locally, so different backends can request the same pages from the pageserver. Such duplicated requests increase pageserver load and network traffic.

Making prefetch global seems very difficult and undesirable, because different queries can access chunks at different speeds. Storing prefetched chunks in the LFC will not completely eliminate duplicates, but it can minimize such requests.

The problem with storing prefetch results in the LFC is that in this case the page is not protected by a shared buffer lock, so we have to perform extra synchronization on the LFC side.

See:
https://neondb.slack.com/archives/C0875PUD0LC/p1736772890602029?thread_ts=1736762541.116949&cid=C0875PUD0LC
@MMeent's implementation of prewarm: see #10312

Summary of changes

Use condition variables to synchronize access to LFC entries.
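The condition-variable scheme can be sketched as a minimal Python model. The actual implementation is C inside the Neon Postgres extension; the entry layout and method names below are illustrative assumptions, and the model only shows the exclusion pattern (writer marks the entry busy while side-loading, readers wait until it is idle):

```python
import threading

class LFCEntry:
    """Toy model of one LFC cache entry guarded by a condition variable."""

    def __init__(self):
        self.cond = threading.Condition()
        self.busy = False     # a side-load write is in progress
        self.page = None      # cached page contents

    def store_prefetch(self, page):
        # Writer: side-load the prefetched page. No shared buffer lock
        # protects this page, so the entry-level condition variable
        # provides the needed exclusion.
        with self.cond:
            while self.busy:
                self.cond.wait()
            self.busy = True
        # Real code would write to the cache file here, outside the lock,
        # so the (potentially slow) IO does not block other entries.
        self.page = page
        with self.cond:
            self.busy = False
            self.cond.notify_all()

    def read(self):
        # Reader: wait until no write to this entry is in progress.
        with self.cond:
            while self.busy:
                self.cond.wait()
            return self.page
```

Keeping the actual IO outside the mutex while signalling completion through the condition variable is the usual way to avoid serializing all cache traffic behind one slow write.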