
Store prefetch results in LFC cache once as soon as they are received #10442

Merged
merged 25 commits into main from prefetch_in_lfc
Feb 21, 2025

Conversation

knizhnik
Contributor

Problem

Prefetch is performed locally, so different backends can request the same pages from the pageserver (PS).
Such duplicated requests increase pageserver load and network traffic.
Making prefetch global seems very difficult and undesirable, because different queries can access chunks at different speeds. Storing prefetched chunks in the LFC will not completely eliminate duplicates, but it can minimize such requests.

The problem with storing prefetch results in the LFC is that in this case the page is not protected by a shared buffer lock.
So we have to perform extra synchronization on the LFC side.

See:
https://neondb.slack.com/archives/C0875PUD0LC/p1736772890602029?thread_ts=1736762541.116949&cid=C0875PUD0LC

@MMeent's implementation of prewarm: see #10312

Summary of changes

Use condition variables to synchronize access to LFC entries.
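
For illustration only, here is a minimal sketch of that condition-variable pattern. The structure layout and helper names (FileCacheEntry, lfc_wait_page_ready, lfc_publish_prefetched_page, the page_state bitmap) are hypothetical and differ from the real file_cache.c; only the PostgreSQL ConditionVariable API calls are real. The idea: a backend that finds a chunk page still being side-loaded by another backend sleeps on the entry's condition variable until the writer publishes the page.

/* Hypothetical sketch: names and layout are illustrative, not the real file_cache.c. */
#include "postgres.h"
#include "storage/buf_internals.h"      /* BufferTag */
#include "storage/condition_variable.h"
#include "utils/wait_event.h"           /* PG_WAIT_EXTENSION */

#define PAGE_WRITING   0x1   /* some backend is currently side-loading this page */
#define PAGE_AVAILABLE 0x2   /* page content in the cache file is valid */

typedef struct FileCacheEntry
{
    BufferTag   key;                 /* first page of the chunk */
    uint32      page_state[32];      /* per-page state bits, changed under the LFC lock (omitted here) */
    ConditionVariable cv;            /* broadcast whenever a page's state changes */
} FileCacheEntry;

/* Reader: wait until a page that another backend is side-loading becomes readable. */
static void
lfc_wait_page_ready(FileCacheEntry *entry, int offs)
{
    ConditionVariablePrepareToSleep(&entry->cv);
    while (entry->page_state[offs] & PAGE_WRITING)
        ConditionVariableSleep(&entry->cv, PG_WAIT_EXTENSION);
    ConditionVariableCancelSleep();
}

/* Writer: publish a prefetched page and wake up any waiting backends. */
static void
lfc_publish_prefetched_page(FileCacheEntry *entry, int offs)
{
    entry->page_state[offs] = PAGE_AVAILABLE;   /* clears PAGE_WRITING */
    ConditionVariableBroadcast(&entry->cv);
}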


github-actions bot commented Jan 19, 2025

7590 tests run: 7213 passed, 0 failed, 377 skipped (full report)


Flaky tests (2)

Postgres 17

Postgres 15

Code coverage* (full report)

  • functions: 32.9% (8617 of 26228 functions)
  • lines: 48.7% (72586 of 149106 lines)

* collected from Rust tests only


The comment gets automatically updated with the latest test results
a7ed5d3 at 2025-02-21T14:42:27.072Z :recycle:

Contributor

@erikgrinaker left a comment


Thanks! I've only done a high-level review, but the approach makes sense to me. I expect the synchronization overhead would be negligible since this is on an IO path, but would be interested to see how benchmarks are affected.

Contributor

@MMeent left a comment


I'm missing the documentation as to why it's OK to side-load the data here without holding a buffer lock.

Could you add that info?

@knizhnik
Contributor Author

knizhnik commented Jan 28, 2025

Below are some results illustrating the importance of storing prefetch results in the LFC.
Certainly it is a very artificial workload, but still IMHO quite relevant.

Configuration:

neon.max_file_cache_size=1GB
neon.file_cache_size_limit=1GB
shared_buffers=256kB
random_page_cost=1
effective_io_concurrency=10

Database

create table t(pk integer, sk integer, filler text default repeat('x',200));
insert into t values (generate_series(1,1000000),random()*1000000);
create index on t(sk);

Query (range-query.sql):

select sum(sk) from t where sk between 100000 and 100100;

Benchmark

pgbench -n -c 1 -M prepared -T 100 -f range-query.sql

Results:

| branch                |   TPS |
| --------------------- | ----- |
| main                  |   273 |
| prefetch_in_lfc       |  1535 |
| sideload-prefetches   |  1492 |

Contributor

@MMeent left a comment


One last comment: Please coordinate with the release plans for 17.3 this week before merging, so that we have an appropriate amount of time for this to stew on staging before this is released to all customers.

@alexanderlaw
Contributor

It looks like the added test is flaky from the beginning. I tried to run it in parallel and got:

3       =================== 1 failed, 1 warning in 147.28s (0:02:27) ===================
4       =================== 1 failed, 1 warning in 148.79s (0:02:28) ===================
12      =================== 1 failed, 1 warning in 149.77s (0:02:29) ===================
13      =================== 1 failed, 1 warning in 149.77s (0:02:29) ===================
8       =================== 1 failed, 1 warning in 150.34s (0:02:30) ===================
14      =================== 1 failed, 1 warning in 150.32s (0:02:30) ===================
9       =================== 1 failed, 1 warning in 150.29s (0:02:30) ===================
2       =================== 1 failed, 1 warning in 150.29s (0:02:30) ===================
5       =================== 1 failed, 1 warning in 150.75s (0:02:30) ===================
1       =================== 1 failed, 1 warning in 150.77s (0:02:30) ===================
6       =================== 1 failed, 1 warning in 150.76s (0:02:30) ===================
11      =================== 1 failed, 1 warning in 150.76s (0:02:30) ===================
10      =================== 1 failed, 1 warning in 150.75s (0:02:30) ===================
7       =================== 1 failed, 1 warning in 150.75s (0:02:30) ===================
15      =================== 1 failed, 1 warning in 152.32s (0:02:32) ===================

All instances failed like this:

15              cur.execute("create table t(pk integer, sk integer, filler text default repeat('x',200))")
15      >       cur.execute("insert into t values (generate_series(1,1000000),random()*1000000)")
15      E       psycopg2.errors.QueryCanceled: canceling statement due to statement timeout

What do you think of increasing statement_timeout for that insert?

@alexanderlaw
Contributor

alexanderlaw commented Feb 19, 2025

Moreover, I could also get the following error:

2       2025-02-19 12:37:06.627 INFO [test_lfc_prefetch_2.py:48] Unused prefetches: 0
2       2025-02-19 12:37:06.920 INFO [test_lfc_prefetch_2.py:54] Unused prefetches: 0
2       2025-02-19 12:37:07.314 INFO [test_lfc_prefetch_2.py:60] Unused prefetches: 31
2       2025-02-19 12:37:07.450 INFO [test_lfc_prefetch_2.py:66] Unused prefetches: 0

2               # if prefetch requests are not stored in LFC, we continue to send unused prefetch requests to PS
2       >       assert prefetch_expired > 0
2       E       assert 0 > 0

Or with the complete plan:

    2025-02-19 12:49:45.844 INFO [test_lfc_prefetch_5.py:70] prefetch_expired: 0
...
    cur.execute(
        "explain (analyze,prefetch,format json) select sum(pk) from (select pk from t where sk between 100000 and 200000 limit 100) s4"
    )
    prefetch_expired = cr = cur.fetchall()[0][0][0]["Plan"]["Prefetch Expired Requests"]

    >       assert prefetch_expired > 0
    E       assert 0 > 0
{
   "Plan":{
      "Node Type":"Aggregate",
      "Strategy":"Plain",
      "Partial Mode":"Simple",
      "Parallel Aware":false,
      "Async Capable":false,
      "Startup Cost":373.24,
      "Total Cost":373.25,
      "Plan Rows":1,
      "Plan Width":8,
      "Actual Startup Time":9.122,
      "Actual Total Time":9.125,
      "Actual Rows":1,
      "Actual Loops":1,
      "Prefetch Hits":0,
      "Prefetch Misses":0,
      "Prefetch Expired Requests":0,
      "Prefetch Duplicated Requests":0,
      "Plans":[
         {
            "Node Type":"Limit",
            "Parent Relationship":"Outer",
            "Parallel Aware":false,
            "Async Capable":false,
            "Startup Cost":0.42,
            "Total Cost":372.99,
            "Plan Rows":100,
            "Plan Width":4,
            "Actual Startup Time":0.156,
            "Actual Total Time":9.054,
            "Actual Rows":100,
            "Actual Loops":1,
            "Prefetch Hits":0,
            "Prefetch Misses":0,
            "Prefetch Expired Requests":0,
            "Prefetch Duplicated Requests":0,
            "Plans":[
               {
                  "Node Type":"Index Scan",
                  "Parent Relationship":"Outer",
                  "Parallel Aware":false,
                  "Async Capable":false,
                  "Scan Direction":"Forward",
                  "Index Name":"t_sk_idx",
                  "Relation Name":"t",
                  "Alias":"t",
                  "Startup Cost":0.42,
                  "Total Cost":18628.42,
                  "Plan Rows":5000,
                  "Plan Width":4,
                  "Actual Startup Time":0.153,
                  "Actual Total Time":9.017,
                  "Actual Rows":100,
                  "Actual Loops":1,
                  "Index Cond":"((sk >= 100000) AND (sk <= 200000))",
                  "Rows Removed by Index Recheck":0,
                  "Prefetch Hits":0,
                  "Prefetch Misses":0,
                  "Prefetch Expired Requests":0,
                  "Prefetch Duplicated Requests":0
               }
            ]
         }
      ]
   },
   "Planning Time":0.235,
   "Triggers":[
      
   ],
   "Execution Time":9.21
}

Wouldn't it make sense to debug the new test before merging this PR?

@ololobus
Member

Wouldn't it make sense to debug the new test before merging this PR?

+1, thanks for looking at it. @knizhnik can you please debug test failures?

@knizhnik
Contributor Author

@alexanderlaw - I wonder how you are running this test in parallel? Are there N independent Neon instances?


If this PR added a GUC in the Postgres fork or neon extension,
please regenerate the Postgres settings in the cloud repo:

make NEON_WORKDIR=path/to/neon/checkout \
  -C goapp/internal/shareddomain/postgres generate

If you're an external contributor, a Neon employee will assist in
making sure this step is done.

@alexanderlaw
Contributor

Running this test in a loop with 5 instances in parallel, I can see the following numbers for the first 4 queries (with grep "Unused prefetches:") when it passes:

1       2025-02-19 17:06:30.244 INFO [test_lfc_prefetch_1.py:49] Unused prefetches: 0
1       2025-02-19 17:06:30.656 INFO [test_lfc_prefetch_1.py:55] Unused prefetches: 30
1       2025-02-19 17:06:30.803 INFO [test_lfc_prefetch_1.py:61] Unused prefetches: 51
1       2025-02-19 17:06:30.898 INFO [test_lfc_prefetch_1.py:68] Unused prefetches: 94
...

2       2025-02-19 17:06:28.909 INFO [test_lfc_prefetch_2.py:49] Unused prefetches: 0
2       2025-02-19 17:06:29.284 INFO [test_lfc_prefetch_2.py:55] Unused prefetches: 74
2       2025-02-19 17:06:29.735 INFO [test_lfc_prefetch_2.py:61] Unused prefetches: 90
2       2025-02-19 17:06:29.894 INFO [test_lfc_prefetch_2.py:68] Unused prefetches: 93
...

3       2025-02-19 17:06:28.121 INFO [test_lfc_prefetch_3.py:49] Unused prefetches: 0
3       2025-02-19 17:06:28.711 INFO [test_lfc_prefetch_3.py:55] Unused prefetches: 70
3       2025-02-19 17:06:29.132 INFO [test_lfc_prefetch_3.py:61] Unused prefetches: 97
3       2025-02-19 17:06:29.231 INFO [test_lfc_prefetch_3.py:68] Unused prefetches: 95
...

4       2025-02-19 17:06:28.472 INFO [test_lfc_prefetch_4.py:49] Unused prefetches: 0
4       2025-02-19 17:06:29.24  INFO [test_lfc_prefetch_4.py:55] Unused prefetches: 69
4       2025-02-19 17:06:29.369 INFO [test_lfc_prefetch_4.py:61] Unused prefetches: 97
4       2025-02-19 17:06:29.542 INFO [test_lfc_prefetch_4.py:68] Unused prefetches: 95
...

5       2025-02-19 17:06:29.483 INFO [test_lfc_prefetch_5.py:49] Unused prefetches: 0
5       2025-02-19 17:06:29.957 INFO [test_lfc_prefetch_5.py:55] Unused prefetches: 70
5       2025-02-19 17:06:30.498 INFO [test_lfc_prefetch_5.py:61] Unused prefetches: 75
5       2025-02-19 17:06:30.525 INFO [test_lfc_prefetch_5.py:68] Unused prefetches: 67
...

2       2025-02-19 17:33:38.28  INFO [test_lfc_prefetch_2.py:49] Unused prefetches: 0
2       2025-02-19 17:33:38.451 INFO [test_lfc_prefetch_2.py:55] Unused prefetches: 3
2       2025-02-19 17:33:39.208 INFO [test_lfc_prefetch_2.py:61] Unused prefetches: 98
2       2025-02-19 17:33:39.418 INFO [test_lfc_prefetch_2.py:68] Unused prefetches: 26

But when it fails, the numbers are:

2       2025-02-19 17:08:20.267 INFO [test_lfc_prefetch_2.py:49] Unused prefetches: 0
2       2025-02-19 17:08:20.866 INFO [test_lfc_prefetch_2.py:55] Unused prefetches: 0
2       2025-02-19 17:08:21.308 INFO [test_lfc_prefetch_2.py:61] Unused prefetches: 0
2       2025-02-19 17:08:21.328 INFO [test_lfc_prefetch_2.py:68] Unused prefetches: 0
...

5       2025-02-19 17:18:11.59  INFO [test_lfc_prefetch_5.py:49] Unused prefetches: 0
5       2025-02-19 17:18:11.198 INFO [test_lfc_prefetch_5.py:55] Unused prefetches: 0
5       2025-02-19 17:18:11.375 INFO [test_lfc_prefetch_5.py:61] Unused prefetches: 68
5       2025-02-19 17:18:11.379 INFO [test_lfc_prefetch_5.py:68] Unused prefetches: 0
...

3       2025-02-19 17:23:31.454 INFO [test_lfc_prefetch_3.py:49] Unused prefetches: 0
3       2025-02-19 17:23:31.693 INFO [test_lfc_prefetch_3.py:55] Unused prefetches: 14
3       2025-02-19 17:23:31.960 INFO [test_lfc_prefetch_3.py:61] Unused prefetches: 32
3       2025-02-19 17:23:31.967 INFO [test_lfc_prefetch_3.py:68] Unused prefetches: 0
...

@alexanderlaw
Contributor

alexanderlaw commented Feb 19, 2025

@alexanderlaw - I wonder how are you running this test in parallel? Are there N independent Neon instances?

I prefer to use something like:

sed 's/\(BASE_PORT: int = 15000\)/\1 + 1000 * int(os.environ.get("PYTEST_JOB_NUMBER", "0"))/' -i test_runner/fixtures/neon_fixtures.py

for i in `seq 20`; do cp test_runner/regress/test_lfc_prefetch.py test_runner/regress/test_lfc_prefetch_$i.py; sed "s/test_lfc_prefetch(/test_lfc_prefetch_$i(/;" -i test_runner/regress/test_lfc_prefetch_$i.py; done

for i in `seq 30`; do echo "ITERATION $i"; date; parallel -j5 --linebuffer --tag POETRY_VIRTUALENVS_PATH=virtualenvs DEFAULT_PG_VERSION=17 BUILD_TYPE=debug PYTEST_JOB_NUMBER={} poetry run pytest test_runner/regress/test_lfc_prefetch_{}.py::test_lfc_prefetch_{}[debug-pg17] --  ::: `seq 5` >test.log 2>&1 || break; grep -E '[0-9]+ (pass|fail)' test.log; done

So I have N independent pytest and neon instances.

But the same effect wrt this test can also be achieved with just:

@@ -10,10 +10,12 @@ from fixtures.utils import USE_LFC
 
 @pytest.mark.timeout(600)
 @pytest.mark.skipif(not USE_LFC, reason="LFC is disabled, skipping")
-def test_lfc_prefetch(neon_simple_env: NeonEnv):
+@pytest.mark.parametrize("n", range(5))
+def test_lfc_prefetch(neon_simple_env: NeonEnv, n: int):
     """
     Test resizing the Local File Cache
     """
+    _ = n

for i in `seq 30`; do echo "ITERATION $i"; date; POETRY_VIRTUALENVS_PATH=virtualenvs BUILD_TYPE=debug DEFAULT_PG_VERSION=17 poetry run pytest -n8 test_runner/regress/test_lfc_prefetch.py  >test.log 2>&1 || break; done

The above recipe produces for me:

ITERATION 2
...
=================================== FAILURES ===================================
_______________________ test_lfc_prefetch[debug-pg17-2] ________________________
...
        # if prefetch requests are not stored in LFC, we continue to send unused prefetch requests to PS
>       assert prefetch_expired > 0
E       assert 0 > 0

@alexanderlaw
Contributor

alexanderlaw commented Feb 20, 2025

Commit c0af808 makes the test stable:

1       2025-02-20 08:09:24.352 INFO [test_lfc_prefetch_1.py:51] Unused prefetches: 0
1       2025-02-20 08:09:24.711 INFO [test_lfc_prefetch_1.py:57] Unused prefetches: 31
1       2025-02-20 08:09:24.873 INFO [test_lfc_prefetch_1.py:63] Unused prefetches: 32
1       2025-02-20 08:09:24.912 INFO [test_lfc_prefetch_1.py:69] Unused prefetches: 71
1       2025-02-20 08:09:25.197 INFO [test_lfc_prefetch_1.py:80] Unused prefetches: 97
1       2025-02-20 08:09:25.527 INFO [test_lfc_prefetch_1.py:86] Unused prefetches: 95
1       2025-02-20 08:09:25.831 INFO [test_lfc_prefetch_1.py:92] Unused prefetches: 97
1       2025-02-20 08:09:25.838 INFO [test_lfc_prefetch_1.py:98] Unused prefetches: 0

2       2025-02-20 08:09:27.942 INFO [test_lfc_prefetch_2.py:51] Unused prefetches: 0
2       2025-02-20 08:09:28.206 INFO [test_lfc_prefetch_2.py:57] Unused prefetches: 31
2       2025-02-20 08:09:28.292 INFO [test_lfc_prefetch_2.py:63] Unused prefetches: 32
2       2025-02-20 08:09:28.338 INFO [test_lfc_prefetch_2.py:69] Unused prefetches: 71
2       2025-02-20 08:09:28.578 INFO [test_lfc_prefetch_2.py:80] Unused prefetches: 97
2       2025-02-20 08:09:28.708 INFO [test_lfc_prefetch_2.py:86] Unused prefetches: 95
2       2025-02-20 08:09:28.937 INFO [test_lfc_prefetch_2.py:92] Unused prefetches: 97
2       2025-02-20 08:09:28.940 INFO [test_lfc_prefetch_2.py:98] Unused prefetches: 0

3       2025-02-20 08:09:25.725 INFO [test_lfc_prefetch_3.py:51] Unused prefetches: 0
3       2025-02-20 08:09:25.984 INFO [test_lfc_prefetch_3.py:57] Unused prefetches: 31
3       2025-02-20 08:09:26.122 INFO [test_lfc_prefetch_3.py:63] Unused prefetches: 32
3       2025-02-20 08:09:26.143 INFO [test_lfc_prefetch_3.py:69] Unused prefetches: 71
3       2025-02-20 08:09:26.369 INFO [test_lfc_prefetch_3.py:80] Unused prefetches: 97
3       2025-02-20 08:09:26.528 INFO [test_lfc_prefetch_3.py:86] Unused prefetches: 95
3       2025-02-20 08:09:26.740 INFO [test_lfc_prefetch_3.py:92] Unused prefetches: 97
3       2025-02-20 08:09:26.746 INFO [test_lfc_prefetch_3.py:98] Unused prefetches: 0

Thank you!

@alexanderlaw
Contributor

I think I've discovered another anomaly related to this PR; investigating...

@alexanderlaw
Contributor

Please try the following modification for test_pg_regress:
pg_regress-with-LFC-resize-on-fly.patch.txt

It makes test_runner/regress/test_pg_regress.py::test_pg_regress[release-pg17-None] or [debug-...] fail roughly every second run with miscellaneous errors like:

Core was generated by `postgres: cloud_admin regression 127.0.0.1(37924) CREATE TABLE                '.
Program terminated with signal SIGABRT, Aborted.
#0  __pthread_kill_implementation (no_tid=0, signo=6, threadid=<optimized out>) at ./nptl/pthread_kill.c:44
#0  __pthread_kill_implementation (no_tid=0, signo=6, threadid=<optimized out>) at ./nptl/pthread_kill.c:44
#1  __pthread_kill_internal (signo=6, threadid=<optimized out>) at ./nptl/pthread_kill.c:78
#2  __GI___pthread_kill (threadid=<optimized out>, signo=signo@entry=6) at ./nptl/pthread_kill.c:89
#3  0x0000789b1f04527e in __GI_raise (sig=sig@entry=6) at ../sysdeps/posix/raise.c:26
#4  0x0000789b1f0288ff in __GI_abort () at ./stdlib/abort.c:79
#5  0x000057caecad0aaf in ExceptionalCondition (conditionName=conditionName@entry=0x57caecbdf1ed "false", fileName=fileName@entry=0x57caecb39f70 "/src/neon-t//vendor/postgres-v17/src/backend/access/heap/heapam.c", lineNumber=lineNumber@entry=8443) at /src/neon-t//vendor/postgres-v17/src/backend/utils/error/assert.c:66
#6  0x000057caec63270b in index_delete_sort_cmp (deltid2=<synthetic pointer>, deltid1=<optimized out>) at /src/neon-t//vendor/postgres-v17/src/backend/access/heap/heapam.c:8443
#7  index_delete_sort (delstate=<optimized out>) at /src/neon-t//vendor/postgres-v17/src/backend/access/heap/heapam.c:8484
#8  heap_index_delete_tuples (rel=0x789b1ef23918, delstate=<optimized out>) at /src/neon-t//vendor/postgres-v17/src/backend/access/heap/heapam.c:8141
#9  0x000057caec650671 in table_index_delete_tuples (delstate=0x7fff78cb2b20, rel=0x789b1ef23918) at /src/neon-t//vendor/postgres-v17/src/include/access/tableam.h:1359
#10 _bt_delitems_delete_check (rel=rel@entry=0x789b1eec9bc8, buf=buf@entry=106, heapRel=heapRel@entry=0x789b1ef23918, delstate=delstate@entry=0x7fff78cb2b20) at /src/neon-t//vendor/postgres-v17/src/backend/access/nbtree/nbtpage.c:1526
#11 0x000057caec649985 in _bt_simpledel_pass (maxoff=<optimized out>, minoff=<optimized out>, newitem=<optimized out>, ndeletable=<optimized out>, deletable=0x7fff78cb2b50, heapRel=0x789b1ef23918, buffer=<optimized out>, rel=0x789b1eec9bc8) at /src/neon-t//vendor/postgres-v17/src/backend/access/nbtree/nbtinsert.c:2936
#12 _bt_delete_or_dedup_one_page (rel=0x789b1eec9bc8, heapRel=0x789b1ef23918, insertstate=<optimized out>, simpleonly=false, checkingunique=<optimized out>, uniquedup=<optimized out>, indexUnchanged=false) at /src/neon-t//vendor/postgres-v17/src/backend/access/nbtree/nbtinsert.c:2747
#13 0x000057caec64dc92 in _bt_findinsertloc (heapRel=0x789b1ef23918, stack=0x57cb281f2610, indexUnchanged=false, checkingunique=true, insertstate=0x7fff78cb2f60, rel=0x789b1eec9bc8) at /src/neon-t//vendor/postgres-v17/src/backend/access/nbtree/nbtinsert.c:904
PG:2025-02-20 10:45:17.296 GMT [524697] ERROR:  invalid page in block 20 of relation base/16384/20957
PG:2025-02-20 10:45:17.296 GMT [524697] STATEMENT:  CREATE TABLE sch1.tbl1 (a int) PARTITION BY RANGE(a); 
PG:2025-02-20 10:45:17.296 GMT [524697] ERROR:  relation "sch1.tbl1" does not exist
PG:2025-02-20 10:45:17.296 GMT [524697] STATEMENT:  CREATE TABLE sch1.tbl1_part1 PARTITION OF sch1.tbl1 FOR VALUES FROM (1) to (10);
PG:2025-02-20 10:45:17.297 GMT [524697] ERROR:  relation "sch1.tbl1" does not exist
PG:2025-02-20 10:45:17.297 GMT [524697] STATEMENT:  CREATE TABLE sch1.tbl1_part2 PARTITION OF sch1.tbl1 FOR VALUES FROM (10) to (20);
PG:2025-02-20 10:45:17.314 GMT [524697] ERROR:  [NEON_SMGR] [shard 0, reqid 0] could not read block 20 in rel 1663/16384/20957.0 from page server at lsn 0/08F5C838
PG:2025-02-20 10:45:17.314 GMT [524697] DETAIL:  page server returned error: Read error
PG:2025-02-20 10:45:17.314 GMT [524697] STATEMENT:  CREATE TABLE sch1.tbl1_part3 (a int) PARTITION BY RANGE(a);
PG:2025-02-20 10:45:17.314 GMT [524697] ERROR:  relation "sch1.tbl1" does not exist
PG:2025-02-20 10:45:17.314 GMT [524697] STATEMENT:  ALTER TABLE sch1.tbl1 ATTACH PARTITION sch1.tbl1_part3 FOR VALUES FROM (20) to (30);
PG:2025-02-20 10:45:17.315 GMT [524697] ERROR:  invalid page in block 20 of relation base/16384/20957
TRAP: failed Assert("any_hits"), File: "/pgtest/neon-t//pgxn/neon/pagestore_smgr.c", Line: 1222, PID: 3389989
PG:2025-02-20 10:52:28.166 GMT [3389991:1] [unknown] LOG:  connection received: host=127.0.0.1 port=56408
postgres: autovacuum launcher (+0x1eb990e)[0x6414c70ec90e]
postgres: autovacuum launcher (ExceptionalCondition+0x108)[0x6414c8589938]
/pgtest/neon-t/pg_install/v17/lib/postgresql/neon.so(+0xa81fd)[0x79baa98a81fd]
/pgtest/neon-t/pg_install/v17/lib/postgresql/neon.so(+0xa8c7b)[0x79baa98a8c7b]
/pgtest/neon-t/pg_install/v17/lib/postgresql/neon.so(+0xab62d)[0x79baa98ab62d]
postgres: autovacuum launcher (WaitReadBuffers+0x15db)[0x6414c80004db]
postgres: cloud_admin postgres 127.0.0.1(56404) authentication(+0x1eb990e)[0x6414c70ec90e]
postgres: autovacuum launcher (read_stream_next_buffer+0xb7f)[0x6414c7fefa9f]
postgres: cloud_admin postgres 127.0.0.1(56404) authentication(ExceptionalCondition+0x108)[0x6414c8589938]
/pgtest/neon-t/pg_install/v17/lib/postgresql/neon.so(+0xa81fd)[0x79baa98a81fd]
/pgtest/neon-t/pg_install/v17/lib/postgresql/neon.so(+0xa8c7b)[0x79baa98a8c7b]
/pgtest/neon-t/pg_install/v17/lib/postgresql/neon.so(+0xab62d)[0x79baa98ab62d]

and so on...

@knizhnik
Contributor Author

knizhnik commented Feb 20, 2025

Please try the following modification for test_pg_regress:
@alexanderlaw can you try it now please?
The test still fails in my case, but the errors are in the run_confplay thread, which seems to be expected.

@alexanderlaw
Contributor

It fails for me too, but with a more serious error:

PG:2025-02-20 16:52:25.697 GMT [1244850] LOG:  parameter "neon.file_cache_size_limit" changed to "0MB"
PG:2025-02-20 16:52:25.697 GMT [1244858] LOG:  skipping missing configuration file ".../test_output/test_pg_regress[release-pg17-None]/repo/endpoints/ep-1/pgdata/compute_ctl_temp_override.conf"
...
PG:2025-02-20 16:52:25.697 GMT [1244855] LOG:  skipping missing configuration file ".../test_output/test_pg_regress[release-pg17-None]/repo/endpoints/ep-1/pgdata/compute_ctl_temp_override.conf"
PG:2025-02-20 16:52:25.697 GMT [1245124] PANIC:  LFC: assertion LFC_ENABLED() failed at ...//pgxn/neon/file_cache.c:1225:

I'm testing commit db32086.

@knizhnik
Contributor Author

One more fix...

@alexanderlaw
Contributor

Still no luck:

--- .../neon/vendor/postgres-v17/src/test/regress/expected/hash_index.out    2025-02-20 18:00:59.816761549 +0200
+++ .../neon/test_output/test_pg_regress[release-pg17-None]/regress/results/hash_index.out   2025-02-21 09:10:02.521711420 +0200
@@ -40,6 +40,8 @@
 CREATE INDEX hash_txt_index ON hash_txt_heap USING hash (random text_ops);
 CREATE INDEX hash_f8_index ON hash_f8_heap USING hash (random float8_ops)
   WITH (fillfactor=60);
+ERROR:  index "hash_f8_index" contains unexpected zero page at block 37
+HINT:  Please REINDEX it.
 --
 -- Also try building functional, expressional, and partial indexes on
 -- tables that already contain data.
@@ -265,6 +267,8 @@
 -- DELETE, INSERT, VACUUM.
 DELETE FROM hash_split_heap WHERE keycol = 1;
 INSERT INTO hash_split_heap SELECT a/2 FROM generate_series(1, 25000) a;
+ERROR:  index "hash_split_index" contains unexpected zero page at block 111
+HINT:  Please REINDEX it.
 VACUUM hash_split_heap;
 -- Rebuild the index using a different fillfactor
 ALTER INDEX hash_split_index SET (fillfactor = 10);
diff -U3 .../neon/vendor/postgres-v17/src/test/regress/expected/brin.out .../neon/test_output/test_pg_regress[release-pg17-None]/regress/results/brin.out
--- .../neon/vendor/postgres-v17/src/test/regress/expected/brin.out  2025-02-20 18:00:59.811761593 +0200
+++ .../neon/test_output/test_pg_regress[release-pg17-None]/regress/results/brin.out 2025-02-21 09:10:08.813623164 +0200
@@ -363,6 +363,10 @@
        END LOOP;
 END;
 $x$;
+ERROR:  [NEON_SMGR] [shard 0, reqid 0] could not read block 1 in rel 1663/16384/25472.0 from page server at lsn 0/0CD4D090
+DETAIL:  page server returned error: Read error
+CONTEXT:  SQL statement "SELECT array_agg(ctid) FROM brintest WHERE int2col < '1999'::int4 "
+PL/pgSQL function inline_code_block line 52 at EXECUTE
 RESET enable_seqscan;
 RESET enable_bitmapscan;
 INSERT INTO brintest SELECT
...

I wonder if that test survives, say, 10 iterations in your environment?

@knizhnik
Contributor Author

I wonder if that test survives, say, 10 iterations in your environment?

Will try.
And in the meantime I have committed another patch - can you check it please?

@knizhnik
Contributor Author

Sorry, found one more bug:(

@knizhnik added this pull request to the merge queue Feb 21, 2025
Merged via the queue into main with commit b1d8771 Feb 21, 2025
91 checks passed
@knizhnik deleted the prefetch_in_lfc branch February 21, 2025 16:58
@alexanderlaw
Contributor

As I understand, changes since 2025-02-13, roughly:

 pgxn/neon/file_cache.c                   | 136 ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++-----------------------------------------------
 test_runner/regress/test_lfc_prefetch.py |   3 +++

were not reviewed explicitly; wouldn't it make sense to review them now?

@skyzh
Member

skyzh commented Feb 24, 2025

The patch seems to regress several test_throughput benchmarks, but I didn't look into why for now :(

@skyzh
Member

skyzh commented Feb 24, 2025

The failed assertion:

    assert metrics.pageserver_batch_size_histo_sum == pytest.approx(
        metrics.compute_getpage_count, rel=0.01
    )

which means that compute is prefetching more pages than before, i.e., the pageserver receives more requests than the compute getpage count.

github-merge-queue bot pushed a commit that referenced this pull request Feb 25, 2025
## Problem

PR #10442 added the prefetch_lookup function.
It changed the handling of getpage requests in compute.

Before:
1. Lookup in LFC (return if found)
2. Register prefetch buffer
3. Wait prefetch result (increment getpage_hist)

Now:

1. Lookup prefetch ring (return if prefetch request is already completed)
2. Lookup in LFC (return if found)
3. Register prefetch buffer
4. Wait prefetch result (increment getpage_hist)

So if the prefetch result is already available, the getpage histogram is not
incremented.
This caused failures of some test_throughput benchmarks:
https://neondb.slack.com/archives/C033RQ5SPDH/p1740425527249499

## Summary of changes

Increment getpage histogram in `prefetch_lookup`

Co-authored-by: Konstantin Knizhnik <knizhnik@neon.tech>
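
A rough sketch of the fixed read order described in that commit message, for illustration only. All helper names and signatures here (prefetch_ring_lookup, lfc_read, prefetch_register_buffer, prefetch_wait, inc_getpage_hist, neon_read_page) are placeholders, not the actual pagestore_smgr.c/file_cache.c API; the point is that the histogram increment now lives inside prefetch_lookup, so a read satisfied by an already-completed prefetch is still counted as a getpage response.

/* Hypothetical sketch; helper names are placeholders, not the real neon code. */
#include "postgres.h"
#include "storage/buf_internals.h"

extern bool prefetch_ring_lookup(BufferTag *tag, char *buffer); /* placeholder */
extern bool lfc_read(BufferTag *tag, char *buffer);             /* placeholder */
extern void prefetch_register_buffer(BufferTag *tag);           /* placeholder */
extern void prefetch_wait(BufferTag *tag, char *buffer);        /* placeholder: also updates getpage histogram */
extern void inc_getpage_hist(void);                             /* placeholder */

/* Step 1 of the new order: consult the prefetch ring first.
 * The fix is the histogram increment here, so early returns are still
 * counted and compute/pageserver request metrics stay in agreement. */
static bool
prefetch_lookup(BufferTag *tag, char *buffer)
{
    if (!prefetch_ring_lookup(tag, buffer))
        return false;
    inc_getpage_hist();
    return true;
}

static void
neon_read_page(BufferTag *tag, char *buffer)
{
    if (prefetch_lookup(tag, buffer))      /* 1. already-completed prefetch */
        return;
    if (lfc_read(tag, buffer))             /* 2. local file cache */
        return;
    prefetch_register_buffer(tag);         /* 3. issue/attach to a prefetch request */
    prefetch_wait(tag, buffer);            /* 4. wait for the result */
}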