Do not download the partition manifest when creating learner #24664

Merged

Conversation

mmaslankaprv
Member

@mmaslankaprv mmaslankaprv commented Jan 2, 2025

When a learner replica is created, its state should not be seeded by the
recovery machinery; instead, it must be recovered by Raft.
Using remote recovery when creating learner replicas may lead to
archival_stm issues, as some commands are applied more than once.
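
The double-application problem described above can be illustrated with a toy model. This is only a sketch under stated assumptions, not Redpanda's implementation: `ToyArchivalStm`, `create_replica`, `recover_via_raft`, and the `skip_seed_for_learner` flag are hypothetical names, and the model assumes the downloaded manifest reflects the same commands that the Raft log will later replay.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class ToyArchivalStm:
    """Hypothetical stand-in for a state machine that records applied commands."""
    applied: List[str] = field(default_factory=list)

    def apply(self, command: str) -> None:
        self.applied.append(command)


def create_replica(is_learner: bool,
                   manifest: Optional[List[str]],
                   skip_seed_for_learner: bool) -> ToyArchivalStm:
    """Create a replica, optionally seeding its state from a manifest.

    With skip_seed_for_learner=True, a learner starts empty and relies
    on Raft recovery alone (the behaviour this change argues for).
    """
    stm = ToyArchivalStm()
    seed = manifest is not None and not (is_learner and skip_seed_for_learner)
    if seed:
        for command in manifest:
            stm.apply(command)
    return stm


def recover_via_raft(stm: ToyArchivalStm, leader_log: List[str]) -> None:
    """Replay every command from the leader's log onto the replica."""
    for command in leader_log:
        stm.apply(command)


# Assumption: the manifest mirrors the commands already present in the log.
leader_log = ["add_segment(0-99)", "add_segment(100-199)"]
manifest = list(leader_log)

# Learner seeded from the manifest and then recovered by Raft:
# every command ends up applied twice.
seeded = create_replica(True, manifest, skip_seed_for_learner=False)
recover_via_raft(seeded, leader_log)

# Learner that skips seeding: its state matches the leader's log exactly.
clean = create_replica(True, manifest, skip_seed_for_learner=True)
recover_via_raft(clean, leader_log)
```

In this model the seeded learner ends up with each command recorded twice, while the learner that skips seeding converges to exactly the leader's log.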

Backports Required

  • none - not a bug fix
  • none - this is a backport
  • none - issue does not exist in previous branches
  • none - papercut/not impactful enough to backport
  • v24.3.x
  • v24.2.x
  • v24.1.x

Release Notes

Bug Fixes

  • Fixes a bug which may lead to archival_metadata_stm inconsistencies when reconfiguring clusters with recovered compacted topics.

When a learner replica is created, its state should not be seeded by the
recovery machinery; instead, it must be recovered by Raft.
Using remote recovery when creating learner replicas may lead to
archival_stm issues, as some commands are applied more than once.

Signed-off-by: Michał Maślanka <michal@redpanda.com>
Added tests that trigger partition replica set reconfiguration on
topics that have been recovered.

Signed-off-by: Michał Maślanka <michal@redpanda.com>
@vbotbuildovich
Collaborator

Retry command for Build#60220

please wait until all jobs are finished before running the slash command

/ci-repeat 1
tests/rptest/tests/controller_log_limiting_test.py::ControllerLogLimitMirrorMakerTests.test_mirror_maker_with_limits

@vbotbuildovich
Collaborator

CI test results

test results on build#60220
| test_id | test_kind | job_url | test_status | passed |
| rptest.tests.controller_log_limiting_test.ControllerLogLimitMirrorMakerTests.test_mirror_maker_with_limits | ducktape | https://buildkite.com/redpanda/redpanda/builds/60220#019427a0-b097-4f9a-9deb-a69a4261cab9 | FAIL | 0/1 |
| rptest.tests.datalake.partition_movement_test.PartitionMovementTest.test_cross_core_movements.cloud_storage_type=CloudStorageType.S3 | ducktape | https://buildkite.com/redpanda/redpanda/builds/60220#019427a0-b09a-493a-b567-4b1b8f7d545b | FLAKY | 2/6 |

self.redpanda.start_node(self.redpanda.nodes[4])
self.wait_for_partitions_rebalanced(total_replicas=total_replicas,
                                    timeout_sec=self.rebalance_timeout)
self.redpanda.wait_for_manifest_uploads()
Contributor

should we wait for this before adding nodes? (well, I guess both cases are interesting)

@mmaslankaprv mmaslankaprv merged commit 5571870 into redpanda-data:dev Jan 3, 2025
19 of 22 checks passed
@vbotbuildovich
Collaborator

/backport v24.3.x

@vbotbuildovich
Collaborator

/backport v24.2.x

@vbotbuildovich
Collaborator

Failed to create a backport PR to v24.2.x branch. I tried:

git remote add upstream https://github.com/redpanda-data/redpanda.git
git fetch --all
git checkout -b backport-pr-24664-v24.2.x-613 remotes/upstream/v24.2.x
git cherry-pick -x 481b47b62e fff2b724a1

Workflow run logs.
