PD leader changed when injecting a network partition between the PD leader and one of the PD followers #9017

Open
Lily2025 opened this issue Jan 21, 2025 · 2 comments
Labels
type/bug The issue is confirmed as a bug.

Comments

Lily2025 commented Jan 21, 2025

Bug Report

What did you do?

1. Run TPC-C.
2. Inject a network partition between the PD leader and one of the PD followers.
   At 2025/01/09 01:40:45.416 +08:00, a network partition between tc-pd-2 and tc-pd-1 was injected, lasting 10 minutes.

What did you expect to see?

The PD leader should not change.
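
The expectation can be illustrated with a small majority check. This is only a sketch under assumptions that the report does not state outright: a three-member cluster named tc-pd-0, tc-pd-1 and tc-pd-2, with tc-pd-2 as the leader. The helpers `reachable_from` and `has_quorum` are hypothetical names used for illustration, not PD or etcd APIs.

```python
def reachable_from(node, members, partitions):
    """Return the members that `node` can still talk to, itself included."""
    cut = {frozenset(pair) for pair in partitions}
    return {m for m in members if m == node or frozenset((node, m)) not in cut}


def has_quorum(node, members, partitions):
    """True if `node` can reach a strict majority of the cluster."""
    majority = len(members) // 2 + 1
    return len(reachable_from(node, members, partitions)) >= majority


# Assumption: three PD members, tc-pd-2 being the leader.
MEMBERS = ["tc-pd-0", "tc-pd-1", "tc-pd-2"]
# The injected partition from the report: tc-pd-2 <-> tc-pd-1.
PARTITIONS = [("tc-pd-2", "tc-pd-1")]

if __name__ == "__main__":
    print(sorted(reachable_from("tc-pd-2", MEMBERS, PARTITIONS)))
    print(has_quorum("tc-pd-2", MEMBERS, PARTITIONS))
```

Under these assumptions tc-pd-2 still reaches itself and tc-pd-0, which is a majority of three, so a partition from a single follower alone would not be expected to cost it the leadership.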

What did you see instead?

The PD leader changed.

What version of PD are you using (pd-server -V)?

./pd-server -V
Release Version: v9.0.0-alpha-12-gadddd4e
Edition: Community
Git Commit Hash: adddd4e
Git Branch: HEAD
UTC Build Time: 2025-01-08 12:16:08
2025-01-09T01:35:36.393+0800

Lily2025 added the type/bug label Jan 21, 2025

Lily2025 (Author) commented

/assign rleungx

rleungx (Member) commented Jan 21, 2025

The leader steps down.
[2025/01/09 01:40:56.588 +08:00] [INFO] [server.go:1749] ["no longer a leader because lease has expired, pd leader will step down"]

PD then starts to stop all background jobs.
[2025/01/09 01:40:58.000 +08:00] [INFO] [cluster.go:2309] ["min resolved ts background jobs has been stopped"]

The follower triggers an etcd leader transfer.
[2025/01/09 01:41:39.967 +08:00] [INFO] [raft] [zap_raft.go:77] ["7dc34e2a14bda946 [term 4] starts to transfer leadership to 63dd36822e81aebb"]
[2025/01/09 01:41:39.967 +08:00] [INFO] [raft] [zap_raft.go:77] ["7dc34e2a14bda946 sends MsgTimeoutNow to 63dd36822e81aebb immediately as 63dd36822e81aebb already has up-to-date log"]
[2025/01/09 01:41:39.967 +08:00] [INFO] [raft] [zap_raft.go:77] ["7dc34e2a14bda946 [term: 4] received a MsgVote message with higher term from 63dd36822e81aebb [term: 5]"]
[2025/01/09 01:41:39.967 +08:00] [INFO] [raft] [zap_raft.go:77] ["7dc34e2a14bda946 became follower at term 5"]
[2025/01/09 01:41:39.967 +08:00] [INFO] [raft] [zap_raft.go:77] ["7dc34e2a14bda946 [logterm: 4, index: 8359, vote: 0] cast MsgVote for 63dd36822e81aebb [logterm: 4, index: 8359] at term 5"]
[2025/01/09 01:41:39.967 +08:00] [INFO] [raft] [zap_raft.go:77] ["raft.node: 7dc34e2a14bda946 lost leader 7dc34e2a14bda946 at term 5"]
[2025/01/09 01:41:39.968 +08:00] [INFO] [raft] [zap_raft.go:77] ["raft.node: 7dc34e2a14bda946 elected leader 63dd36822e81aebb at term 5"]
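
As a minimal sketch (not part of PD or etcd), the raft messages quoted above can be reduced to the leadership change they describe. The regular expressions and the `summarize` helper are hypothetical names introduced here. The message strings are copied from the log excerpt.

```python
import re

# Message bodies copied from the raft log lines above (timestamps omitted).
RAFT_MESSAGES = [
    "7dc34e2a14bda946 [term 4] starts to transfer leadership to 63dd36822e81aebb",
    "7dc34e2a14bda946 became follower at term 5",
    "raft.node: 7dc34e2a14bda946 lost leader 7dc34e2a14bda946 at term 5",
    "raft.node: 7dc34e2a14bda946 elected leader 63dd36822e81aebb at term 5",
]

LOST = re.compile(r"raft\.node: (\w+) lost leader (\w+) at term (\d+)")
ELECTED = re.compile(r"raft\.node: (\w+) elected leader (\w+) at term (\d+)")


def summarize(messages):
    """Extract the old leader, the new leader and the new term."""
    summary = {}
    for msg in messages:
        lost = LOST.search(msg)
        elected = ELECTED.search(msg)
        if lost:
            summary["old_leader"] = lost.group(2)
        elif elected:
            summary["new_leader"] = elected.group(2)
            summary["term"] = int(elected.group(3))
    return summary


SUMMARY = summarize(RAFT_MESSAGES)

if __name__ == "__main__":
    print(SUMMARY)
```

On these messages the summary shows etcd leadership moving from 7dc34e2a14bda946 to 63dd36822e81aebb at term 5.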

The region syncer is not stopped until 01:46:31.
[2025/01/09 01:46:31.331 +08:00] [ERROR] [server.go:353] ["region syncer send data meet error"] [error="[PD:grpc:ErrGRPCSend]send request error: rpc error: code = Unavailable desc = transport is closing"]
[2025/01/09 01:46:31.331 +08:00] [ERROR] [server.go:353] ["region syncer send data meet error"] [error="[PD:grpc:ErrGRPCSend]send request error: rpc error: code = Unavailable desc = transport is closing"]
[2025/01/09 01:46:31.331 +08:00] [INFO] [server.go:362] ["region syncer delete the stream"] [stream=tc-pd-1]
[2025/01/09 01:46:31.331 +08:00] [INFO] [server.go:362] ["region syncer delete the stream"] [stream=tc-pd-0]
[2025/01/09 01:46:31.331 +08:00] [INFO] [server.go:127] ["region syncer has been stopped"]
[2025/01/09 01:46:31.331 +08:00] [INFO] [cluster.go:888] ["raft cluster is stopped"]
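
To put the timestamps in this thread on one axis, the sketch below computes each event's offset from the moment the partition was injected. The timestamps are copied from the report and the log excerpts. The event labels and the `offsets_from_first` helper are illustrative names, not anything PD provides.

```python
from datetime import datetime

# Layout of the timestamps as they appear in the report and the logs.
TS_FORMAT = "%Y/%m/%d %H:%M:%S.%f %z"

EVENTS = [
    ("network partition injected", "2025/01/09 01:40:45.416 +08:00"),
    ("lease expired, PD leader steps down", "2025/01/09 01:40:56.588 +08:00"),
    ("min resolved ts background jobs stopped", "2025/01/09 01:40:58.000 +08:00"),
    ("etcd leadership transfer starts", "2025/01/09 01:41:39.967 +08:00"),
    ("region syncer and raft cluster stopped", "2025/01/09 01:46:31.331 +08:00"),
]


def offsets_from_first(events):
    """Return (label, seconds elapsed since the first event) for each event."""
    parsed = [(label, datetime.strptime(ts, TS_FORMAT)) for label, ts in events]
    start = parsed[0][1]
    return [
        (label, round((when - start).total_seconds(), 3))
        for label, when in parsed
    ]


TIMELINE = offsets_from_first(EVENTS)

if __name__ == "__main__":
    for label, seconds in TIMELINE:
        print(f"+{seconds:8.3f}s  {label}")
```

Read this way, the lease expires about 11 seconds after the partition starts, the etcd leadership transfer follows roughly 43 seconds later, and the region syncer is only stopped close to six minutes after the injection.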
