[RIP-44] Support DLedger Controller

### Background

RocketMQ 4.5.0 introduced DLedger mode (Raft). In this architecture, a Raft commitlog replaces the original commitlog so that a broker group can fail over. However, because replication relies on Raft, this architecture has some disadvantages:

1. To support failover, a broker group must have 3 or more replicas.

2. Acks from replicas must strictly follow the majority rule of the Raft protocol: a 3-replica group needs acks from 2 replicas before returning, and a 5-replica group needs acks from 3.

3. Because storage relies on OpenMessaging DLedger in DLedger mode, the native storage and replication capabilities of RocketMQ (such as transientStorePool and zero-copy) cannot be reused, and maintenance becomes harder.
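The arithmetic behind points 1 and 2 can be sketched as follows. This is only an illustration of the majority rule, not code from RocketMQ or DLedger; the class and method names (`MajorityQuorum`, `requiredAcks`, `toleratedFailures`) are hypothetical, and the ack count is assumed to include the leader's own write, as is usual in Raft.

```java
/** Illustrative only: majority-quorum arithmetic for a Raft-style replica group. */
class MajorityQuorum {

    /** Smallest number of replicas that forms a strict majority of the group. */
    static int requiredAcks(int replicas) {
        if (replicas < 1) {
            throw new IllegalArgumentException("replicas must be >= 1");
        }
        return replicas / 2 + 1;
    }

    /** Number of replicas that can be down while a majority is still reachable. */
    static int toleratedFailures(int replicas) {
        return replicas - requiredAcks(replicas);
    }

    public static void main(String[] args) {
        for (int replicas : new int[] {2, 3, 5}) {
            System.out.printf("replicas=%d requiredAcks=%d toleratedFailures=%d%n",
                    replicas, requiredAcks(replicas), toleratedFailures(replicas));
        }
    }
}
```

With 2 replicas the majority is 2, so no failure can be tolerated, which is why a Raft-based broker group needs at least 3 replicas to fail over.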

To address these problems, I would like to start RIP-44, Support DLedger Controller. With this improvement, the DLedger (Raft) capability is moved to an upper layer and becomes an optional, loosely coupled coordination component named DLedger Controller.

Once the DLedger Controller is deployed, the master-slave architecture also gains failover capability. The DLedger Controller can optionally be embedded into the NameServer (the NameServer itself remains stateless and cannot provide election capability when the majority is down), or it can be deployed independently.

The DLedger Controller is an optional component that does not change the existing operation and maintenance mode. Unlike other components, its downtime does not affect online services. In addition, RIP-44 unifies the storage and replication of RocketMQ, resulting in lower maintenance costs and faster development iterations. The existing master-slave architecture can be upgraded without compatibility problems.

I've already done the work with @RongtongJin. Our proposal is available at the links below:

https://docs.google.com/document/d/1tSJkor_3Js4NBaVA0UENGyM8Mh0SrRMXszRyI91hjJ8/edit?usp=sharing

Chinese version:

https://shimo.im/docs/N2A1Mz9QZltQZoAD/

### The following PRs are the main work:

- [x] Add state machine mode for DLedger: https://github.com/openmessaging/dledger/pull/128

- [x] Embed a strongly consistent controller based on DLedger in the NameServer: https://github.com/apache/rocketmq/pull/4195

- [x] Add a new HAService, AutoSwitchHAService, which uses a new log replication protocol to support role switching at the HAService level: https://github.com/apache/rocketmq/pull/4236

- [x] Connect the Broker to the DLedger Controller interface so that the Broker can perform master-slave switching:
https://github.com/apache/rocketmq/pull/4272

- [x] Add a learner role, which does not join inSyncStateSet and only replicates logs from the master asynchronously:
https://github.com/apache/rocketmq/pull/4367

### The following PRs are for optimization and adjustment

- [x] Make the controller independent of the NameServer so that it can be deployed on its own: https://github.com/apache/rocketmq/pull/4333
    - [x] Add an option {isControllerDeployedStandAlone} on the broker side so that the broker sends heartbeats to the controller: https://github.com/apache/rocketmq/pull/4341

- [x] Modify the definition of syncStateSet in AutoSwitchHAService and introduce the confirmOffset mechanism: https://github.com/apache/rocketmq/pull/4355

- [x] Add admin tools for controller mode (GetSyncStateSet and GetBrokerEpochCache): https://github.com/apache/rocketmq/pull/4388

- [x] Reuse the remotingServer in DLedger: https://github.com/apache/rocketmq/pull/4409

### Document

- [x] Documentation: https://github.com/apache/rocketmq/pull/4413

### Test design

- [x] OpenChaos test: https://shimo.im/docs/0l3NVLR2PdhwJy3R

### RIP42
https://shimo.im/docs/N2A1Mz9QZltQZoAD

