feat: Add CertificateChainSynchronizer and make follower aggregators start their chain by synchronising with their leader #2634

Merged: 28 commits merged into main on Jul 10, 2025

Conversation

Alenar
Collaborator

@Alenar Alenar commented Jul 8, 2025

Content

This PR adds a certificate chain synchronization mechanism that lets follower aggregators synchronize with their leader instead of creating a genesis certificate.

The synchronizer behaves as follows:

  1. It checks its force parameter, which lets the state machine force a synchronization based on a condition (today, it forces the sync if the follower chain is invalid). Otherwise, the sync is done if either:
    • there's no chain yet on the follower aggregator
    • the latest remote genesis certificate has changed (it differs from the stored latest genesis certificate)
  2. If any of the previous conditions is true, it proceeds by fetching the latest certificate from the remote/leader aggregator
  3. Using a CertificateVerifier wired to the remote aggregator, it fetches the remote certificate chain and validates it at the same time
    • Only the first certificate of each epoch is kept
  4. It stores the fetched certificates in the database using an insert or replace to keep the data up to date
  5. It finishes by creating a certified OpenMessage for MithrilStakeDistribution in the database so the aggregator won't create a certificate for this signed entity
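
The decision in step 1 and the filtering in step 3 can be sketched as follows. This is a minimal, synchronous illustration and not the code from this PR: the `Cert` struct, `should_sync`, and `first_certificate_of_each_epoch` are hypothetical names, and the chain is assumed to be ordered from the latest certificate back to the genesis certificate.

```rust
/// Hypothetical, simplified stand-in for a certificate (not the Mithril type).
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct Cert {
    pub hash: String,
    pub epoch: u64,
}

/// Step 1: decide whether a synchronization is needed.
/// `local_genesis` is the latest genesis certificate stored on the follower, if any.
pub fn should_sync(force: bool, local_genesis: Option<&Cert>, remote_genesis: &Cert) -> bool {
    if force {
        return true;
    }
    match local_genesis {
        // No chain yet on the follower aggregator.
        None => true,
        // The latest remote genesis certificate differs from the stored one.
        Some(local) => local.hash != remote_genesis.hash,
    }
}

/// Step 3 (filtering only): keep the first certificate of each epoch.
/// Assumption: `chain` is ordered from the latest certificate back to genesis,
/// so within an epoch the oldest certificate is the last one encountered.
pub fn first_certificate_of_each_epoch(chain: &[Cert]) -> Vec<Cert> {
    let mut kept: Vec<Cert> = Vec::new();
    for cert in chain {
        if kept.last().map(|c| c.epoch) == Some(cert.epoch) {
            // Same epoch as the previously kept certificate: the older one wins.
            let idx = kept.len() - 1;
            kept[idx] = cert.clone();
        } else {
            kept.push(cert.clone());
        }
    }
    kept
}
```

With a chain whose epochs are 3, 2, 2, 1, 1 (latest first), the filter keeps one certificate per epoch, namely the oldest of each.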

Main changes

  • Aggregator:
    • New service: CertificateChainSynchronizer
      • Defines several traits:
        • CertificateChainSynchronizer: main trait, defines the service API
        • RemoteCertificateRetriever: defines how a synchronizer gets the latest certificate of the remote (to use it as the starting point of the synchronization) and the latest genesis certificate of the remote (to check whether it has changed, which triggers a sync)
        • SynchronizedCertificateStorer: defines how retrieved certificates are stored and how to retrieve the latest stored genesis certificate of the aggregator
        • OpenMessageStorer: defines how the open message created at the end of the process is stored
      • Two implementations:
        • MithrilCertificateChainSynchroniserNoop: allows constructing a synchronizer in the leader aggregator dependency builder
        • MithrilCertificateChainSynchronizer: main implementation
    • state machine and runner: call the synchronizer when transitioning from idle to ready if the aggregator is a follower.
    • database, two new queries: InsertOrReplaceCertificateRecordQuery and InsertOrReplaceOpenMessageQuery
    • AggregatorHTTPClient: support three new queries, latest_certificates_list, certificates_details, and latest_genesis_certificate
    • create_certificate_follower integration tests: updated to check the synchronizer instead of spawning a genesis
      certificate on the follower (this simplified the scenario since one less epoch is needed).
  • End to end:
    • only the leader aggregator is started at infrastructure creation
    • the follower is started only after the completion of a "bootstrap" step on the leader that runs until the creation of the genesis certificate and the delegation of some stakes to the pools (since this was done only on the "leader")
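
As a rough illustration of the service split listed above, the sketch below shows a synchronizer trait with a no-op variant for the leader, and a runner step that only calls it on followers, forcing the sync when the follower chain is invalid. All names and signatures here (`ChainSynchronizer`, `NoopSynchronizer`, `RecordingSynchronizer`, `idle_to_ready`) are hypothetical and synchronous for brevity; the traits introduced by this PR may well differ (for instance by being async).

```rust
use std::cell::RefCell;

/// Hypothetical, synchronous simplification of a synchronizer service trait.
pub trait ChainSynchronizer {
    fn synchronize(&self, force: bool) -> Result<(), String>;
}

/// No-op variant: lets a leader be built with the same dependency shape
/// as a follower without ever synchronizing anything.
pub struct NoopSynchronizer;

impl ChainSynchronizer for NoopSynchronizer {
    fn synchronize(&self, _force: bool) -> Result<(), String> {
        Ok(())
    }
}

/// Test double standing in for the main implementation:
/// it only records the `force` flag of each call.
#[derive(Default)]
pub struct RecordingSynchronizer {
    pub forced_flags: RefCell<Vec<bool>>,
}

impl ChainSynchronizer for RecordingSynchronizer {
    fn synchronize(&self, force: bool) -> Result<(), String> {
        self.forced_flags.borrow_mut().push(force);
        Ok(())
    }
}

/// Hypothetical idle -> ready transition step: followers synchronize,
/// and the sync is forced when their local chain failed validation.
pub fn idle_to_ready(
    synchronizer: &dyn ChainSynchronizer,
    is_follower: bool,
    chain_is_valid: bool,
) -> Result<(), String> {
    if is_follower {
        synchronizer.synchronize(!chain_is_valid)?;
    }
    Ok(())
}
```

Passing the synchronizer as a trait object keeps the transition logic identical for both roles; only the injected implementation changes.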

Additional changes

  • fix(aggregator): giving an aggregator_url with a trailing slash to AggregatorHTTPClient would make all its queries fail
  • refactor(aggregator): Follow the DI pattern for the relation between MithrilSignerRegistrationFollower and AggregatorHTTPClient by moving and renaming the AggregatorClient trait to its API-defined traits.
  • refactor(aggregator): Store the AggregatorHTTPClient directly as its concrete type in the dependencies builder, instead of storing it multiple times, once for each of the traits it implements.
  • refactor(aggregator-integration tests): replace TestHttpServer with axum-test as the backend for the leader aggregator http server.
  • refactor(aggregator-integration tests): extract LeaderAggregatorHttpServer to a dedicated module
  • feat(common-CertificateChainBuilder): add latest_certificate to the fixture
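
The trailing-slash fix listed first above can be illustrated with a small sketch. This is not the code from the PR: `join_route` is a hypothetical helper that uses plain string handling, whereas the actual client may normalize the URL differently.

```rust
/// Hypothetical helper: join a base aggregator URL and a route with exactly
/// one slash between them, whether or not the base ends with a slash.
pub fn join_route(aggregator_url: &str, route: &str) -> String {
    format!(
        "{}/{}",
        aggregator_url.trim_end_matches('/'),
        route.trim_start_matches('/')
    )
}
```

With this normalization, `http://aggregator/` and `http://aggregator` both produce `http://aggregator/route` rather than `http://aggregator//route`.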

Pre-submit checklist

  • Branch
    • Tests are provided (if possible)
    • Crates versions are updated (if relevant)
    • CHANGELOG file is updated (if relevant)
    • Commit sequence broadly makes sense
    • Key commits have useful messages
  • PR
    • All check jobs of the CI have succeeded
    • Self-reviewed the diff
    • Useful pull request description
    • Reviewer requested
  • Documentation
    • No new TODOs introduced

Issue(s)

Relates to #2534

@Alenar Alenar self-assigned this Jul 8, 2025
@Alenar Alenar added the feature 🚀 New implemented feature label Jul 8, 2025
@Copilot Copilot AI left a comment

Pull Request Overview

This PR implements a certificate chain synchronization mechanism so that follower aggregators can sync their chain from a leader rather than creating a genesis certificate.

  • Introduces a CertificateChainSynchronizer service (and a no-op variant for leaders)
  • Refactors infrastructure startup to prepare and register leader/follower aggregators and serve them in sequence
  • Updates the runtime state machine to validate and (if needed) synchronize the follower’s certificate chain before proceeding

Reviewed Changes

Copilot reviewed 40 out of 41 changed files in this pull request and generated 2 comments.

| File | Description |
| --- | --- |
| mithril-test-lab/mithril-end-to-end/src/run_only.rs | Simplify run flow to bootstrap the leader then serve followers |
| mithril-test-lab/mithril-end-to-end/src/mithril/infrastructure.rs | Refactor aggregator startup to prepare leader/follower and register era |
| mithril-aggregator/src/services/certificate_chain_synchroniser/synchroniser_service.rs | Add certificate chain synchronization logic |
| mithril-aggregator/src/services/aggregator_client.rs | Extend HTTP client to fetch certificate lists and details |
| mithril-aggregator/src/runtime/state_machine.rs | Validate and synchronize follower certificate chain in the runner |

Comments suppressed due to low confidence (1)

mithril-test-lab/mithril-end-to-end/src/mithril/infrastructure.rs:318

  • The variable leader_aggregator_endpoint is not defined in this scope; you probably meant to use aggregator_endpoints[0] or introduce leader_aggregator_endpoint before this line.
                aggregator_endpoint: &leader_aggregator_endpoint,

github-actions bot commented Jul 8, 2025

Test Results

    4 files  ± 0    154 suites  ±0   22m 53s ⏱️ + 1m 44s
2 118 tests +39  2 118 ✅ +39  0 💤 ±0  0 ❌ ±0 
6 466 runs  +80  6 466 ✅ +80  0 💤 ±0  0 ❌ ±0 

Results for commit 11c8aec. ± Comparison against base commit b7a272c.

♻️ This comment has been updated with latest results.

@Alenar Alenar force-pushed the djo/2534/aggregator-sync-master_follower branch from 72b7d34 to a46e42b Compare July 8, 2025 15:57
Alenar added 21 commits July 8, 2025 17:57
If the leader aggregator URL was specified with a trailing slash,
requests to it would fail because the joined URL would have two slashes,
i.e. `http://aggregator//route`.

This aligns with the behavior of the aggregator client of the
mithril-client lib.
… signer registration follower

several benefits:
- cleaner cut between the signer registration business layer and its
  infrastructure
- allows using the concrete `AggregatorHTTPClient` type in dependency
  injection, so that multiple traits can be implemented on it while it is
  still passed to the multiple dependencies that use those traits
…erver

Instead of the warp-based `mithril-test-http-server`, as this allows writing
simpler, more readable code.
Nothing is stored yet; this will come afterward.
This allows implementors to use transactions if needed.
This is critical since rows are returned from the database in reverse
insertion order.
Alenar added 3 commits July 8, 2025 17:57
… entity only if not synchronized

As synchronized certificates are created without a signed entity;
previously this was only the case for genesis certificates.
@Alenar Alenar force-pushed the djo/2534/aggregator-sync-master_follower branch from a46e42b to 9bebed2 Compare July 8, 2025 15:57
@Alenar Alenar temporarily deployed to testing-preview July 8, 2025 16:32 — with GitHub Actions Inactive
Member

@jpraynaud jpraynaud left a comment

LGTM 👍

I left some minor comments.

Collaborator

@dlachaume dlachaume left a comment

LGTM 🚀

Collaborator

@turmelclem turmelclem left a comment

LGTM

Alenar added 2 commits July 9, 2025 18:07
- fix: the synchronizer creates the open messages based on the first
  certificate of the last epoch instead of the latest certificate
- move `MockCertificateVerifier` definition to `tools::mocks`
- fix spelling issues
… after sync

This ensures that the synchronizer creates an open message for MSD and
that the state machine then proceeds to the next signed entity type.
@Alenar Alenar force-pushed the djo/2534/aggregator-sync-master_follower branch from e05637f to d349f0b Compare July 9, 2025 16:08
@Alenar Alenar temporarily deployed to testing-preview July 10, 2025 07:50 — with GitHub Actions Inactive
Alenar added 2 commits July 10, 2025 10:15
* mithril-aggregator from `0.7.72` to `0.7.73`
* mithril-common from `0.6.7` to `0.6.8`
* mithril-end-to-end from `0.4.95` to `0.4.96`
@Alenar Alenar force-pushed the djo/2534/aggregator-sync-master_follower branch from d349f0b to 11c8aec Compare July 10, 2025 08:15
@Alenar Alenar temporarily deployed to testing-preview July 10, 2025 08:29 — with GitHub Actions Inactive
@Alenar Alenar merged commit e52b52a into main Jul 10, 2025
70 of 71 checks passed
@Alenar Alenar deleted the djo/2534/aggregator-sync-master_follower branch July 10, 2025 08:31
Labels
feature 🚀 New implemented feature
4 participants