Improving Aztec Block Production

TL;DR

This article summarizes the takeaways from a recent Nethermind exploration into the future of Aztec’s L2 preconfirmation layer. We outline some limitations in the current designs and consider two viable solution spaces for addressing them, one without consensus and one with. We conclude by laying out some of the open questions that we are thinking about. Thanks to Lin and Stefano for their insightful reviews.

The State of Affairs (Current Protocol Direction)

Right now, the proposer schedule is tied to L1 slots. If a proposer misses their window to publish their block to L1, that L2 slot is considered missed and the succeeding proposer builds off the last L2 block to be published to L1.

Protocol roles

  • Proposer: elected pseudorandomly from the set of validators, assigned a unique L2 slot in which they have exclusive rights to propose a block.

  • Attestor/Attestor committee: elected pseudorandomly from the set of validators, required to attest to the validity and DA of L2 blocks.

Block-production designs under consideration

Two related non-exclusive designs are being considered:

  • **Checkpoints / Building-in-Chunks**: An Aztec proposer can build multiple L2 (sub-)blocks per L2 slot, receive attestations for the final block, and publish all of these blocks together to L1 in what will be known as a checkpoint on Aztec.

  • Building Ahead (post-alpha consideration): A proposer’s proposal slot, wherein attestors will attest to a proposed block, is earlier than the corresponding slot for publishing the block to L1. If a proposer for slot N proposes and receives attestations for a block, the proposer for slot N+1 is expected (potentially required) to propose a block building on proposer N’s block. This has two potential outcomes:

    • Optimistic — Proposer N’s block is published to L1: In this scenario, proposer N+1 should receive attestations for their own block and have their corresponding L1-publication slot to publish to L1.

    • Pessimistic — Proposer N fails to publish their block to L1: In this case, proposer N+1 would propose and receive attestations for their block building on proposer N’s block. However, proposer N+1 would be unable to publish their block to L1, since the L1 contract requires proposer N’s block to have been published by proposer N during their publication slot.
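The two Building Ahead outcomes can be sketched as a single publication check, under the assumption (from the pessimistic case above) that the L1 contract only accepts a block whose parent is already the L1 tip. All names here are illustrative, not Aztec APIs:

```python
# Hypothetical sketch of the Building Ahead outcomes. The Block type and
# can_publish rule are assumptions for illustration, not the Aztec contract.
from dataclasses import dataclass

@dataclass
class Block:
    slot: int
    parent_slot: int   # the L2 slot of the block this proposal builds on
    attested: bool = False

def can_publish(block: Block, l1_tip_slot: int) -> bool:
    """Proposer N+1 can only publish if proposer N's block landed on L1:
    the contract requires the parent to already be the L1 chain tip."""
    return block.attested and block.parent_slot == l1_tip_slot

block_n1 = Block(slot=11, parent_slot=10, attested=True)

# Optimistic: block N (slot 10) was published, so N+1's publication succeeds.
assert can_publish(block_n1, l1_tip_slot=10)

# Pessimistic: block N never reached L1 (tip is still slot 9), so N+1's
# attested block cannot be published, despite having attestations.
assert not can_publish(block_n1, l1_tip_slot=9)
```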

Properties we want from Aztec’s Preconfirmation layer

The following is a non-exhaustive list of properties that we believe Aztec needs from any L2 preconfirmation (before confirmation on L1) layer:

  • Rotating permissionless proposers to avoid permissioned points-of-failure — particularly important for privacy-focused chains where traditional centralized entities sequencing the chain may be unwanted points of failure. Unfortunately, this utilization of permissionless proposers from day 1 removes a common training wheel for new ZK rollups. These proposers, left to their own devices, would have little to no accountability for what gets proposed. As such, we need an additional line of defense to protect against an attacker who identifies an exploit to prove an invalid block. In Aztec, this defense is a committee of attestors who must attest to the validity of blocks before they can be published and proven on L1. This anchors our design space to L2 preconfirmation protocols that utilize a committee attesting to the validity and data availability of L2 blocks — training wheels for a rollup utilizing a novel proof system.

  • The ability to optimize L2 transaction batch posting to L1 — avoiding race conditions, minimizing costs. This property has levels. At the extreme, we have an L2 where the batch posters are trusted to always eventually post batches to L1, but without a strict time constraint enforced on L1. Building Ahead in the current Aztec design pipeline somewhat decouples proposing from batch posting — proposers propose blocks before the window for publishing them to L1 opens. However, there is still a tight time-window constraint on each proposer, which limits batch-posting optimizations.

  • Shorter L2 slot times — the permissionless proposers on which Aztec is planning to depend are inevitably going to try to extract maximum value from the blocks they propose. The longer a single proposer has a monopoly on proposing, the worse it is for the chain. As it stands, Aztec allocates 72 seconds to each L2 proposer to give them time to post blocks to L1. If we can fully decouple L2 proposing from batch posting, we can reduce proposal windows to the actual time it takes to build a block, propose it, propagate it to the attestor committee, and collect attestations (closer to 12 seconds).

  • Minimal proposer downtime between slots, e.g. proposers not waiting for the previous proposer’s block to land on L1 before building their own block — the optimistic path for Building Ahead is a step towards this, where proposers can build on previous proposals before those blocks appear on L1.

  • Reliable L2 preconfirmations for applications before blocks arrive on L1. As with the batch-posting-optimization property, the ideal case here is an L2 preconfirmation layer where all L2 preconfirmations are trusted to eventually arrive on L1, be proven, and subsequently finalize.

All of this points to:

  1. L2 proposer cooperation — we want proposers in each slot to append to a single chain.

  2. L2 batch posting cooperation across extended periods of time — everyone trusts batches will be published to L1, ideally without race conditions between batch posters, as this would drive up costs.

  3. L2 proposer rotation decoupled from batch posting.

Protocols Satisfying Aztec’s Desired Properties

In this section, we explore two protocol classes that Aztec could implement that stand to satisfy the desired properties of the previous section. We first introduce a novel Attester-Enforced Preconfirmation Chaining protocol which avoids using consensus, and discuss how in theory it would satisfy each of these properties. We then discuss consensus as a viable protocol path to satisfying each of these properties.

Attester-Enforced Preconfirmation Chaining (AEPC)

The protocol creates a sliding window of enforceable proposer commitments where future proposers “lock in” to past preconfirmed L2 chains before the L1 actually finalizes them. This protocol achieves decoupling of L2 Block Proposals (fast, pipelined) from L1 Batch Posting (slow, expensive) — enabling strong degrees of the properties we set out to satisfy.

Note: We are still considering key details around incentives and slashing. The description in its current form is meant to demonstrate that consensus-free improvements to Aztec’s existing block production mechanisms are possible, and that there is scope for these protocols to compete with the guarantees of consensus protocols, with appropriate incentives in place. That said, L2 consensus protocols among permissionless proposers have many open questions themselves, which makes this introduction of AEPC a valuable contribution in its current form.

Protocol Roles

  • proposer: as before, elected pseudorandomly from the set of validators, assigned a unique L2 slot in which they have exclusive rights to propose a block.

  • attestor/attestor committee: as before, elected pseudorandomly from the set of validators, required to attest to the validity and DA of L2 blocks. In AEPC, attestors will also need to track proposer commitments to preconfirmed chains.

  • batch posters: entities responsible for posting L2 batches to L1.

Protocol Description

The core protocol functions much the same as the existing Aztec protocol where L2 proposers propose blocks to attestors, with the attestors attesting to the validity and DA of the L2 block, allowing the block to be accepted to L1. However, we give proposers and attestors several additional actions and validity checks to enhance the core protocol, described below.

0. Contract Changes

Attested L2 blocks can be published to L1 at any time after the L2 slot has passed — likely within some liveness bound, although such a bound is not clearly necessary, given we rotate attestors and committees as per the core protocol.
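This relaxed publication rule can be sketched as a small validity check. The liveness bound here is a made-up parameter to illustrate the optional constraint, not part of any Aztec spec:

```python
# Illustrative check for the relaxed publication rule: an attested block may
# land on L1 any time after its L2 slot has passed, optionally within a
# liveness bound. LIVENESS_BOUND_SLOTS is an assumed parameter.
LIVENESS_BOUND_SLOTS = 100

def publication_allowed(block_slot: int, current_slot: int,
                        enforce_bound: bool = False) -> bool:
    if current_slot <= block_slot:
        return False  # the block's L2 slot has not passed yet
    if enforce_bound and current_slot - block_slot > LIVENESS_BOUND_SLOTS:
        return False  # stale beyond the optional liveness bound
    return True

assert publication_allowed(block_slot=5, current_slot=6)    # immediately after
assert publication_allowed(block_slot=5, current_slot=50)   # much later: still fine
assert not publication_allowed(block_slot=5, current_slot=200, enforce_bound=True)
```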

1. The Signal (Commitment)

At the end of a particular slot N-1, proposer N (and potentially proposer N+1, N+2, …) must publish to the L2 P2P messaging layer a cryptographic signal indicating exactly which previous L2 block they intend to build upon — whether that’s proposer N-1’s block, or some other previously attested/confirmed block.

  • Constraint: This signal is binding. Proposers must follow the preconfirmed chain that they signal for.

  • Enforcement: The attestor committee will reject any block from proposer N that does not match their broadcasted signal(s) from previous slot(s).
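The signal and its attestor-side enforcement can be sketched as follows. The message format and the tracking structure are assumptions for illustration; the protocol leaves these details open:

```python
# Sketch of the binding signal (step 1) and the attestor-side check.
# The Signal format and in-memory tracking are illustrative assumptions.
from dataclasses import dataclass

@dataclass(frozen=True)
class Signal:
    proposer_slot: int   # the slot this proposer will propose in
    target_block: str    # hash of the previously attested block they commit to

signals: dict[int, Signal] = {}  # attestors track one signal per proposer slot

def record_signal(sig: Signal) -> None:
    signals[sig.proposer_slot] = sig

def accept_proposal(proposer_slot: int, parent_hash: str) -> bool:
    """Attestors reject any proposal that contradicts the broadcast signal."""
    sig = signals.get(proposer_slot)
    if sig is None:
        return False                      # no signal: refuse to attest
    return sig.target_block == parent_hash

record_signal(Signal(proposer_slot=42, target_block="0xabc"))
assert accept_proposal(42, "0xabc")       # matches the commitment
assert not accept_proposal(42, "0xdef")   # contradicts the signal: reject
assert not accept_proposal(43, "0xabc")   # never signalled: reject
```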

2. The Optimistic Flow (Extension of Time)

  • Action: Proposer N signals they will build on proposer N-1’s chain.

  • Result: Batch posters are granted an additional L2 slot’s worth of time to post proposer N-1’s preconfirmed chain to L1. They do not need to post immediately because, through the signal, the next leader has already guaranteed they won’t be allowed to fork the previous block out.

  • Benefit: Allows for batch aggregation and creates a “pipeline” where proposer N builds on N-1’s preconfirmed chain while the data is being published to L1.

3. The Pessimistic Flow (The “Force Publish Mode”)

  • Action:

    • Reorg signal: Proposer N signals a reorg (i.e., they signal for Block N-2 instead of N-1)

    • No signal: Proposer N fails to signal at all.

  • Reaction:

    • Reorg signal: The protocol should trigger a “Force Publish Mode” — maximum urgency to publish unpublished blocks from the preconfirmed L2 chain to L1. The batch posters immediately stop optimizing and race to publish the honest chain (up to N-1) to L1 — who the batch posters are and how they are incentivized to race would be a key design question for this protocol.

    • No signal: Attesters will refuse to attest to any block from proposer N.

  • Proposer N’s Gamble (Reorg signalled): Proposer N is now in a race. If N-1’s batch lands on L1 before N’s publication slot begins, N’s reorg attempt is invalidated and they lose their slot reward.
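The three reactions above (optimistic flow, reorg signal, no signal) reduce to one classification of proposer N’s signal. A minimal sketch, with names and return values chosen for illustration:

```python
# Sketch of how an attestor or batch-poster node might classify proposer N's
# signal, per the optimistic/pessimistic flows above. Purely illustrative.
from typing import Optional

def classify_signal(signal_target: Optional[str], expected_parent: str) -> str:
    if signal_target is None:
        return "refuse_attestation"   # no signal: ignore proposer N's block
    if signal_target != expected_parent:
        return "force_publish"        # reorg signal: race N-1's chain to L1
    return "optimistic"               # cooperative: extend the posting window

# Proposer N signals for N-1's block: batch posters keep optimizing.
assert classify_signal("0xblockN1", expected_parent="0xblockN1") == "optimistic"
# Proposer N signals for N-2's block instead: Force Publish Mode.
assert classify_signal("0xblockN2", expected_parent="0xblockN1") == "force_publish"
# Proposer N stays silent: attestors refuse to attest to their block.
assert classify_signal(None, expected_parent="0xblockN1") == "refuse_attestation"
```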

Strategy Outcome Summaries

  • Cooperation: If everyone signals cooperatively, block proposal decouples from block posting to L1, meaning posting costs go down, slots can rotate faster, and L2 preconfs become reliable over extended periods of time.

  • Adversarial: If someone tries to reorg, this triggers a “Force Publish Mode” that should force the preconfirmed chain being reorged to be published to L1, likely causing the reorging attempt to fail and the reorging proposer to lose out on dollar bills.

Cheeky Insight: The more slots ahead we require proposers to signal which preconfirmed chain they will follow, the shorter reorgs can be, e.g. if proposers [N+1, …, N+10] signal for proposer N’s block, the system knows that there are at least 10 L2 slots to publish block N to L1. However, the further ahead we require proposers to signal, the more dependent we become on synchrony, e.g. if we require all proposers to signal 10 slots ahead, and there is network jitter in slot N preventing signals from arriving at the attestors, the above protocol says attestors should ignore proposals from all proposers [N+1, …, N+10] — 10 slots of network downtime, not so good.

Concluding Thoughts on AEPC

Although AEPC avoids consensus, it has many open questions that are less explored than those of consensus protocols. That being said, the relative simplicity of the protocol and its attack vectors means AEPC, or variations of it, can almost certainly be properly specced and analysed before consensus gets close to implementation on mainnet. If this design is appealing, or you have some thoughts on how it can be improved/analysed, please reach out. Regardless, the problems of batch-poster incentives and coordination are common to any protocol looking to optimize L2 transaction batch posting to L1, so time spent on these shared problems is time well spent.

Consensus

Conceptually, consensus is not too big of a leap from the other committee-based designs from earlier in the document. Consensus favours safety and liveness over simplicity, which is not always a clear cut decision. For a permissionless network like Aztec, the competitiveness and repeated tethering to L1 that AEPC brings might be preferred over long-running L2 preconfirmed chains through consensus.

As for the desired properties, consensus achieves strong degrees of each of the properties we introduced at the start. For Aztec, dedicated off-L1 consensus could mean the protocol can wait hours between batch submissions to L1 before timeouts. To fully understand the implications and limits of what consensus can achieve, and the relative complexity of consensus vs alternatives, we at Nethermind will continue our explorations over the coming weeks and months.

Coupled with a batch submission coordination protocol, a consensus protocol seems to scratch all of the itches that the previous committee-based protocols flared up around “batch submission” and “off-chain block confirmation strength”. However, like AEPC, batch posting, batch poster coordination and incentives are all open questions — questions that generally remain to be answered for permissionless proposer L2s.

How to Implement Consensus for Aztec

A fully permissionless L2 consensus-based preconfirmation protocol has not been achieved in production, to the best of our knowledge. Doing so for the first time with Aztec will be a complex multi-month challenge.

That being said, Aztec’s starting point of permissionless proposing with a committee-attestation requirement means the challenge is much more manageable than the one that would be faced by a centralized-sequencer L2 considering a pivot to permissionless consensus.

Aside: When applications become dependent on sequencer trust, removing this trust could mean multi-million dollar exploits for bridges and applications that assume no reorgs, specific sequencing rules, or short-term bribery-resistant censorship resistance.

Nethermind will continue to work closely with Aztec on the question of if and how consensus should be implemented. We will provide more details on specific consensus designs tailored to the Aztec protocol in subsequent posts.

Other Unsolved Mysteries

Optimised L2 batch submission to L1

One of the key unsolved questions for permissionless proposer rollups like Aztec is how to coordinate the publication of L2 transaction data to L1. For Aztec, this is particularly important given its expectation to be an L1 DA guzzler. Even small optimizations to the L1-posting schedule can mean significant cost savings for batch posters, and in turn, for users.

For example, in the current Aztec protocol, if L2 proposers started creating multiple blobs per L2 slot (an L2 slot currently being 6 L1 slots), this in isolation would create a cyclic pattern where the blob basefee would likely spike after an Aztec batch arrives on L1. As such, other more nimble rollup data posters (e.g. centralized sequencer rollups, who can theoretically wait hours before posting to L1, as opposed to the tens of seconds that an Aztec proposer has) would always try to front-run the Aztec proposers to avoid the subsequent spike that Aztec would create with multiple blobs per batch submission.
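The spike dynamic comes from EIP-4844’s exponential blob pricing: every block that uses more than the target amount of blob gas accumulates “excess blob gas”, which feeds an exponential fee formula. A sketch using the parameters as at EIP-4844’s launch (target 3 blobs per block, 6 max, 131072 blob gas per blob — later EIPs adjust these numbers):

```python
# EIP-4844 blob basefee mechanics (launch parameters; later forks raise targets).
TARGET_BLOB_GAS_PER_BLOCK = 393_216        # 3 blobs * 131072 gas
MIN_BLOB_BASE_FEE = 1                      # wei
BLOB_BASE_FEE_UPDATE_FRACTION = 3_338_477

def fake_exponential(factor: int, numerator: int, denominator: int) -> int:
    """Integer approximation of factor * e^(numerator/denominator), per EIP-4844."""
    i, output, accum = 1, 0, factor * denominator
    while accum > 0:
        output += accum
        accum = accum * numerator // (denominator * i)
        i += 1
    return output // denominator

def next_excess(parent_excess: int, blob_gas_used: int) -> int:
    """Excess blob gas grows whenever usage exceeds the per-block target."""
    return max(parent_excess + blob_gas_used - TARGET_BLOB_GAS_PER_BLOCK, 0)

# Ten consecutive max-full (6-blob) blocks accumulate excess blob gas...
excess = 0
for _ in range(10):
    excess = next_excess(excess, 6 * 131_072)

# ...roughly tripling the blob basefee: the spike other posters would dodge.
fee = fake_exponential(MIN_BLOB_BASE_FEE, excess, BLOB_BASE_FEE_UPDATE_FRACTION)
assert excess == 3_932_160 and fee == 3
```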

A highly restrictive batch submission requirement is one of the main motivators for some sort of off-L1 L2 preconfirmation protocol over multiple L2 slots. With longer times between batch submissions, Aztec batch posters can avoid near-deterministic and exploitable batch-submission strategies, and gain the ability to wait out short-term gas spikes — without missing their batch-submission slot.

A concrete and optimised batch-submission protocol will be needed in any L2 preconfirmation protocol, and will be one of the focuses of the team seeking to implement a robust L2 preconfirmation protocol for Aztec.

If the committee isn’t needed as a training wheel in the future, will consensus be?

Recall that the committee in Aztec is needed as an extra layer of protection, in addition to the proof system, to ensure that only valid blocks can be proposed. If we get to a point in the future where proof-system bugs have a negligible chance of occurring, we can theoretically remove the committee from the process of signing off on block proposals.

Let’s say we remove the need for a committee. If we stick to the goals of “batch-submission optimization” and “strong off-chain confirmations between batches on L1”, then we have two broad-stroke options:

  • Consensus — keeping the committee in the critical block validation path.

  • A high-resource sequencer (or set of high-resource sequencers) answerable to the token holders, but playing a similar role to a centralized sequencer in that they have extended periods of sequencing monopoly, and comparable amounts of time to submit batches to L1 as any consensus protocol being considered. Such sequencers can be delegated tokens, or elected by the token holders/token-holder representatives, e.g. governance. The key requirement here would be that sequencer incentives are directly correlated to the long-term success of Aztec, and that we retain censorship-resistant Stage 2 machinery to keep any sequencers in line.

High-resource sequencers tied to the long-term success of their L2 are basically optimal for both of our goals:

  • batch-submission optimization

  • strong off-chain confirmations between batches on L1

However, such sequencers may be considered sub-optimal in terms of short-term censorship resistance — extended control over L2 transaction inclusion/execution means an extended ability to censor — especially compared to a regularly-rotating set of permissionless proposers. To combat this, we could reintroduce a forced-inclusion committee, but this process could take many years, with the gains hard to quantify given so many unknowns, and training wheels a must. And don’t forget, permissioned proposers would then stand as targets for governmental agencies — a motivation for permissionless proposers from day 1.

What other alternatives are possible?

Can we combine high-resource sequencers with a censorship-resistant committee who also enforce block-validity? What other combinations may exist? We’d love to chat with anyone with any thoughts or insights here.


Thanks for the writeup @The-CTra1n, some great thoughts here. A few comments from me:

On AEPC, what would happen in “force publish mode” if there’s L1 congestion and we cannot checkpoint to L1? Does the chain remain in a sort of “fork”, where each node follows along depending on what they were able to validate? I think I need more clarity on what the “times” are in this protocol, like how much time is allowed for a batch in “force publish” mode.

I agree that defining batch-poster incentives is one of the most challenging parts here. The easiest solution seems to be aligning this with proposers: we keep an “incentive pool” that increases consistently with every unpublished block, and every slot the proposer has the option to post to L1 and get the contents of the pool. If they don’t make it, that just rolls over to the next proposer, and we expect the incentive to grow enough over time (with a deadline, just in case) so that some proposer will be willing to pay enough to get the batch included on L1. But we also need to define how this plays with epoch boundaries (where we rotate committees) and proof-submission windows (can a prover submit a proof that includes blocks not yet checkpointed to L1?). Or maybe we can just couple batch posters with provers?
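The rolling-pool idea sketched above is simple enough to state in a few lines. The increment is an illustrative parameter, not a proposed value:

```python
# Sketch of the rolling "incentive pool": the bounty grows every slot the
# batch stays unpublished; whoever lands the batch on L1 claims the whole
# pool, which then resets. POOL_INCREMENT is an illustrative parameter.
POOL_INCREMENT = 10

class IncentivePool:
    def __init__(self) -> None:
        self.balance = 0

    def on_missed_publication(self) -> None:
        """Each slot with the batch still unpublished sweetens the pot."""
        self.balance += POOL_INCREMENT

    def claim(self) -> int:
        """The proposer who posts the batch to L1 takes the whole pool."""
        reward, self.balance = self.balance, 0
        return reward

pool = IncentivePool()
for _ in range(3):                 # three proposers skip posting...
    pool.on_missed_publication()
assert pool.claim() == 30          # ...the fourth posts and claims the pool
assert pool.balance == 0           # the pool rolls over empty afterwards
```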

As for consensus, I just wanted to clear a misconception. The committee is not just for training wheels for the proving system, by attesting to the resulting state root after execution. It also attests to the availability of the txs involved in each L2 block, which is something we’ll still need even after having full trust on the proving system, unless we can nail real-time proving, which doesn’t seem like it’ll happen very soon.

And on high-resource sequencers, would this be a sort-of proposer-builder separation? I’d expect that increasing resources for validator nodes is off the table. Or is the idea to decouple proposers and validators?

Last but not least, I asked Claude for better backronyms for AEPC:

  • SALSA: Signal-Attested Locking for Sequential Anchoring
  • LATCH: Ledger-Anchored Temporal Chain Handoff
  • CLASP: Commitment-Locking Attester Signal Protocol
  • CLANK: Chain-Locking Attester-Notarized Keeping
  • CLINK: Commitment-Linked Interlocking Notarized Keeping

For the sake of visualization, the timings look like a 1-L2-slot build-ahead design (with 6 L1 slots per L2 slot) – the proposer of slot N must signal to attestors regarding L2 slot N-1 at L1 slot M, propose to the attestors before L1 slot M+6, and then from slot M+7 onwards is allowed to publish their L2 block to L1. Unlike build ahead, the only cut-off regarding the publication deadline would be someone else publishing a conflicting block.

Conflicting blocks can be earlier proposers than proposer N whose fork was put into force publish mode by proposer N’s signal, or later proposers than proposer N, with these later proposers triggering force publish mode on proposer N’s fork. So optimistically with all proposers signalling cooperatively there are no deadlines for L1 publication (except epoch changes). Pessimistically, every 2nd proposer would have 6 slots to publish to L1, and every other proposer (who signalled a fork attempt) would miss their proposal slot. Signalling a fork gives your previous proposer ~6 L1 slots of a headstart to publish to L1 – so your fork attempt is likely to fail.
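The timings described above can be laid out as a small schedule calculation (the slot arithmetic mirrors the 6-L1-slots-per-L2-slot example; the function and field names are illustrative):

```python
# Timeline sketch of the 1-slot-ahead variant: proposer N signals at L1 slot
# M, proposes before M+6, and may publish from M+7 onward with no fixed
# deadline (only a conflicting publication cuts it off). Names illustrative.
L1_SLOTS_PER_L2_SLOT = 6

def schedule(m: int) -> dict:
    return {
        "signal_at": m,                                  # signal for slot N-1
        "propose_before": m + L1_SLOTS_PER_L2_SLOT,      # propose to attestors
        "publish_from": m + L1_SLOTS_PER_L2_SLOT + 1,    # publication opens
        "publish_deadline": None,  # none, unless a conflicting block lands
    }

s = schedule(100)
assert s["propose_before"] == 106 and s["publish_from"] == 107
```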

From the L1 contract perspective, it doesn’t know anything about force publish; it just accepts any blocks building on the most recent block in the L1 contract. In force-publish mode, there would be 2+ valid forks possible, but only one could land on L1, as they would (should) have a conflict at the point when the signal was made – attestors would enforce that the forks are valid, but there would need to be a chain of block headers/hashes that the L1 contract could also check. Nodes can always safely derive the L2 state from blocks that land on L1, as no conflicts are possible, conditional on L1-published blocks being proven.
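That fork-agnostic contract view reduces to a first-lander-wins tip check. A minimal sketch (class and method names are illustrative, not the Aztec rollup contract):

```python
# Sketch of the fork-agnostic L1 contract view: the contract knows nothing
# about force-publish mode and simply accepts any attested block whose
# parent hash matches the current tip. Names are illustrative.
class RollupContract:
    def __init__(self, genesis_hash: str) -> None:
        self.tip = genesis_hash

    def submit(self, parent_hash: str, block_hash: str) -> bool:
        if parent_hash != self.tip:
            return False   # conflicting fork: first lander wins, rest revert
        self.tip = block_hash
        return True

c = RollupContract("0xgenesis")
assert c.submit("0xgenesis", "0xblockA")      # fork A lands first
assert not c.submit("0xgenesis", "0xblockB")  # fork B conflicts, rejected
assert c.tip == "0xblockA"                    # only one fork survives on L1
```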

Yeah, that should work – but it only really becomes necessary with something like AEPC (pending renaming) or consensus, to allow for potentially long periods of non-posting. I also imagine we want a “boost” to this pool, where we start to siphon percentages of proposers’ stake into the reward/bounty pool. Conditional on batch N not getting submitted within 1/2/more L2 slots, the block reward alone might be too small. Both proposer N and subsequent proposers should become progressively more aggressive with their posting strategy, even in the optimistic case where they are not competing against each other / all-or-nothing hard deadlines to publish.

You’re exactly right, I should have expanded a bit more on this line of discussion. A better statement therefore would be “without proof system bugs, we only need attestors for ensuring proofs can always be delivered”. However, heavy-duty sequencers that have a lot of skin-in-the-game can also provide that guarantee. If they ever failed e.g. for an hour, back-up sequencing can kick in.

At the limit, it would probably be more like the relatively low resource entities would form a slow/async governance set, maybe with some anti-censor capabilities, and high resource entities can be delegated stake by validators/elected by governance and held accountable for any maliciousness. Etherlink does something like this, albeit with L2 sequencers elected by L1 validators. But this path is conditional on many things, including proof system strengthening and probably higher throughput capabilities to justify powerful sequencers, so this can be revisited in a while.

Fitting, given I had this at one point in the version history :sweat_smile:


Give me SALSA every day of the week!
