Update: The latest version of this proposal can be found here.
Fair Election Randomized Natively on Ethereum Trustlessly (Fernet) is similar to Whisk-y, Schmequencer, and Irish Coffee in that it relies on randomness for sequencer selection. However, it doesn't select a fixed set of leaders; instead, it assigns a weight to each sequencer to secretly rank them. Whether a sequencer chooses to propose a block or not depends on their expected rank.
This proposal also draws from B52 in that there’s a strong separation between sequencing and proving. Since sequencing is cheap, we can deal with duplicated efforts. And once tx ordering is set, then the proving work is well-defined. This purposefully removes one of the sequencer responsibilities requested in the RFP: Build a rollup proof from completed proofs in the proof pool.
Building blocks
VRF: We first define a VRF over the RANDAO beacon bound to the sequencer keypair, such as a BLS signature over the RANDAO value, or a small SNARK as suggested in Irish Coffee. This gives us a secret, random, verifiable value, bound to each sequencer, that updates with every Ethereum block.
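To make this concrete, here is a minimal sketch of the BLS-flavoured option, assuming py_ecc's IETF-style BLS signature API; the helper names (`vrf_evaluate`, `vrf_verify`) are illustrative, and the real scheme may well be the SNARK-based one instead. Because a BLS signature is unique for a given key and message, hashing it yields a value that is unpredictable to others yet verifiable once revealed.

```python
# Illustrative only: a VRF built from a BLS signature over the RANDAO value.
from hashlib import sha256
from py_ecc.bls import G2ProofOfPossession as bls  # assumed API: Sign(sk, msg), Verify(pk, msg, sig)

def vrf_evaluate(sequencer_sk: int, randao: bytes) -> tuple[bytes, bytes]:
    """Returns (proof, output): secret until revealed, bound to this sequencer,
    and changing with every Ethereum block as RANDAO updates."""
    proof = bls.Sign(sequencer_sk, randao)   # unique signature over the beacon value
    output = sha256(proof).digest()          # the actual random value used for ranking
    return proof, output

def vrf_verify(sequencer_pk: bytes, randao: bytes, proof: bytes, output: bytes) -> bool:
    """Anyone can check the claimed output once the proof is published."""
    return bls.Verify(sequencer_pk, randao, proof) and sha256(proof).digest() == output
```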
Staking contract: We keep a staking contract on L1, where sequencers can register themselves with a public key, and stake a fixed amount of funds to participate. We introduce an unstaking delay to allow for slashing dishonest behaviour.
Rollup contract: A rollup contract on L1 that tracks the sequence of txs and the proofs. Note that these will be submitted separately.
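A rough model of the staking state, written in Python for brevity (the real thing would be a Solidity contract); `STAKE_AMOUNT` and `UNSTAKE_DELAY_BLOCKS` are placeholder values, since the proposal only specifies "a fixed amount" and "an unstaking delay".

```python
# Sketch of the L1 staking contract state; constants and names are placeholders.
from dataclasses import dataclass, field

STAKE_AMOUNT = 32 * 10**18       # placeholder fixed stake (the currency is an open question below)
UNSTAKE_DELAY_BLOCKS = 7200      # placeholder delay so dishonest behaviour can still be slashed

@dataclass
class StakingRegistry:
    stakes: dict[bytes, int] = field(default_factory=dict)  # sequencer pubkey -> staked amount
    exits: dict[bytes, int] = field(default_factory=dict)   # sequencer pubkey -> unstake request block

    def register(self, pubkey: bytes, amount: int) -> None:
        assert amount == STAKE_AMOUNT, "stake must be the fixed amount"
        self.stakes[pubkey] = amount

    def request_unstake(self, pubkey: bytes, current_block: int) -> None:
        self.exits[pubkey] = current_block                   # funds stay locked during the delay

    def can_withdraw(self, pubkey: bytes, current_block: int) -> bool:
        return pubkey in self.exits and current_block >= self.exits[pubkey] + UNSTAKE_DELAY_BLOCKS

    def slash(self, pubkey: bytes) -> None:
        self.stakes.pop(pubkey, None)                        # forfeit the stake
        self.exits.pop(pubkey, None)
```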
Protocol
Sequencer submits order: On each L1 block, any sequencer can submit a tx ordering to the L1 rollup contract. We call this an “unproven block”. The data published is an ordered list of L2 tx hashes, an identifier (the hash of the list), the identifier of the previous block that this one builds upon, and the sequencer VRF output.
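As a sketch, the data published per unproven block could be modelled as follows; field names are illustrative and the exact hashing/encoding is not specified by the proposal.

```python
# Illustrative shape of an unproven block submission; hashing/encoding details are assumptions.
from dataclasses import dataclass
from hashlib import sha256

@dataclass
class UnprovenBlock:
    tx_hashes: list[bytes]   # ordered list of L2 tx hashes (published, e.g. as calldata)
    prev_block_id: bytes     # identifier of the unproven block this one builds upon
    vrf_output: bytes        # the sequencer's VRF output for the current RANDAO
    vrf_proof: bytes         # whatever the chosen VRF needs for verification

    @property
    def block_id(self) -> bytes:
        # the identifier is the hash of the ordered tx hash list
        return sha256(b"".join(self.tx_hashes)).digest()
```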
The unproven block could also include the tx’s payload. See the questions section for tradeoffs on this decision.
The contract needs to store only the identifiers; the raw list of txs can be kept in calldata, but it’s important that it’s made public at this stage.
If there are multiple submissions, the contract keeps track of all of them. However, the protocol considers as canonical the valid submission whose VRF output is closest to the current RANDAO value.
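As a sketch, and assuming VRF outputs and RANDAO are compared as big-endian integers (the proposal doesn't pin down the distance metric), canonical selection among competing submissions could look like this, reusing the `UnprovenBlock` shape from above; `is_valid` is sketched after the validity conditions below.

```python
# Illustrative fork choice among competing unproven blocks for the same L1 block.
def vrf_distance(vrf_output: bytes, randao: bytes) -> int:
    # assumption: "closest" means absolute distance between big-endian integer interpretations
    return abs(int.from_bytes(vrf_output, "big") - int.from_bytes(randao, "big"))

def canonical_unproven_block(submissions: list[UnprovenBlock], randao: bytes) -> UnprovenBlock | None:
    valid = [b for b in submissions if is_valid(b)]  # validity is checked off-chain by L2 nodes
    if not valid:
        return None
    return min(valid, key=lambda b: vrf_distance(b.vrf_output, randao))
```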
An unproven block is considered valid if all its published tx hashes correspond to valid L2 txs, they are not repeated, their ordering is correct, and its previous block is canonical. This notion of validity cannot be enforced by the L1 rollup contract, but can be easily verified by an L2 node.
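A sketch of that check from an L2 node's point of view; `get_l2_tx`, `ordering_is_correct`, and `is_canonical` are hypothetical helpers standing in for the node's local view.

```python
# Illustrative validity check; the helper functions are hypothetical node-side lookups.
def is_valid(block: UnprovenBlock) -> bool:
    txs = [get_l2_tx(h) for h in block.tx_hashes]              # needs the tx data to be available
    return (
        all(tx is not None for tx in txs)                      # every hash maps to a known, valid L2 tx
        and len(set(block.tx_hashes)) == len(block.tx_hashes)  # no repeated txs
        and ordering_is_correct(txs)                           # ordering constraints hold
        and is_canonical(block.prev_block_id)                  # builds on the canonical previous block
    )
```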
Once the L1 block is mined, any L2 full node can pick the canonical unproven block, check whether it is valid, and execute the state transition function. This provides an early optimistic finality for L2 txs.
If no valid unproven blocks are published in block N, then in block N+1 sequencers should reference the canonical unproven block from block N-1.
Proving starts: Every L1_BLOCKS_PER_ROLLUP L1 blocks, provers pick the canonical chain of unproven blocks since the last proof, and start proving.
We can set L1_BLOCKS_PER_ROLLUP to 1 if we want a proof for every block. However, due to the high cost of verifying proofs on-chain, it may make sense to batch a few blocks. For instance, we could set L1_BLOCKS_PER_ROLLUP to align with Ethereum epochs.
How provers are selected is beyond this proposal, but we could favor a subset of provers to reduce duplicate work. For instance, the reward that a prover gets could be proportional to 1) being selected in a committee using a strategy similar to Ethereum’s, 2) the inclusion delay, so other provers can jump in if the selected ones take too long, and 3) the rewards of the proofs being aggregated.
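Purely as an illustration of how those three factors could combine into a single reward weight (none of this is specified by the proposal; the constants and shape are arbitrary assumptions):

```python
# Illustrative prover reward weighting; constants and shape are arbitrary assumptions.
def prover_reward_weight(in_committee: bool, inclusion_delay: int, aggregated_rewards: int,
                         committee_bonus: float = 2.0, delay_decay: float = 0.9) -> float:
    """Committee members get a bonus that decays with inclusion delay, so outside
    provers can profitably jump in if the selected ones are slow; everyone's weight
    scales with the rewards of the proofs being aggregated."""
    bonus = committee_bonus * (delay_decay ** inclusion_delay) if in_committee else 1.0
    return bonus * aggregated_rewards
```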
Proof is submitted: Eventually the provers submit their proof for the canonical block, which becomes a “proven block”. We now know that this block is valid, but it may not be the canonical one, since provers could submit proofs for any of the valid blocks.
Again, anyone may submit a proof for the canonical block, and selecting a winner is out of the scope of this proposal. Still, following the idea from the point above, we could choose the canonical proof as the one with the highest accumulated rewards.
If the rollup contract receives proofs for more than one valid block, then it can decide which one should be the canonical block by checking the stored VRF outputs. We could slash provers for submitting a proof for a non-canonical block.
Proof is final: After N blocks, we assume that honest provers have had the time to submit proofs for the canonical block, so we consider the current proven block as “finalized”.
Only at this point are rewards paid out.
It may be the case that the canonical block didn’t get any proof submitted, but other valid ones did. If this happens, to ensure liveness, the protocol picks the proven block as the canonical one.
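Putting the last two points together, the finalization rule at the end of the window is roughly the following sketch (reusing `vrf_distance` from the earlier sketch):

```python
# Illustrative finalization: among the blocks that actually got proven, prefer the best VRF;
# if the best-ranked unproven block was never proven, liveness is preserved by accepting
# whichever valid block did get a proof.
def finalized_block(proven_blocks: list[UnprovenBlock], randao: bytes) -> UnprovenBlock | None:
    if not proven_blocks:
        return None
    return min(proven_blocks, key=lambda b: vrf_distance(b.vrf_output, randao))
```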
Design decisions
Not enshrining MEV at the protocol layer. Unlike B52, this design leaves the door open for hybrid PBS as it works on Ethereum today: proposers are selected at random, and pick bids from builders from an off-protocol marketplace. While there are currently pushes for moving PBS to the protocol layer (thanks Joe for sharing that article!), I personally think that this is still an immature area of research and subject to frequent changes. I don’t think we can push research forward fast enough to come up with a secure design, much less an implementation, within 6 months. Furthermore, having PBS as a sidecar allows us to iterate rapidly on it, and gives the community the flexibility to outright remove it in the event of a critical bug without requiring a major fork.
Splitting sequencing and proving. While Goblin and Ultrahonk should deliver major improvements in proving times, at the moment we are looking at a proving time of 2-10 minutes per block (and that’s assuming maximum parallelization). Having an early checkpointing mechanism allows us to define the tx ordering quickly, which is the only non-deterministic part of block-building (aside from who reaps the proving rewards), so clients can get quicker finality. It also leads to less wasted effort: submitting a tx ordering that isn’t chosen as canonical is not a big loss, but submitting a proven block that doesn’t get picked up means significant wasted computing power.
Ranking over choosing leaders. Proposals like Schmequencer or Irish Coffee rely on choosing one leader or a specific set of leaders. This requires careful modeling (and potentially tweaking via protocol upgrades) to nail how many leaders would be probabilistically chosen. Here, having a ranking function for anyone who wants to participate allows sequencers to quickly iterate on different strategies. Furthermore, by splitting sequencing and proving, attempting to participate in a block is relatively cheap.
Single-block leaders. Unlike Irish Coffee, this design gives control to a sequencer for a single block. This prevents monopolization of the chain for long (as in 10 minutes) periods of time, which could lead to oracle manipulation attacks for example.
Requiring staking for participation. Unlike Schmequencer, we require sequencers to explicitly stake on L1 to be eligible for selection. This keeps the entire selection algorithm on L1, so early issues on L2 would not impact it. It also prevents accidental delegation of selection power: eg if I put my native token in a 4626-like vault, I’m unwittingly increasing that vault’s chances of being selected as the next sequencer. Last, staking leaves the door open to implementing slashing (even social slashing) in the event of detecting malicious sequencer behaviours.
Open questions
Can we get a better backronym? I had to ask ChatGPT for help on the T. I originally had used “ta-da” which was fun but a bit too much. Still, I feel there’s room for improvement there.
What to stake? While we probably want to require a fixed amount (like Ethereum does with its 32ETH), the question remains of which currency to use. Options seem to be ETH, rETH, or a network native token. I understand that using a native token provides some additional defense against takeovers: if an entity purchases enough native token to take over the protocol, then the token value would drop to zero as the protocol utility disappears due to the takeover, and the unstaking delay means the entity cannot dump their tokens before they become worthless. But economic incentives are not my strong point.
How to slash? Related to the above, there are no clearly dishonest behaviours from a sequencer that could be automated for slashing. We could have governance-powered slashing though, so we can have the community observe and decide to punish behaviours.
What VRF to use? This seems to be a recurring question across proposals, so I’ll leave it to the experts. Goal is to get something simple and cheap to verify in the EVM.
What data should be included in an unproven block? If the unproven block only includes the tx hashes, there’s an attack where a sequencer could withhold one of the txs, preventing the prover network from being able to generate a proof for it, and reselling block-proving rights (or griefing provers by waiting for them to submit a proof for a non-canonical block, and then slashing). This can be mitigated by posting all tx data onchain, but this can quickly get prohibitively expensive, since each tx includes a ZKP which is not particularly small. We could look into alternative DA solutions, 4844 included, to solve this.
I’m generally a fan of keeping as few denominations as possible in the greater ethereum ecosystem. But indeed, the example of a bad actor keeping X ether, watching for any new protocols where the total participation is likely < X ether and doing ‘bad stuff’ without any real stake is discouraging.
There are some combinations of native token and ether staking, the simplest being to allow some quota of ether staking. Implementation is easy if you do it by allowing some guaranteed but limited ether <=> native swap line, but you’d need to make it sybil resistant to avoid just having a whale take the whole quota, and that I think is not so easy.
This seems to be the weakest link of the proposal. @zac-williamson correctly pointed out that posting all individual tx proofs, even via 4844, will be extremely costly. Each proof can grow to up to 32kb, so posting to a data blob would cost about 32k gas per tx (which is more than what an individual ETH transfer on L1 costs, assuming equal pricing for data blobs and the execution layer). There’s still the option of exploring other DA solutions like Celestia, but building consensus on top of them feels scary.
Note that the scenario of the sequencer purposefully withholding data is rare, since they would forego their associated fees. Even if they do so to resell proving rights, bypassing the public prover network, the overall network still gets a new proven block, which is the ultimate goal here. However, we still need to address the scenario where a prover publishes a tx hash for which other parties in the chain don’t have the preimage, and then implodes.
One option, suggested by @zac-williamson, is to compress tx proofs. Instead of submitting the user-generated proofs to a data blob, the sequencer would submit a groth16 compressed proof. However, producing this proof is computationally expensive, which goes against the sequencer hardware requirements, and adds an annoying 10-30s of latency to getting optimistic finality for a submitted tx.
We should still explore other solutions that do not require posting all proof data on chain. For instance, B52 relies on provers agreeing to prove a block, which they wouldn’t do if they didn’t have the tx data available.
We reflect this principle in our approach to data availability (DA) as well. The standard requirement to disseminate all transaction data among every node participating in consensus before a decision can be reached presents a fundamental bottleneck in the throughput of sequencing (i.e., consensus) protocols. Some designs therefore make a performance/security tradeoff by outsourcing data availability to a smaller set of trusted nodes. This set of nodes, called a DA committee, may also be randomly sampled from the larger staking set and rotated every epoch or even every block. The committees can continue to propagate data so that all nodes eventually catch up, but off the critical path of consensus progress. This increases consensus throughput by leveraging the fact that data propagates at higher throughput over a longer latency window, but can quickly become the consensus bottleneck if required to reach every node inside the short latency window in which a decision must be reached. An alternative solution is VID, which verifiably distributes erasure code shares of the data among all nodes participating in consensus such that any subset of nodes owning a sufficiently high fraction of the stake would be able to cooperatively recover the data. VID drastically reduces the broadcast communication complexity required to make a block of data available (asymptotically removing dependence on the number of nodes), but greatly increases the complexity of data recovery.
We do not consider committee-based DA solutions an acceptable performance/security tradeoff. The small committee initially responsible for availability of data at the moment a block is finalized can be bribed at relatively low cost. On the other hand, relying on VID solely for DA is far from pragmatic. Our approach thus composes these two methods to achieve security and pragmatism. Optimistically, high bandwidth nodes will make data available to all other nodes, and the availability of each individual block is also backed up by small randomly (unpredictably) elected committees. (By using a careful sampling method with flexible threshold parameters, the committees can optimistically be very small.) But given the risk of DDoS and bribery attacks on small committees, we use VID to provide a reliable (yet slower) backup path that guarantees availability as long as a sufficiently high fraction of all nodes (weighted by stake) is not compromised.
To put things into perspective, today we’re using 660 bytes per tx when we post to L1 (this is assuming 4 nullifiers, commitments, and public data writes per tx, all of which will probably go up). However, this can all be moved to 4844 data blobs. The constant data that needs to be published to L1 per block is about 596 bytes.
In Fernet, each sequencer commitment (unproven block) is at least a constant 96 bytes which needs to go onto L1 (tx list hash, previous block pointer, and VRF output), plus 256 bytes if we need to post a Groth16 proof of the VRF (this will depend on the VRF being used, and affects Fernet, Irish Coffee, and Schmequencer equally). We then have the actual tx hash list, which costs 32 bytes per tx and should be publishable to a 4844 data blob. So far, so good.
The problem, as mentioned before, comes when having to share the full payload of each tx. Here we’re talking about 32kb per proof (which could go down to 256b switching to Groth16), plus 3kb per kernel circuit public inputs, plus public function call requests (assuming 8 args per call and a single public function call, about 360 bytes). So we’re at about 36kb per tx (0.1x switching to Groth16), which is a lot even for 4844 blobs. This may even be a lot for the p2p network itself, depending on the TPS we’re targeting.
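For reference, the back-of-the-envelope arithmetic behind those figures (the ~125 kB blob size and all per-tx numbers are the estimates quoted above, so everything here is approximate):

```python
# Back-of-the-envelope sizes from the estimates above (all figures approximate).
PROOF_BYTES = 32 * 1024                 # per-tx ZKP as currently estimated
KERNEL_PUBLIC_INPUTS_BYTES = 3 * 1024   # kernel circuit public inputs
PUBLIC_CALLS_BYTES = 360                # 8 args, one public function call

BYTES_PER_TX = PROOF_BYTES + KERNEL_PUBLIC_INPUTS_BYTES + PUBLIC_CALLS_BYTES    # ~36 kB
BYTES_PER_TX_GROTH16 = 256 + KERNEL_PUBLIC_INPUTS_BYTES + PUBLIC_CALLS_BYTES    # ~3.6 kB (~0.1x)

BLOB_BYTES = 125 * 1024                 # approximate 4844 blob capacity
print(BYTES_PER_TX, BLOB_BYTES // BYTES_PER_TX)                  # 36200 bytes -> ~3 txs per blob
print(BYTES_PER_TX_GROTH16, BLOB_BYTES // BYTES_PER_TX_GROTH16)  # 3688 bytes -> ~34 txs per blob
```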
After a long discussion with @Mike, we believe that data withholding issues (whether unintentional or malicious) are not a major problem for the proposal, since we have multiple candidate sequencers per slot. Multiple leaders per slot means multiple commitments to block orderings, so we can expect provers (and nodes) to follow the one with the leading VRF for which they have all the data available.
If this is the one with the leading VRF, then clients have a strong guarantee that this will become the canonical chain (only exception would be if the prover network times out).
If it is not, either a) the next VRF gets proven, which is the one that the network was following (assuming no splits), so we’re good, or b) the leading VRF gets proven somehow (let’s say the sequencer went on to prove on their own and withheld data from the proving network). If (b) happens, then we experience a reorg, but it shouldn’t affect liveness.
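A sketch of that node-side rule, reusing the illustrative helpers from the proposal sketches above (`is_valid`, `get_l2_tx`, `vrf_distance`):

```python
# Illustrative fork choice under data withholding: follow the best-ranked commitment
# for which all tx data is actually available. If a better-ranked block with withheld
# data later gets proven anyway, nodes reorg onto it, but liveness is unaffected.
def block_to_follow(submissions: list[UnprovenBlock], randao: bytes) -> UnprovenBlock | None:
    available = [b for b in submissions
                 if is_valid(b) and all(get_l2_tx(h) is not None for h in b.tx_hashes)]
    if not available:
        return None
    return min(available, key=lambda b: vrf_distance(b.vrf_output, randao))
```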
Comparison with Schmequencer
There are two major differences between Schmequencer and Fernet, each with their pros and cons:
Preconfirmations vs latency: Fernet pushes the block ordering commitment to L1, in the form of the “unproven block”, which works as a preconfirmation for L2 nodes (assuming no data availability issues as mentioned above). On the other hand, Schmequencer only pushes to L1 once the proof has been completed. This allows Schmequencer to keep adding txs to the block being built/proven, which can reduce time to finality of submitted txs. Fernet can mitigate this by having multiple unproven blocks per rolled-up block (ie committing message orderings more frequently than it assembles proofs), but this increases L1 costs and could make reorgs deeper.
Staking vs balances: While Fernet requires staking for sybil resistance, Schmequencer keeps things simpler by snapshotting balances. This is explored in “Requiring staking for participation” in the proposal, but the simplicity of Schmequencer’s approach is still appealing.
However, more research is needed to determine if there is a valid DA layer that fits the requirements. We estimate that the proof size for each tx could be up to 32kb, so even 4844 blobs are impractical.
Why’s that? Do EIP4844 blobs not have ~125kb of space? Also there can be several of them per block.
The main feature introduced by proto-danksharding is a new transaction type, which we call a blob-carrying transaction. A blob-carrying transaction is like a regular transaction, except it also carries an extra piece of data called a blob. Blobs are extremely large (~125 kB), and can be much cheaper than similar amounts of calldata.
That’d mean we can squeeze at most 4 txs per blob, which is way below the TPS we’re looking at. And that’s also assuming that no one else in the entire ethereum ecosystem is using the blobs!
Howdy folks! Following up here to share that this week we announced the decision to continue iterating and improving on Fernet, versus B52, the other finalist candidate.
The current design is vulnerable to L1 proposer collusion. L1 proposers can buy L2 blocks with a single low VRF score, by censoring L2 proposers during the proposal phase.
If I understand correctly, you mean that the L1 proposer can simply choose which L2 proposals to include, right? If that happens, then the VRF scoring is rendered useless, but I’m not sure it affects liveness: it just changes the selection protocol to that of a based rollup, which is not necessarily bad I believe, since the barrier to entry for being an L1 proposer is also designed to be low. What do you think?
I agree. How about a based version of Fernet? Merge proposal/reveal phases into a single L1 block, and let anyone post sufficient collateral alongside block data.
IIRC, we had ruled out based sequencing since it doesn’t promote much diversity for L2 sequencers. Seems like incentives would lead to just a handful of builder-proposers pushing the blocks to L1 via a MEV sidecar. And if these go down it could affect L2 liveness.
@cooper-aztecLabs @jaosef do you remember if we had other reasons to not go with pure L1 based sequencing?
I agree (I see censorship more than liveness). I like the idea that L1 validators can choose to build their own L2 blocks if desired, more so given PBS has not hit L1 yet. In-protocol provers would address this problem directly. However, I think the solution looks less like B52 and more like PBPS (a single bonded first-price auction). What do you think?
Each L1 block, the contract accepts from anyone a deposit (stake) and block data (unproven). If a proof is not posted within the proving phase, the deposit is burnt. (This is close to what Taiko is now trying)