Aztec's New Slashing Design

Aztec deployed its original slashing system to Adversarial Testnet in late July for testing. The system did not meet product requirements with regard to quickly ejecting inactive validators from the validator set.

In response, the team has re-designed the slashing mechanism. This post introduces what we call the Tally Model of slashing.

This document may use proposer and validator interchangeably.

A validator is an entity that posted stake and has been added to the validator set. A proposer refers to a validator who has been picked as part of the committee to propose a block for a given L2 slot.

Introduction

The Tally Model slashing mechanism uses consensus-based voting where proposers vote on individual validator offences. Time is divided into rounds, and during each round, proposers submit votes indicating which validators from a given past round should be slashed.

// Relevant Rollup Config (L1 immutable parameters)
uint256 slashingRoundSize: 128; // Size of a voting round: 128 L2 slots, which is 4 epochs.
uint256 slashingOffsetInRounds: 2; // Validators in Round N vote to slash validators from Round N-2.

Notice the first round is a “Grace Period”. This is a configurable parameter on the node (and not on the contract).

slashGracePeriodL2Slots = 32 * 4

Thus, given the above slashingOffsetInRounds=2 parameter, no voting happens in Round 1 or Round 2. In Round 3, nodes could vote to slash offences committed 2 rounds earlier, that is, in Round 1. But since most nodes by default treat the first round as a grace period, voting will likely start in Round 4, where proposers vote on offences committed in Round 2.
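The round arithmetic above can be sketched in a few lines of Python. This is only an illustration: the function names are hypothetical, slots are assumed to be numbered from 0 and rounds from 1, and the grace period is assumed to cover the first SLASH_GRACE_PERIOD_L2_SLOTS slots of the chain.

```python
SLASHING_ROUND_SIZE = 128             # L2 slots per round (from the config above)
SLASHING_OFFSET_IN_ROUNDS = 2         # vote on offences from this many rounds back
SLASH_GRACE_PERIOD_L2_SLOTS = 32 * 4  # node-side grace period (one round)


def round_of_slot(slot: int) -> int:
    """Return the 1-based round containing a 0-based L2 slot (numbering is an assumption)."""
    return slot // SLASHING_ROUND_SIZE + 1


def target_round(voting_round: int):
    """Return the round whose offences are voted on in `voting_round`, or None."""
    target = voting_round - SLASHING_OFFSET_IN_ROUNDS
    if target < 1:
        return None  # Rounds 1 and 2: no round exists that far back
    last_slot_of_target = target * SLASHING_ROUND_SIZE - 1
    if last_slot_of_target < SLASH_GRACE_PERIOD_L2_SLOTS:
        return None  # the whole target round falls inside the grace period
    return target


for r in range(1, 6):
    print(r, target_round(r))
```

Under these assumptions, Rounds 1 to 3 have nothing to vote on and Round 4 is the first round with a target (Round 2).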

What is a vote?

We’ve now established the basics. Proposers in a Round N will only vote on offences committed by the validators picked for committee duty during Round N-slashingOffsetInRounds.

Votes are encoded as a byte array, with each validator’s vote represented by two bits specifying the proposed slash amount (0–3 units). The L1 contract aggregates these votes and applies slashes to validators that reach quorum.
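As a rough illustration of the two-bits-per-validator encoding, the sketch below packs four validators into each byte. The bit order (first validator in the least significant bits) and the function names are assumptions made for this example; the real contract may lay the bits out differently.

```python
def encode_votes(units):
    """Pack one 2-bit slash amount (0-3) per validator into bytes, four per byte."""
    out = bytearray((len(units) + 3) // 4)
    for i, u in enumerate(units):
        if not 0 <= u <= 3:
            raise ValueError(f"slash units must be 0-3, got {u}")
        # Assumed layout: validator i uses bits 2*(i % 4) and 2*(i % 4) + 1 of byte i // 4.
        out[i // 4] |= u << (2 * (i % 4))
    return bytes(out)


def decode_votes(data, count):
    """Recover the per-validator slash units from an encoded vote."""
    return [(data[i // 4] >> (2 * (i % 4))) & 0b11 for i in range(count)]


vote = encode_votes([0, 1, 2, 3, 1])
print(vote.hex(), decode_votes(vote, 5))
```

A 48-member committee would need 12 bytes per vote with this packing.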

Proposers of Round N choose 1 of 4 values for each validator in Round N-slashingOffsetInRounds:

  • NO_SLASH

  • LOW_SLASH

  • MEDIUM_SLASH

  • HIGH_SLASH

These are represented on L1 by the immutable variable

  uint256[3] slashAmounts: [2_000e18, 10_000e18, 50_000e18];

No other slashing amounts may be used. The amounts above correspond to 1%, 5% and 25% of stake. This means that any validator may be slashed up to a maximum of 25% of their stake.
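The percentages can be checked against the proposed activation threshold of 200,000e18 that is quoted further down in this thread. The snippet below is just that arithmetic; the constant and function names are made up for the example.

```python
ACTIVATION_THRESHOLD = 200_000 * 10**18  # proposed activation threshold quoted later in the thread
SLASH_AMOUNTS = [2_000 * 10**18, 10_000 * 10**18, 50_000 * 10**18]


def percent_of_stake(amount, stake=ACTIVATION_THRESHOLD):
    """Express a slash amount as a percentage of the given stake."""
    return amount * 100 / stake


print([percent_of_stake(a) for a in SLASH_AMOUNTS])
```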

The slashAmounts parameters are set on deployment and can only be changed via a governance vote.

Slashable Offences

There are 6 slashable offences that the node will automatically detect and vote to slash validators accordingly.

1. Inactivity Offence

The biggest risk to the Aztec Network is the loss of liveness due to the validator set going offline for any number of reasons. This is why validators who go offline are slashable. Being offline is defined as failing to attest to SLASH_INACTIVITY_TARGET_PERCENTAGE % of block proposals in SLASH_INACTIVITY_CONSECUTIVE_EPOCH_THRESHOLD consecutive epochs where the validator was picked for attester duty.

## Relevant env vars on the Node.
`SLASH_INACTIVITY_PENALTY` = 1%
`SLASH_INACTIVITY_TARGET_PERCENTAGE` = 90%
`SLASH_INACTIVITY_CONSECUTIVE_EPOCH_THRESHOLD` = 2
`SENTINEL_HISTORY_LENGTH_IN_EPOCHS` = 100
`SENTINEL_ENABLED` = TRUE
## Sentinel must be enabled in order to detect inactivity

Given the above parameters, a validator will keep track of the attestations it has seen, via both the p2p network and L1, from all other validators who were selected as part of the committee in the previous SENTINEL_HISTORY_LENGTH_IN_EPOCHS epochs.

Once it detects that a particular validator has failed to attest to at least SLASH_INACTIVITY_TARGET_PERCENTAGE % of block proposals in at least SLASH_INACTIVITY_CONSECUTIVE_EPOCH_THRESHOLD consecutive epochs, it will consider that validator to be inactive.

A missed block proposal contributes to the inactivity score count in the same way as a missed attestation.

Suppose a particular validator 0xdead has become “inactive” in Round N+3, then proposers in Round N+5 will vote to slash that particular validator.

As can be seen in the diagram above, as of the end of Round N+3, the validator in question is now guilty of being offline according to the above parameters. Since they become guilty in Round N+3, only the proposers of Round N+5 can vote to slash them.

If the validator meets attestation requirements during any epoch, the count of consecutive inactive epochs is reset to 0. In the 2nd diagram above, the validator is not considered offline by the end of Round N+3.
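The detection rule can be sketched as a streak counter. This is a simplified model rather than the node's actual Sentinel code: the function names and the `(missed, total)` history format are hypothetical, and only epochs in which the validator was on the committee are assumed to appear in the history.

```python
SLASH_INACTIVITY_TARGET_PERCENTAGE = 90
SLASH_INACTIVITY_CONSECUTIVE_EPOCH_THRESHOLD = 2


def is_inactive_epoch(missed: int, total: int) -> bool:
    """True if the validator missed at least the target percentage of its duties."""
    return total > 0 and missed * 100 >= SLASH_INACTIVITY_TARGET_PERCENTAGE * total


def is_slashable_for_inactivity(history) -> bool:
    """history: (missed, total) per epoch with committee duty, oldest first."""
    streak = 0
    for missed, total in history:
        if is_inactive_epoch(missed, total):
            streak += 1
            if streak >= SLASH_INACTIVITY_CONSECUTIVE_EPOCH_THRESHOLD:
                return True
        else:
            streak = 0  # meeting the requirement in any epoch resets the count
    return False


print(is_slashable_for_inactivity([(32, 32), (32, 32)]))           # two inactive epochs in a row
print(is_slashable_for_inactivity([(32, 32), (0, 32), (32, 32)]))  # streak broken in the middle
```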

A proposer who detects an inactive validator will vote to slash them for SLASH_INACTIVITY_PENALTY % of their stake. Note this can only be one of the pre-defined slashing amounts on the rollup. We expect the inactivity penalty to be 1% of stake.

Nodes must keep large amounts of activity data as tracked by the Sentinel (the node module which aggregates attestation/proposal data), since the larger the validator set, the less frequently validators will be elected to the committee. Fortunately, the Sentinel data is much more compact than the full p2p data.

2. Epoch Prune Offence

At the end of epoch N, provers have a window of proofSubmissionWindow epochs to submit proofs. By default, this submission window is set to 1 epoch, which means the proof for epoch N must land before epoch N+1 finishes. If no such proof lands, then epoch N and anything built on top of it is pruned (erased), and the chain restarts from the tip of the proven chain as of epoch N-1.

Upon detecting a chain prune, the node will try to verify if the epoch could have been proven. It does so by re-executing the txs from the blocks that were pruned. If all txs are available, and the state roots from L1 match those on the node after re-execution, the node will then try to slash the entire committee on the pruned epoch.

If a valid epoch is not proven, the committee is on the hook. This means committee members rely on a single honest prover assumption.

3. Data Withholding

Similar to the above Epoch Prune Offence, upon detecting a chain prune, the node will try to verify if the epoch could have been proven. It does so by re-executing all txs from the blocks that were pruned. In the case where all or some of the txs are not available, the node will then try to slash the entire committee on the pruned epoch.

The committee is responsible for propagating the tx data to the sequencer set before the end of proofSubmissionWindow. Otherwise, they’re on the hook if no tx data is found when the epoch is pruned.

Both offences 2 and 3 suggest that the committee is responsible for proving every valid epoch. Severe p2p partitions could lead to situations where provers are unable to obtain data, and a majority of sequencers are led to believe the tx data is not available which would lead them to slash the committee members.
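The decision a node makes after a prune, covering offences 2 and 3, can be summarized as below. This is a sketch under assumptions: the enum and function are hypothetical, and the NO_OFFENCE branch (txs available but state roots differ) is a guess at the remaining case, which the text above does not spell out.

```python
from enum import Enum


class PruneVerdict(Enum):
    VALID_EPOCH_PRUNED = "valid_epoch_pruned"  # epoch could have been proven
    DATA_WITHHOLDING = "data_withholding"      # some or all txs were unavailable
    NO_OFFENCE = "no_offence"                  # assumed outcome for an unprovable epoch


def classify_prune(txs_available: bool, reexecuted_state_root: str, l1_state_root: str) -> PruneVerdict:
    """Classify a pruned epoch from the point of view of one node."""
    if not txs_available:
        return PruneVerdict.DATA_WITHHOLDING
    if reexecuted_state_root == l1_state_root:
        return PruneVerdict.VALID_EPOCH_PRUNED
    return PruneVerdict.NO_OFFENCE


print(classify_prune(True, "0xabc", "0xabc"))
print(classify_prune(False, "", "0xabc"))
```

In both slashable cases the vote targets the entire committee of the pruned epoch.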

4. Invalid Signatures

Currently, a proposer needs to collect ECDSA signatures from 2/3 + 1 of the committee and post these signatures to the L1 rollup contract for signature verification. In our benchmarks, a median propose call with a committee size of 48 validators costs 448,378 gas, and removing the ValidatorSelectionLib.verify function brings this cost down to 286,616 gas, saving 161,762 gas (36% cheaper).
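The numbers in this paragraph can be reproduced with a few lines of arithmetic. The signature count assumes that "2/3 + 1" means the floor of two thirds of the committee plus one; that rounding choice is an assumption of this example.

```python
def required_signatures(committee_size: int) -> int:
    """Signatures needed, assuming floor(2/3 * committee) + 1."""
    return committee_size * 2 // 3 + 1


GAS_WITH_VERIFY = 448_378     # median propose call, 48-member committee
GAS_WITHOUT_VERIFY = 286_616  # same call without ValidatorSelectionLib.verify

saving = GAS_WITH_VERIFY - GAS_WITHOUT_VERIFY
saving_percent = saving * 100 / GAS_WITH_VERIFY

print(required_signatures(48), saving, round(saving_percent, 1))
```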

This has led the team towards considering delayed signature verification where signatures are collected and posted to L1, but are not verified. L2 nodes would take on the responsibility of verifying that these signatures are valid before recognizing an L2 block as valid.

The exact mechanism of delayed signature verification is beyond the scope of this document, but when nodes detect that a proposer has posted invalid or insufficient signatures, they will attempt to slash the proposer in question.

5. Attesting on Invalid Blocks

Before attesting to a proposal, validators check that the parent block hash is valid. Assuming validators only accept an L2 block if it has its attestations, as per the section above, then they should not sign a block proposal unless it builds off a block that has all attestations.

This means that attesting to a block is equivalent to attesting to it and to all its previous blocks in the epoch, and that these previous blocks have their attestations posted to L1.

Any sequencers who attest to blocks with invalid parent blocks will be subject to slashing by honest sequencers.

The Slashing Vetoer

The slashing vetoer is an independent group of security researchers who assess upgrades and can pause slashing proposals when necessary to protect network interests.

All Aztec slashing proposals require votes from Sequencers and ratification by the L1 contract as described above. Once ratified, these proposals enter a slashing execution delay period of slashingExecutionDelayInRounds expected to be approximately 3 days.

During this delay, the vetoer can block a slashing proposal from execution. They can also temporarily disable all slashing for up to 5 days, a period that can be extended if needed.
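The "approximately 3 days" figure follows from the proposed defaults in this post (28 rounds of 128 slots) combined with the expected 72-second slot time mentioned in the clarification further down. The calculation is sketched here with hypothetical constant names.

```python
SLOT_SECONDS = 72                        # expected Ignition slot time
SLASHING_ROUND_SIZE = 128                # L2 slots per round
SLASHING_EXECUTION_DELAY_IN_ROUNDS = 28  # proposed default

round_hours = SLASHING_ROUND_SIZE * SLOT_SECONDS / 3600
delay_days = SLASHING_EXECUTION_DELAY_IN_ROUNDS * SLASHING_ROUND_SIZE * SLOT_SECONDS / 86_400

print(f"one round = {round_hours:.2f} h, execution delay = {delay_days:.2f} days")
```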

This system serves as a failsafe to protect Sequencers from unfair slashing that might result from client software bugs.

Relevant Node Settings

The node software contains other slashing-related config that is only relevant to the previous Empire Model of slashing which is not discussed in this document.

`SLASH_GRACE_PERIOD_L2_SLOTS:` Number of initial L2 slots where slashing is disabled
`SLASH_OFFENSE_EXPIRATION_ROUNDS:` Number of rounds after which pending offenses expire
`SLASH_VALIDATORS_ALWAYS:` Array of validator addresses that should always be slashed
`SLASH_VALIDATORS_NEVER:` Array of validator addresses that should never be slashed (own validator addresses running on the same node are automatically added to this list by default)
## Validators in slashValidatorsNever take priority over those in slashValidatorsAlways: if a validator is on both lists, they won't be slashed
`slashSelfAllowed?:` boolean; // Whether to allow slashes to own validators (optional - used to not add your own validators to the `slashValidatorsNever` array)
## There is no env for slashSelfAllowed and it can only be set by calling the `node_setConfig` API method.
`SLASH_INACTIVITY_TARGET_PERCENTAGE:` Percentage of misses during an epoch to be slashed for INACTIVITY
`SLASH_INACTIVITY_CONSECUTIVE_EPOCH_THRESHOLD:` How many consecutive inactive epochs are needed to trigger an INACTIVITY slash on a validator
`SLASH_PRUNE_PENALTY:` Penalty for VALID_EPOCH_PRUNED
`SLASH_DATA_WITHHOLDING_PENALTY:` Penalty for DATA_WITHHOLDING
`SLASH_INACTIVITY_PENALTY:` Penalty for INACTIVITY offenses
`SLASH_INVALID_BLOCK_PENALTY:` Penalty for BROADCASTED_INVALID_BLOCK_PROPOSAL
`SLASH_PROPOSE_INVALID_ATTESTATIONS_PENALTY:` Penalty for PROPOSED_INSUFFICIENT_ATTESTATIONS and PROPOSED_INCORRECT_ATTESTATIONS
`SLASH_ATTEST_DESCENDANT_OF_INVALID_PENALTY:` Penalty for ATTESTED_DESCENDANT_OF_INVALID_BLOCK

All penalties should map to one of the slashingAmounts. A penalty lower than the smallest slashing amount will not be executable, and a penalty greater than the maximum will be capped at the maximum value.
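One way to read this rule is sketched below. The function is hypothetical, and rounding an in-between penalty down to the nearest configured amount is an assumption of the example; the text only specifies what happens below the smallest and above the largest amount.

```python
def executable_amount(penalty, amounts):
    """Return the configured slash amount used for `penalty`, or None if not executable."""
    eligible = [a for a in sorted(amounts) if a <= penalty]
    # Below the smallest amount nothing qualifies; above the largest, the largest is used.
    return eligible[-1] if eligible else None


AMOUNTS = [2_000, 10_000, 50_000]  # example amounts in whole tokens

print(executable_amount(1_000, AMOUNTS))   # below the smallest amount
print(executable_amount(10_000, AMOUNTS))  # exact match
print(executable_amount(60_000, AMOUNTS))  # capped at the maximum
```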

Proposed Defaults for Ignition Network

The following L1 slashing variables are only configurable via a governance vote. It is expected that these defaults will be used in the first rollup deployment on Mainnet Ethereum.

`slashingRoundSize:` 128 ## Number of votes in a round
`slashingQuorumSize:` 65 ## Votes required to slash 
`slashingRoundSizeInEpochs:` 4 ## Number of epochs per slashing round
`slashingOffsetInRounds:` 2 Rounds ## How many slashing rounds back we slash
`slashingExecutionDelayInRounds:` 28 Rounds (~3 days) ## Rounds to wait before execution
`slashingLifetimeInRounds:` 34 ## Maximum age of executable rounds
`slashingAmounts:` 1% 1% 1% ## Valid values for each individual slash
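To make the quorum concrete, here is a toy tally for one round. It is not the contract's algorithm: the function is hypothetical, and it assumes that a vote for a larger amount also counts as support for any smaller amount, so a validator receives the largest amount whose support reaches slashingQuorumSize.

```python
SLASHING_QUORUM_SIZE = 65  # out of 128 possible votes per round


def tally(ballots, quorum=SLASHING_QUORUM_SIZE):
    """ballots: one list per proposer with a 0-3 slash unit for each validator."""
    validator_count = len(ballots[0]) if ballots else 0
    result = []
    for v in range(validator_count):
        chosen = 0
        for units in (3, 2, 1):
            # Assumed rule: votes for `units` or more all support slashing by `units`.
            support = sum(1 for ballot in ballots if ballot[v] >= units)
            if support >= quorum:
                chosen = units
                break
        result.append(chosen)
    return result


ballots = [[1, 0]] * 65 + [[0, 0]] * 63
print(tally(ballots))
```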

The following Node slashing variables are configurable on the node and will require node operators to agree on common defaults. Aztec Labs expects to ship the node software with these default vars.

`SLASH_GRACE_PERIOD_L2_SLOTS:` 128 (1 Round)
`SLASH_OFFENSE_EXPIRATION_ROUNDS:` 8
`SLASH_VALIDATORS_ALWAYS:` Empty
`SLASH_VALIDATORS_NEVER:` 
## Own validator addresses running on the same node are automatically added to this list by default
## Validators in slashValidatorsNever take priority over those in slashValidatorsAlways: if a validator is on both lists, they won't be slashed.

`slashSelfAllowed?:` false;
## (optional - used to not add your own validators to the `slashValidatorsNever` array) 
## There is no env for slashSelfAllowed and it can only be set by calling the `node_setConfig` API method. 
`SLASH_INACTIVITY_TARGET_PERCENTAGE:` 90%
`SLASH_INACTIVITY_CONSECUTIVE_EPOCH_THRESHOLD:` 2
`SLASH_PRUNE_PENALTY:` 0%
`SLASH_DATA_WITHHOLDING_PENALTY:` 0%
`SLASH_INACTIVITY_PENALTY:` 1%
`SLASH_INVALID_BLOCK_PENALTY:` 1%
`SLASH_PROPOSE_INVALID_ATTESTATIONS_PENALTY:` 1%
`SLASH_ATTEST_DESCENDANT_OF_INVALID_PENALTY:` 1%
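The interaction between the always list, the never list and slashSelfAllowed can be sketched as follows. The helper names are hypothetical and the logic is a simplified reading of the comments above, not the node's actual code.

```python
def effective_never_list(never, own_validators, slash_self_allowed=False):
    """Own validators join the never list unless slashSelfAllowed is set."""
    never = set(never)
    return never if slash_self_allowed else never | set(own_validators)


def should_vote_to_slash(validator, detected_offenders, always, never):
    """The never list wins over both the always list and detected offences."""
    if validator in never:
        return False
    return validator in always or validator in detected_offenders


never = effective_never_list(["0xbbb"], own_validators=["0xaaa"])
print(should_vote_to_slash("0xaaa", {"0xaaa"}, always=set(), never=never))     # own validator
print(should_vote_to_slash("0xbbb", set(), always={"0xbbb"}, never=never))     # on both lists
print(should_vote_to_slash("0xccc", {"0xccc"}, always=set(), never=never))     # ordinary offender
```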

These proposed defaults are not yet final and are subject to change. They are provided to help readers understand how the slashing system is expected to work in production.

Ejection Threshold

To join the sequencer set, sequencers must put up a minimum amount of stake, the activationThreshold, which is fixed per GSE contract. The Aztec rollup contracts, however, also define an ejectionThreshold, which is specific to every rollup. If a slashed sequencer’s stake falls below the ejectionThreshold, they are automatically exited from the set and their unslashed balance is sent to the withdrawer address that they registered when they first entered the set.

The proposed ejectionThreshold for Ignition Network is 98% which means that a sequencer could get slashed up to 3 times before they are exited from the set. Coupled with the fact that the largest slash amount is 1%, operators stand to lose no more than 3% of their stake for inactivity during Ignition Network.
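The "up to 3 times" claim can be checked with a small simulation. It assumes the 98% is measured against the proposed activation threshold of 200,000 tokens and that ejection happens only once stake is strictly below the threshold; the function name is hypothetical.

```python
def slashes_until_ejection(stake: int, slash: int, ejection_threshold: int):
    """Apply equal slashes until stake falls below the threshold; return (count, remaining)."""
    if slash <= 0:
        raise ValueError("slash must be positive")
    count = 0
    while stake >= ejection_threshold:
        stake -= slash
        count += 1
    return count, stake


ACTIVATION = 200_000               # proposed activation threshold, whole tokens
EJECTION = ACTIVATION * 98 // 100  # 98% ejection threshold
SLASH = ACTIVATION * 1 // 100      # 1% slash amount

count, remaining = slashes_until_ejection(ACTIVATION, SLASH, EJECTION)
print(count, remaining, 100 * (ACTIVATION - remaining) / ACTIVATION)
```

After two slashes the stake sits exactly at the threshold, so it is the third slash that triggers the exit.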


Clarification 1

Once it detects that a particular validator has failed to attest to at least SLASH_INACTIVITY_TARGET_PERCENTAGE % of block proposals in at least SLASH_INACTIVITY_CONSECUTIVE_EPOCH_THRESHOLD consecutive epochs, it will consider that validator to be inactive.

If SLASH_INACTIVITY_TARGET_PERCENTAGE is set to 80%, then validators must miss at least 80% of the attestations / proposals in an epoch to be considered inactive for the epoch. At Ignition, it is expected that slot times are 72 seconds, and so missing an epoch means a sequencer is offline for at least 30 mins (~25/32 slots).

Since SLASH_INACTIVITY_CONSECUTIVE_EPOCH_THRESHOLD=2 this must happen twice in a row for a validator to be eligible for slashing from the perspective of other nodes. To be slashed for inactivity may require long downtimes since it is unlikely that a validator is picked into the committee in consecutive epochs.

For example, if a validator is picked for the committee in epochs N and N+5 (and never in between) and they were found to be inactive in both epochs, it is highly likely that the validator was offline for the entire 6 epochs, or ~4 hours. This would lead other validators to vote to slash the validator in question for 1%, since SLASH_INACTIVITY_PENALTY is 1% (more accurately 2,000e18, or 1% of the proposed activation threshold of 200,000e18).
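The durations quoted in this clarification come from simple slot arithmetic, using 72-second slots and 32 slots per epoch (128 slots per round divided by 4 epochs). The helper name below is made up for the example.

```python
SLOT_SECONDS = 72
SLOTS_PER_EPOCH = 32


def minutes_for_slots(slots: int) -> float:
    """Wall-clock minutes covered by a number of L2 slots."""
    return slots * SLOT_SECONDS / 60


epoch_minutes = minutes_for_slots(SLOTS_PER_EPOCH)
missed_minutes = minutes_for_slots(25)  # roughly 80% of a 32-slot epoch
six_epochs_hours = minutes_for_slots(6 * SLOTS_PER_EPOCH) / 60

print(epoch_minutes, missed_minutes, six_epochs_hours)
```

So one epoch is 38.4 minutes, 25 missed slots is 30 minutes, and epochs N through N+5 span about 3.84 hours, which the post rounds to 4.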

nicely done, this looks way off the devcentric mode lolz .. but is all good

Vote-to-slash is fundamentally broken and should be removed entirely.

This is provable, therefore no vote is necessary (like Ethereum, leak the balance and remove via proof when insufficient).

Because blocks are attested as being available and provable by the committee when posted, these slashings should be automatic.

Provable.

Instead, have slashed funds locked up temporarily such that the committee can undo it.