I’ve updated the Aztec3 book with the latest message specification of L2<->L1 messaging
https://hackmd.io/iV7gbg4-QiKalhwg05exbg
Feedback would be appreciated. I’ve attempted to encapsulate the ideas we discussed on our earlier calls.
Hard-coding the hackmd’s contents here, in case the content ever changes or the link ever breaks:
:::info
PROPOSAL
Author: Zac
:::
N.B. many ideas here drawn from Mike’s writeup from April (https://github.com/AztecProtocol/aztec2-internal/blob/3.0/markdown/specs/aztec3/src/architecture/contracts/l1-calls.md)
This doc is split into 3 parts
Part One describes the design goals of this spec, the restrictions we’re working under and presents a high-level design with rationalisations provided for the design choices made (e.g. “what” and “why”, but not “how”)
Part Two lists worked examples of implementing some Aztec Connect functionality
Part Three describes a more detailed technical specification with no rationalisations (e.g. “how”, but not “what” or “why”)
What is the minimal-complexity mechanism to implement L1<>L2 comms?
All communication that crosses the L2<>L1 boundary is enabled via message passing.
Messages can be used to compose complex sequences of function calls in a failure-tolerant manner.
The following describes all possible interaction primitives between public/private functions, L1 contracts and L2 databases.
N.B. a unilateral function call is one with no return parameters. i.e. a function can make a unilateral call but cannot perform work on the results of the call.
What are the fundamental restrictions we have to work with?
Communication across domain boundaries is asynchronous with long latency times between triggering a call and acting on the response (e.g. up to 10 minutes, possibly much more depending on design decisions).
This doc follows the following heuristic/assumption:
L1 contracts, private functions and public functions are separated domains: all communication is unilateral
This doc defines no logic at the protocol level that requires callbacks or responses as a result of a message being sent across the public/private/L1 boundary.
These abstractions can be built at a higher level by programming them as contract logic.
The following image isolates the primitives from fig.1 to enable L2<>L1 communication.
All write operations take at least 1 block to process
e.g. if an L2 function triggers an L1 function and that L1 function writes a message, it cannot be read by an L2 function until the subsequent block.
We introduce a “message box” database abstraction:
The goal is for L2->L1 messages and L1->L2 messages to be treated symmetrically.
The L1 messageBox is represented via a Solidity mapping in the rollup’s contract storage.
The L2 message box is represented via an append-only Merkle tree + nullifier tree.
However, the interface for both message boxes is similar: the set of actions one can perform on a message is identical for both message boxes.
For both L1/L2 messages, a message can either be validated or consumed.
A validate operation asserts that a message exists.
A consume operation will assert that the message exists. The message is then deleted.
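As a minimal sketch (illustrative names only, not fixed by this spec), the shared interface might look like this on the L1 side:

```solidity
// Illustrative sketch of the shared message box interface.
interface IMessageBox {
    // Revert if `message` does not exist for the calling contract.
    function assertMessageExists(bytes calldata message) external view;

    // Assert that `message` exists, then delete it.
    function consumeMessage(bytes calldata message) external;
}
```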
A message is a tuple of the following:
destinationAddress
messageData
For L2->L1 messages, `destinationAddress` is the address of the L1 portal contract that is linked to the L2 contract that created the message. The `destinationAddress` is defined by the Kernel circuit, not the function circuit (i.e. an L2 contract can only send messages to its linked portal contract).
For L1->L2 messages, `destinationAddress` is the address of the L2 contract that is linked to the L1 portal contract that created the message. The `destinationAddress` is defined by the rollup smart contract (i.e. an L1 portal contract can only send messages to its linked L2 contract).
The contents of `messageData` are undefined at the protocol level. They are constrained to a size of `NUM_BYTES_PER_LEAF`; more data requires more messages.
The intended behaviour is for messages to represent instructions to execute L2/L1 functions.
This can be achieved by formatting the message payload to contain a hash of the following:
- `senderAddress`

The `senderAddress` is used if the function call must come from a designated address. This is useful if a transaction writes multiple messages into the message box, where the associated functions must be executed in a specific order.
Error handling is delegated to the contract developer.
If a message triggers a function that has a failure case, this can be supported in one of 2 ways:
Notes:
When calling `consumeMessage`, the portal contract derives the message data. For example, the typical pattern could produce a message which is a SHA256 hash of:
1. `SHA256(calldata)`
2. the address of the entity calling the portal contract (if required)
In the above example, some messages do not specify a “from” parameter. These messages are linked to functions that can be called by any entity (e.g. the `swap` function could be designed to be called by a bot, paying the bot some Eth to incentivize it to generate the transaction).
Notes:
If the tx fails in an unintended way (e.g. out of gas), the L1 tx will be reverted and no messages are consumed, i.e. the tx can be attempted again.
Only the UniPortal contract can trigger DaiPortal “deposit”, because the message specifies UniPortal as the depositor. This enables tx composability. (See the portal sketch below.)
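To make the `consumeMessage` pattern above concrete, here is a hedged sketch of a portal re-deriving the expected message from its own calldata; the contract, interfaces and function names are hypothetical:

```solidity
interface IMessageBox {
    function consumeMessage(bytes calldata message) external;
}

interface IERC20 {
    function transfer(address to, uint256 amount) external returns (bool);
}

// Hypothetical portal: re-derive the expected message from calldata, then consume it.
contract DaiPortal {
    IMessageBox public rollup; // the rollup contract's L1 message box
    IERC20 public dai;

    function withdraw(address to, uint256 amount) external {
        // message = SHA256(SHA256(calldata), callerAddress), per the pattern above
        bytes32 content = sha256(abi.encodePacked(sha256(abi.encode(to, amount)), msg.sender));
        rollup.consumeMessage(abi.encodePacked(content)); // reverts if no such message exists
        dai.transfer(to, amount);
    }
}
```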
Added into the append-only data tree. A message leaf is a hash of the following:

| name | type | description |
| --- | --- | --- |
| `contractAddress` | address | L2 address of the contract the Portal is linked to |
| `messageHash` | field | SHA256 hash of a byte buffer of size `NUM_BYTES_PER_LEAF` |

`messageHash = SHA256(messageData)`. The hash is performed by the L1 contract. `messageData` is a buffer of size `NUM_BYTES_PER_LEAF`.
The message leaf spec does not require that messages are unique. Enforcing uniqueness is left to the portal contract, if it desires this property (e.g. the portal contract can track a nonce).
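For illustration only, a sketch of the L1-side hash computation; the constant’s value and the contract wrapper are assumptions:

```solidity
// Sketch: the L1 contract hashes the raw message buffer into the leaf's messageHash.
contract PortalHashSketch {
    uint256 internal constant NUM_BYTES_PER_LEAF = 64; // placeholder value

    function messageHashFor(bytes memory messageData) internal pure returns (bytes32) {
        require(messageData.length <= NUM_BYTES_PER_LEAF, "message too large");
        // uniqueness, if desired, comes from the portal including a nonce inside messageData
        return sha256(messageData);
    }
}
```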
A dynamic array with max size `MAX_L1_CALLSTACK_DEPTH`. Each call item contains:

| name | type | description |
| --- | --- | --- |
| `portalAddress` | u32 | used to define the message target |
| `chainId` | u32 | (needed if we want to go multichain) |
| `message` | sharedBuffer | message to be recorded |

The public inputs of a user proof will contain a dynamic array of messages to be added, of size `MAX_MESSAGESTACK_DEPTH`.
The `portalAddress` parameter is supplied by the Kernel circuit and is stored in the circuit verification key. When processing a transaction, the Kernel circuit attaches the message target to each message (the `portalAddress`). Nullifier logic is identical to handling regular state nullifiers.
Define the following storage vars:
- `pendingMessageQueue`: dynamic array of messages (FIFO queue)
- `messageQueue`: dynamic array of messages (FIFO queue)

`addMessage(bytes memory message)` (function has no re-entrancy guard):
- Validate that `msg.sender` is a portal contract
- Look up the `portalAddress` that maps to `msg.sender`
- Push `(message, portalAddress)` into the FIFO `pendingMessageQueue`

`processRollup` (sketched below):
- Insert up to `MAX_MESSAGES_PROCESSED_PER_ROLLUP` messages from `messageQueue` into the data tree
- Remove the processed messages from `messageQueue`
- Push `pendingMessageQueue` onto `messageQueue`
- Clear `pendingMessageQueue`
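A minimal Solidity sketch of this two-queue flow; the portal registry and the bound’s value are assumptions:

```solidity
// Sketch only: two FIFO queues; pending messages are promoted each rollup.
contract RollupMessageQueues {
    struct QueuedMessage { address portalAddress; bytes message; }

    mapping(address => bool) public isPortal;  // assumed portal registry
    QueuedMessage[] private pending;           // pendingMessageQueue
    QueuedMessage[] private active;            // messageQueue
    uint256 private activeHead;                // first unprocessed active message

    uint256 private constant MAX_MESSAGES_PROCESSED_PER_ROLLUP = 100; // placeholder

    function addMessage(bytes memory message) external {
        require(isPortal[msg.sender], "not a portal");
        pending.push(QueuedMessage(msg.sender, message));
    }

    function processRollup() external {
        // 1. Consume up to MAX_MESSAGES_PROCESSED_PER_ROLLUP messages from messageQueue.
        uint256 end = activeHead + MAX_MESSAGES_PROCESSED_PER_ROLLUP;
        if (end > active.length) end = active.length;
        for (uint256 i = activeHead; i < end; i++) {
            // active[i] is inserted into the L2 data tree via the rollup circuit
        }
        activeHead = end;
        // 2. Promote all pending messages onto messageQueue (FIFO order preserved).
        for (uint256 i = 0; i < pending.length; i++) {
            active.push(pending[i]);
        }
        delete pending;
    }
}
```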
Iterate over the `messageStack` provided by the rollup public inputs. Use `mapping(address => bytes) messageBox` to log messages. For each entry, `messageBox[entry.portalAddress] = entry.message` (TODO: handle duplicate messages).

`function consumeMessage(bytes message) public`: if `messageBox[msg.sender]` contains `message`, delete the message from `messageBox`; otherwise throw an error.

`function assertMessageExists(bytes message) public`: if `messageBox[msg.sender]` does not contain `message`, throw an error. (A sketch of these functions follows.)
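A hedged sketch of these functions; keying the mapping by message hash per portal is an assumption here, since a raw `bytes` value cannot index a Solidity mapping directly:

```solidity
// Sketch: per-portal message box, keyed by message hash (encoding is an assumption).
contract L1MessageBox {
    mapping(address => mapping(bytes32 => bool)) private messageBox;

    function logMessage(address portalAddress, bytes memory message) internal {
        messageBox[portalAddress][keccak256(message)] = true; // TODO: handle duplicates
    }

    function consumeMessage(bytes memory message) public {
        bytes32 key = keccak256(message);
        require(messageBox[msg.sender][key], "no such message");
        delete messageBox[msg.sender][key];
    }

    function assertMessageExists(bytes memory message) public view {
        require(messageBox[msg.sender][keccak256(message)], "no such message");
    }
}
```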
Rollup contract actions:
- Concatenate all kernel circuit L1 message stacks into a monolithic L1 `messageStack`.
- The monolithic `messageStack` has max size `MAX_ROLLUP_L1_MESSAGESTACK_DEPTH`.
- The sum of all monolithic callstack calldata is `MAX_ROLLUP_L1_MESSAGESTACK_BYTES`.

The contents of `messageStack` are assigned as public inputs of the rollup circuit.

For each message in L1’s `messageQueue` array, perform the following:
- Read the `contract` leaf from the contract tree
- Validate `contract.portalId == message.portalId`
- Compute the message leaf `H(messageSeparator, contract.contractAddress, message.messageHash)` and insert it into the message tree
- Extract `message.to`, `message.value`. If nonzero, credit `balances[to] += value`

Output `SHA256(messageQueue)` to attest to the messages added into the message tree.
Firstly, these diagrams are sensational.
Q1:
The diagram suggests a Public L2 Function cannot ‘write’ to the ‘L2 Message Box’. It would be useful for a Public L2 Function to be able to ‘call’ a Private L2 Function via some L2 message box. Further, it would be useful if that message box could be written to immediately, so that a tx within the same rollup (with an incremented user nonce to ensure the correct tx ordering) can execute the subsequent private tx which reads the message.
Q2:
When an L1 function makes an L1->L2 call, it’ll write to the L2 message box. This requires ‘work’ from the Sequencer of the next rollup (to actually write the message), so the Sequencer will need to be paid for this. How will the Sequencer be paid?
Suggestion: at the time of making the L1->L2 call, leave some L1 tokens in escrow for the Sequencer to collect if they successfully add the message. However, this might require an L1->L2 message tuple to include additional data field(s) which convey the fee being paid via L1.
Q3:
Related to Q2, when an L1->L2 call is made, and a message request is made (to ‘write’ a message to the L2 Message Box), will the Sequencer be forced to add the message in the next rollup, OR will the Sequencer have a choice over which messages to add, based on the fees being paid to them?
Q4:
Does the ordering of how messages are added to the L2 Message Box need to exactly match the ordering in which the original L1->L2 ‘message requests’ took place (FIFO), or will economics dictate the ordering? (And the same question for the other direction; L2->L1)
Q4.5:
Related to Q3&4, possible DoS attack. In a FIFO model, the L1->L2 message box only has so much capacity each rollup, because the Sequencer must execute a circuit which can only add so many new leaves to the message tree. DoS attack: if it’s cheap enough to send messages to the L1->L2 message box, someone could spam it, and no one else would be able to get their messages added. Basically, message box space is scarce, so it might need to be bid for.
Q5 (not really a question)
Paying for L2->L1 messages. Clearly, an L2 tx which writes lots of L1 messages should pay more in L1 storage costs than a tx which doesn’t write any L1 messages. The cost of writing messages is denominated in ETH. So we still have the problem that when a user is estimating the fee they’ll need to pay to the Sequencer, the fee will need to consider variable ETH costs. No question really, just something to be aware of when estimating gas. We’ll already have the problem of needing to consider variable ETH costs for an L2->L1 call, since the number of nonzero commitments/nullifiers in ETH calldata will vary by user, and the ETH calldata costs of contract deployment will vary by contract.
In Part 3 (Technical Specification) it’s slightly unclear which subsections relate to L1->L2 calls, and which relate to L2->L1 calls. Might it be possible to clarify that in the document? Specifically, for any variable name which contains the word “message”, it might be clearer if it specifies whether it’s an L1->L2 message or an L2->L1 message. Lasse suggested the terms Inbox and Outbox. Although I forget in which direction those definitions should be applied!!
Edit: Inbox = into L2. Outbox = out of L2.
Thanks!
Q1:
A public<->private message box can be emulated by writing/reading notes into the append-only Merkle tree. I think it’s best to leave that as a language-level abstraction, however, as L2<->L1 message processing requires the kernel circuit to perform specialized logic that is not required for public<->private messaging (i.e. it could be confusing to conflate the two).
I don’t think it is possible for a public function to write a message that a private function can act on in the same block. Let’s say tx `A.pub` writes a message that `B.priv` wants to read.
The sequencer is performing the Merkle tree insertions required to write `A.pub`’s message. This occurs after the private kernel proof for `B` has been computed (both `A` and `B` must be in the transaction pool for both txns to be included in a block).
In order for `B.priv` to read the message that `A.pub` wrote, the function must perform a Merkle membership proof of the message’s existence, which the prover cannot do, as they do not know where in the tree the message will be written.
Q2:
Yes, the transaction that instructs the Rollup contract to write an L1->L2 message will need to pay a fee.
The fee will be a fixed amount of gas, so we can compute this deterministically, e.g. `msg.gasPrice * GAS_COST_TO_WRITE_MESSAGE`. I think it’s ok to use the message sender’s `msg.gasPrice`. There will be inevitable gas price changes between the time the user’s L1->L2 message txn is processed and the sequencer sends the rollup proof. However, this can be arbitraged away via the sequencer token model (e.g. the amount a sequencer will bid to send a transaction will be lower if they are taking a small loss processing messages due to gas price changes).
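A hedged sketch of charging that deterministic fee at message-write time; `tx.gasprice` is Solidity’s name for the sender’s gas price, and the constant’s value and the payment flow are assumptions:

```solidity
// Sketch: a deterministic, gas-denominated fee charged when the message is written.
contract MessageFeeSketch {
    uint256 internal constant GAS_COST_TO_WRITE_MESSAGE = 20_000; // placeholder value

    function addMessage(bytes memory message) external payable {
        uint256 fee = tx.gasprice * GAS_COST_TO_WRITE_MESSAGE;
        require(msg.value >= fee, "insufficient message fee");
        // the fee accrues to the rollup contract and is later credited to the
        // sequencer that inserts this message into the L2 message tree
    }
}
```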
I don’t think we need to modify the message tuple to include additional data fields; the fee logic can be handled by the rollup contract and specced out as part of the fee model
Q3:
My initial thought was that the sequencer is forced to write a fixed number of messages in the next rollup (e.g. 100; if there are fewer than 100, they only write what is available. We need to define an upper bound, as the rollup circuit logic is deterministic and cannot handle arbitrary numbers of messages).
We could include logic that skips over messages if the fee paid does not cover the rollup provider’s costs, but I’m hoping that isn’t necessary. Here’s the rationale:
Q4:
I think so. Ordering is important as executing a message may be conditional on executing older messages
Q4.5:
If message writers are paying the fair market value for writing messages, I think we can eliminate DDOS possibilities by having a very large upper bound on the number of messages processed by the rollup.
e.g. let’s assume we’re experiencing this DDOS. What’s the max number of messages a sequencer can process? We can assume all other rollup costs are minimal as the sequencer can create a rollup block with 0 transactions and only process messages.
This implies the max number of messages a rollup block can process is linked to the block gas limit.
Which means that a DDOS attacker would need to fill an entire Ethereum block with message write operations in order to delay the Aztec Network by 1 block (assuming our message fee model requires the message writer to pay the fair market price of the gas cost for writing a message).
i.e. the cost to DDOS Aztec this way is approximately the same as the cost to DDOS Ethereum
If our maximum block production time is fixed as part of our sequencer selection protocol, this will have an effect. E.g. if the max Aztec block production rate is 1 per 100 seconds then the cost to DDOS Aztec would be ~0.1x the cost to DDOS Ethereum. That still seems very reasonable imo.
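For illustration, with assumed numbers (none of these are fixed by the spec): if one message write costs ~20k gas, a 30M-gas Ethereum block fits ~1,500 message writes, so one Aztec block can drain ~1,500 messages. With 12-second Ethereum blocks and one Aztec block per 100 seconds, ~8 Ethereum blocks elapse per Aztec block, so an attacker only needs to fill ~1/8 of each Ethereum block with message writes to keep the queue growing; hence the ~0.1x figure above.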
EDIT: On further thoughts I think this kind of attack means that we may need to have a way of skipping over messages that do not pay a sufficient fee.
For example, if a DDOS attacker uses gas price volatility to underpay by 10% per message and fills a block with messages, the sequencer pays a significant cost in order to process those messages.
However if messages can be skipped over due to low fees, this means that:
The latter point adds significant complexity to the protocol.
I’m curious about how other L2s handle this?
Perhaps the simplest solution is to still force the sequencer to process all messages, but we require the message writer to overpay by a substantial amount (e.g. `msg.gasPrice * 1.2 * GAS_COST_TO_WRITE_MESSAGE`) in order to protect against this?
EDIT2:
As part of this spec, I think we need to decide whether message processing is required by the sequencer, or whether there are any conditions under which a message can be skipped. If we decide on forced messaging, we can define how to appropriately meter fees in the fee specification and (more importantly) black-box fee payment while we develop this part of the protocol, so that it’s not tightly coupled to messaging (i.e. we don’t add fee logic into message tuples; that information is passed in a separate parameter and processed independently).
Q5:
I think we can roll this into our fee model discussion. Every L2 tx will have an L2 fee component and an L1 fee component.
If we want users to pay fees in any token then the sequencer is required to take on FX risks and validate whether the tx is worth including in the block.
We will have to meter L2 and L1 costs separately for an L2 transaction, i.e. the user submits an L1 `gasPrice` and an L2 `gasPrice`. The `gasPrice` values are denominated in a token of the tx sender’s choosing, and the sequencer needs to validate whether this price is sufficient, i.e. regular fee market economics ought to apply here?
Re the Part 3 tech spec, will update.
Q1:
Lovely!
We could add some ‘chained transactions’ logic to allow commitments to be consumed within the same rollup. Chained transactions are tricky in a 2x2 rollup topology, because the number of comparisons needed in each layer of the ‘rollup tree’ doubles as you go towards the ‘root’ - so we’d need a circuit per level of the rollup tree (or a big circuit with lots of unused comparisons in the lower levels of the tree). Not terrible, definitely doable.
chained txs could be possible, but it adds protocol complexity to enable.
IMO any feature which increases protocol complexity in exchange for nice-to-haves isn’t something we can afford to do for V1.0
Q: does not having chained transactions make some potential A3 applications impossible to build? If the answer to this is “no” I don’t think we should add them.
> Q: does not having chained transactions make some potential A3 applications impossible to build? If the answer to this is “no” I don’t think we should add them.
No, they just reduce latency.
Summarising a chat I had with Mike re: messages
It is important that we ensure that L1->L2 messages must be eventually processed by the sequencer. If this is not the case, stuck transactions become a possibility (e.g. the UniPortal example in the HackMD. If the “write” function call does not result in an L2 message being written, it is unclear how to “unstick” that tx).
Having a FIFO queue for L1->L2 messages is simplest from a protocol design.
Requiring a fee to be paid as part of the L1->L2 message write function is also simplest from a protocol design. i.e. we do not have a fee market for L1->L2 messages.
Questions that we need to resolve for the above to work:
We need to ensure that the message queue does not grow faster than the protocol’s ability to process L1->L2 message writes.
We want to design a system whereby the only way for the message queue to grow over time is if a significant (e.g. 10-20%) portion of Ethereum’s block space is occupied by transactions that write L1->L2 messages.
If the above holds, then we get an automatic fee market by requiring L1->L2 writes to pay the Aztec rollup `msg.gasPrice * GAS_COST_TO_WRITE_MESSAGE`.
i.e. the only way to grow the message queue over time is to occupy so much Ethereum block space with message write calls that the Ethereum fee market is affected by the demand to write messages.
Followup thoughts on this:
If the cost to process L1->L2 messages is fixed, effectively L2 txns are subsidizing L1->L2 message calls.
e.g. for an Aztec rollup tx, if 50% of the L1 block gas limit is consumed processing L1->L2 message writes (i.e. popping the message off the L1 queue, validating it’s been added to the L2 tree, deleting the message from the L1 queue), this reduces the amount of L2 block space in the A3 block.
This makes A3 L2 block space more competitive, as the supply of L2 txn slots has been reduced, increasing L2 fees.
Is this ok? It feels problematic, but so does introducing a fee market for processing L1->L2 message writes.
This post proposes to move closer to the Arbitrum model and have a couple of changes to the existing spec:
The proposed spec from the hackmd requires that L1 and L2 contracts are linked in pairs to enforce access control on who can send messages to whom. E.g., the target for a message passed by DaiContract on L2 would always be its L1 counterpart, putting the burden of matching sender/recipient on the kernel circuit. Matching this is cumbersome and possibly expensive, as it relies on L1 interactions.
As I see it, the use of the link is to enforce access control, which I don’t think should be put on the base layer but be implemented on the contract level instead. My suggestion is that we remove the idea of “portal contracts”, and instead support generic messaging similar to Arbitrum.
[!info]
To keep usage simple and somewhat independent of the direction, I suggest we consider having a “special” base layer public contract with a public function `sendMessageToL1()`, which is used for sending messages to L1 from inside the rollup. From the developer’s point of view, the ergonomics of sending “cross-chain” messages would be easier to understand.
The generic messages should contain:
- `from` - The sender of the message. For L1 → L2 interaction, this would be the Ethereum address of the account that called `sendMessageToL2()`. For L2 → L1, the AztecAddress of the account calling `sendMessageToL1()`.
- `target` - The address of the recipient of the message. As before, it will be either an Ethereum address or an Aztec address, depending on the destination chain.
- `caller` - The address of a restricted caller, or address 0 if allowing anyone to consume.
- `data` - The message contents, the data to be transmitted. For L2 → L1 calls, this could be the calldata that the target is called with.

When calling the `sendMessageToX()` functions, the message should be emitted as an event on the “local domain”, and a hash (tbd which one) is added at the next index of the `messageQueue`. The rollup should be forced to add those cross-domain commit messages to work around censorship cases. Supporting a constant number of insertions makes it fit nicely into an implementation where we include a Merkle root for a small `rollup-message-tree` and pass the root along instead. For L2 → L1 calls, this should also make the cost for the rollup contract constant, as it would insert a root independent of the number of messages. For L1 → L2 calls, the rollup contract would need to perform reads, or have a preparation step where a hash is computed and passed into the rollup instead.

Depending on the hash function, the Merkle inclusion and message commits might be expensive to perform on L1, but we might find a decent tradeoff that doesn’t make it explode, or accept that L2 → L1 calls are expensive. An example of a contract that would be expected to perform “many” L2 → L1 calls is a token bridge. While a naive token-bridge implementation might perform an individual L2 → L1 call for every exit, it should be possible to make a “batch” on the L2 contract and then fire that once in a while to perform a “mass-exit” with just one L2 → L1 message, so this seems like something that can be worked around at the contract level.
To consume and execute a message, an actor can go to the Message box (outbox) and execute the `consume()` function; here she needs to pass the message, the index in the queue, and a Merkle inclusion proof for the message for the given rollup block. With the fixed-size trees, we can compute the specific rollup id just from the index of the message.
By requiring the `consume()` call to enforce the `caller`, we support similar functionality to having the L1 ↔ L2 contract link and can do the same bundling as the prior design.
[!info]
Using `address` for Aztec addresses in the snippet below.
```solidity
contract Outbox {
    // address(MAX) between transactions
    address public l2Sender = address(type(uint160).max);

    // Inspired by `Outbox::executeTransaction()` from Arbitrum Nitro
    function consume(Message calldata _msg, uint256 _id, MerkleProof calldata _proof) public {
        // Check that the caller is allowed to consume this message
        require(_msg.caller == address(0) || _msg.caller == msg.sender);
        require(!isConsumed(_id));
        // Checks that _msg is really in there
        _validate(_msg, _id, _proof);
        // Set the _id as consumed
        _consume(_id);
        address prevL2Sender = l2Sender;
        l2Sender = _msg.from;
        // Executes the message, (success, rd) = _msg.target.call(_msg.data);
        _execute(_msg);
        l2Sender = prevL2Sender;
    }
}
```
Making an L1 contract that is controlled from L2 is pretty simple, and it behaves quite close to standard access control.
```solidity
contract L1ContractControlledByL2 {
    Outbox public outbox;
    mapping(address => bool) public controllers;

    modifier onlyController() {
        require(msg.sender == address(outbox), 'Not called from outbox');
        require(controllers[outbox.l2Sender()], 'Not called by controller');
        _;
    }

    function sendMoney(...) public onlyController() {
        ...
    }
}
```
In your `consume` example, is that an L1 contract? Is the MerkleProof validated on L1 or L2?
I’d also like to validate whether this proposal is a simplification. I don’t think the portal contract concept is particularly difficult to use, as it is merely an L1 extension of an existing L2 contract. It allows an Aztec contract to define its interface across both L1 and L2, which I think has significant utility.
On L1. The Merkle proof part was mainly there in case we wish to limit the gas impact that the L2 → L1 messages have on the rollup execution and push the “bill” onto later execution. If we don’t wish to limit it this way, we could insert the message hash in a mapping directly, e.g., `mapping(id => hash)`, and throw away the Merkle part. This just puts the gas burden on the rollup execution to insert messages into the mapping.
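A rough sketch of that mapping-based alternative (names and message encoding assumed):

```solidity
// Sketch: outbox without Merkle proofs; the rollup pays gas to insert each hash.
contract MappingOutbox {
    mapping(uint256 => bytes32) public messages; // id => message hash
    uint256 public nextId;

    // Called by the rollup contract during block processing (access control omitted).
    function insert(bytes32 messageHash) external {
        messages[nextId++] = messageHash;
    }

    function consume(uint256 _id, bytes calldata _msg) external {
        require(messages[_id] == keccak256(_msg), "unknown message");
        delete messages[_id];
        // executing the message would proceed as in the Merkle-proof variant
    }
}
```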
Ok. I think Merkle proofs are a non-starter due to their extreme cost. It’s ok for transactions to incur fixed L1 costs that are paid for in the L2 tx.
Will try to summarise my other thoughts here:
Even with this change, a large subset of contracts will need portal contracts: any contract that is a custodian of tokens for example. This means we will still have to provide tooling that deploys and links portal contracts. So…does this spec change even solve the problem it’s trying to solve?
I think that, in a vacuum, a push model is fundamentally worse than a pull model, as L1 transactions no longer follow traditional semantics. An L1 tx that consumes a sequence of messages also seems very unpleasant, as now there is this `id` field that must be propagated. I don’t understand how this does not add extreme complexity. Can you elucidate via a flow-chart how an Aztec Connect style defi interaction would work under this model, in the same manner as in the HackMD doc? (including the shielded token contracts)
The complexity of writing L2 contracts increases as the contract author can no longer trust messages are coming from honest parties (i.e. the portal contract). This requires extra validation logic in any L2 function that consumes L1 messages.
I need to think further on this when I can find the time, but my initial suspicion is that this change could create more problems than it solves.