[Proposal] Spending notes which haven't-yet been inserted

Thanks to Joe for coming up with this concept. Related doc from Joe: https://hackmd.io/c1L9SP7ORgyknjHBaKTy2A

Summary

We need a way of reading a note from the append-only tree before it has been inserted into the tree by the sequencer.

This discussion focuses on being able to read such ‘pending’ notes in subsequent iterations of the kernel circuit. I.e. if a later nested function call wishes to read a note which was created as a new_commitment in a previous function call (in some previous iteration of the kernel circuit).

This will allow use cases where:

  • A private function calls itself recursively
  • Contract A calls contract B, which calls contract A

Important Note

There’s a related post here on whether (and how) to remove merkle membership checks from an app circuit.

If we adopt the suggestion in that post (of deferring ‘reads’ to the kernel circuit), it makes the interfaces much cleaner when trying to also solve this problem.

And so… in everything that follows, I’m assuming we’ll perform all merkle membership checks in the kernel circuit, by passing read_request data to the kernel circuit from the app circuit.

Updating the PublicInputs struct for an app’s private circuit:

The newly required field is shown with // NEW

type PrivateCircuitPublicInputs = {
    call_context: CallContext,
    
    args: Array<fr, ARGS_LENGTH>,
    return_values: Array<fr, RETURN_VALUES_LENGTH>,
    
    emitted_events: Array<fr, EMITTED_EVENTS_LENGTH>,
    
// NEW
    read_requests: Array<fr, READ_REQUESTS_LENGTH>,

    new_commitments: Array<fr, NEW_COMMITMENTS_LENGTH>,
    new_nullifiers: Array<fr, NEW_NULLIFIERS_LENGTH>,
    
    private_call_stack: Array<fr, PRIVATE_CALL_STACK_LENGTH>,
    public_call_stack: Array<fr, PUBLIC_CALL_STACK_LENGTH>,
    l1_msg_stack: Array<fr, L1_MSG_STACK_LENGTH>,
    
    historic_private_data_tree_root: fr,
    
    contract_deployment_data: ContractDeploymentData,
}

We add a read_requests field: an array of the commitments which have been opened (read) within the app circuit. Noir would push each opened commitment to this array.
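As a rough sketch of what pushing to this array might look like (the names `AppPublicInputs` and `pushReadRequest` are illustrative, not the real Noir/circuit API): whenever a note is opened, its commitment is also pushed onto the fixed-length read_requests array.

```typescript
// Illustrative sketch only: names and shapes are assumptions, not the real API.
const READ_REQUESTS_LENGTH = 4;

type Fr = bigint;

interface AppPublicInputs {
  // Padded with 0n up to READ_REQUESTS_LENGTH when the circuit finalizes.
  readRequests: Fr[];
}

// Called each time the app circuit opens (reads) a commitment.
function pushReadRequest(publicInputs: AppPublicInputs, commitment: Fr): void {
  if (publicInputs.readRequests.length >= READ_REQUESTS_LENGTH) {
    throw new Error('read_requests array is full');
  }
  publicInputs.readRequests.push(commitment);
}
```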

Updating the Private Kernel’s Inputs and PublicInputs structs:

Note: the PrivateKernelPublicInputs don’t need to be modified, because they already have a constant field for the old private data tree root.

The PrivateKernelInputs has a new field called read_request_membership_witnesses: an array of membership witnesses (a leaf_index, sibling_path tuple) – one for each read_request in the app circuit’s public inputs.

type PrivateKernelInputs = {
    signed_tx_request: SignedTxRequest;

    previous_kernel: PreviousKernelData,

    private_call: PrivateCallData, // CHANGE 
}

type PreviousKernelData = {
    public_inputs: PrivateKernelPublicInputs,
    proof: Proof,
    vk: VK,
    vk_index: uint32,
    vk_sibling_path: Array<fr, KERNEL_VK_TREE_HEIGHT>,
}

type PrivateCallData = {
    call_stack_item: PrivateCallStackItem, // CHANGE (see app circuit PublicInputs above)
    
    private_call_stack_preimages: Array<PrivateCallStackItem, PRIVATE_CALL_STACK_LENGTH>,
    
    proof: Proof,
    vk: VK,
   
    function_leaf_membership_witness: MembershipWitness<FUNCTION_TREE_HEIGHT>,
    contract_leaf_membership_witness: MembershipWitness<CONTRACT_TREE_HEIGHT>,

// NEW
    read_request_membership_witnesses: Array<MembershipWitness<PRIVATE_DATA_TREE_HEIGHT>, READ_REQUESTS_LENGTH>,
    
    portal_contract_address: eth_address,
}

Changes to Noir circuit logic

The get() method will no longer try to do a merkle membership check. It will also no longer populate the historic_private_data_tree_root (because it’s not doing a membership check anymore!).

SPIKE: do we even need this historic_private_data_tree_root field anymore in the app circuit? Are there any use cases which need to read from the tree, without passing a read request to the kernel circuit?

Kernel circuit logic

The Kernel circuit will need to loop through each read_request, and use the corresponding read_request_membership_witness to compute the root of the private data tree. This root must be the same for every iteration of this loop. The root can then be set equal to the private_kernel_public_inputs.constant_data.old_tree_roots.private_data_tree_root.
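The kernel-side loop described above might look like the following sketch, with a toy hash standing in for the circuit's real Pedersen hash, and illustrative function names (only the `(leaf_index, sibling_path)` witness shape comes from the post):

```typescript
type Fr = bigint;

interface MembershipWitness {
  leafIndex: number;
  siblingPath: Fr[];
}

// Stand-in for the circuit's Pedersen hash -- illustrative only.
const hash = (l: Fr, r: Fr): Fr => (l * 31n + r * 7n + 1n) % 2147483647n;

// Walk from the leaf up to the root using the membership witness.
function computeRoot(leaf: Fr, witness: MembershipWitness): Fr {
  let node = leaf;
  let index = witness.leafIndex;
  for (const sibling of witness.siblingPath) {
    node = index % 2 === 0 ? hash(node, sibling) : hash(sibling, node);
    index = Math.floor(index / 2);
  }
  return node;
}

// Every read_request must resolve to the same private data tree root;
// that root becomes constant_data.old_tree_roots.private_data_tree_root.
function checkReadRequests(readRequests: Fr[], witnesses: MembershipWitness[]): Fr {
  if (readRequests.length === 0) throw new Error('no read requests');
  const root = computeRoot(readRequests[0], witnesses[0]);
  for (let i = 1; i < readRequests.length; i++) {
    if (computeRoot(readRequests[i], witnesses[i]) !== root) {
      throw new Error('read requests use different roots');
    }
  }
  return root;
}
```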

The crux of this entire post is that once we have all of the above in place, the kernel circuit can perform a conditional check:

  • If the read_request_membership_witness is identified as nonsense (we could, for example, identify an all-zero sibling path, or a leaf index of -1), we know this is an “optimistic” read request:
    • Loop back through all new_commitments in private_kernel_inputs.previous_kernel.public_inputs.end.new_commitments.
      • If the read_request value (which is itself a commitment) matches one of those new_commitments (which are basically pending insertion), then the read_request is STILL valid.
        • So ignore the root which was calculated for that read_request.
        • Delete the new_commitment from the new_commitments array (TODO: think of how to do this efficiently).
        • Delete the nullifier which corresponds to this read_request (because we just deleted the new_commitment that this nullifier nullifies!)*

* Note: we would only know which nullifier corresponds to a particular read_request, if they occupy the same index in their respective arrays. Now, not every read will be accompanied by a nullifier (e.g. when reading a const note), so what do we do in those cases? We could have a zero nullifier entry in the nullifiers array, but that messes with our current in-circuit array push/pop logic (which views 0 as “end of array”).

Maybe (and it pains me to suggest this, because it’s ugly), we need another array (in the app circuit Public Inputs ABI) of the same length as new_nullifiers, which contains index pointers to the read_requests array. I.e. a way of saying “this nullifier entry corresponds to this read request entry”.
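The conditional check above could be sketched like this, under two simplifying assumptions taken from the post: an all-zero sibling path flags an optimistic read, and nullifiers are index-aligned with read requests (the open problem discussed just above). All names are illustrative.

```typescript
type Fr = bigint;

// Assumption from the post: an all-zero sibling path marks a dummy witness,
// i.e. an "optimistic" read of a pending commitment.
function isDummyWitness(siblingPath: Fr[]): boolean {
  return siblingPath.every((s) => s === 0n);
}

interface KernelState {
  newCommitments: Fr[];
  newNullifiers: Fr[]; // assumed index-aligned with the read_requests array
}

// Returns true if the pending read matched a new_commitment and was squashed.
function squashPendingRead(state: KernelState, readRequest: Fr, readIndex: number): boolean {
  const i = state.newCommitments.indexOf(readRequest);
  if (i === -1) return false; // no match: the optimistic read is invalid
  state.newCommitments.splice(i, 1); // delete the pending commitment
  const nullifier = state.newNullifiers[readIndex];
  if (nullifier !== undefined && nullifier !== 0n) {
    state.newNullifiers.splice(readIndex, 1); // delete its nullifier too
  }
  return true;
}
```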

Aztec RPC Client logic

The Simulator no longer needs to fetch membership witnesses for get oracle calls.

For a particular function’s public inputs, the Client now needs to read the read_requests and fetch membership witnesses from the DB. It can identify whether a read_request is “optimistic” (and hence that the membership witness can be dummy) by looping through the new_commitments of all earlier function calls in this tx.
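A sketch of that client-side decision (the function name is hypothetical): a read_request is “optimistic” if its commitment appears among the new_commitments of any earlier call in the tx, in which case a dummy witness can be supplied instead of a DB lookup.

```typescript
type Fr = bigint;

// earlierCallsNewCommitments: one array of new_commitments per earlier call
// in this tx, in call order. Illustrative sketch, not the real client API.
function isOptimisticRead(readRequest: Fr, earlierCallsNewCommitments: Fr[][]): boolean {
  return earlierCallsNewCommitments.some((commitments) => commitments.includes(readRequest));
}
```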


I think you could also have a nullifier that isn’t nullifying a read_request in case the commitment is hardcoded in the contract. For example if you make a circuit that does something if you know the preimage of a hardcoded hash (only once).

Or maybe more generally, a contract can use a nullifier to avoid doing any operation more than once, and that can be unrelated with the private data tree.

That would also require not only pushing an empty nullifier for const notes, but also pushing an empty read_request for nullifiers that aren’t related to private data, if we wanted to go the route of making those arrays parallel to link them up.


Super high level comments -
Am I right to think that one of the UX improvements here is that one could use pending nullifiers when doing a multicall that affects the same storage slot? (Like imagine approving a contract for 30 DAI, the contract does transferFrom, then the approval is set back to 0.)

A private function calls itself recursively

Does this pending nullifier add any help towards re-entrancy attacks?


Ah, yes that’s a good point. Another example is ‘singleton’ UTXOs; there’s a notion of an initialisation_nullifier in some of the docs I’ve written, which just prevents a particular note from being cheekily re-initialised.

This might be a nice generalisation, thanks.

So we can have read_requests without nullifiers. And we can have nullifiers without read_requests. But sometimes a read_request will correspond to a nullifier.

In light of that, it’s probably tidiest to introduce another array which points each read_request entry to a nullifier index.

Populating the arrays in parallel, so the indices are lined up, now feels fiddly to achieve. It would mean every compute_nullifier() call would need to push a value (possibly 0) to the read_requests array, which feels like an incorrect separation of concerns.


Yep, exactly.

This proposal basically lets you edit the same storage slot in distinct function calls, within the same tx.

Does this pending nullifier add any help towards re-entrancy attacks?

I don’t think so - I think those can be avoided by writing safe logic in the Noir contract.

This proposal suffers from the same problems as we’re seeing for nested public functions: i.e. identifying the correct ordering of state changes across function calls.

With public functions, the problem is that the kernel circuit needs more information, in order to be sure it can ‘squash’ state read and state update requests.

With private functions (as is being discussed in this thread), the kernel circuit also needs more information, in order to be sure it can ‘cancel out’ (i.e. delete) pending nullifiers and pending commitments (the act of deleting this information from the kernel’s arrays is analogous to squashing public state update requests). It’s the exact same problem. So we probably need counters in a similar way as @spalladino has proposed in the other thread.

I don’t think so - I think those can be avoided by writing safe logic in the Noir contract.

I’m not fully sure about that. When you start having multiple contracts interacting with each other and using external code, it can get more muddy. For example, it can be very useful to have a mutex (lock) to handle re-entries in the private domain when a contract interacts with untrusted contracts. The most well-known example where this is used is Uniswap. Uniswap is made to work with any token, not just tokens on a whitelist, and therefore also with untrusted code, meaning that it cannot assume that external calls behave as intended. Therefore they opt to use a lock, ensuring that it cannot re-enter and mess up accounting.

An example of cases where this has been an issue can be seen on the Gnosis chain where assets had an onTokenTransfer() function that was executed as part of the transfer (see here). For contracts with multiple token transfers in the same scope, they could be used to re-enter somewhere in the middle, and things can easily get ugly.

@Maddiaa and I discussed transient storage (storage that only lives in the tx). With transient storage that allows contracts to read other contracts’ transient storage but only write to their own, you can also use it to communicate data between contracts A and B even if they are not directly calling each other but both are called in the same transaction.

I’m not fully sure about that

A mutex would be an example of “writing safe logic in the Noir contract”, to avoid re-entrancy. So I think we’re on the same page?

transient storage

Sounds cool, but complicated :slight_smile:
Would there be bad consequences if a contract could read another contract’s storage directly?

A mutex would be an example of “writing safe logic in the Noir contract”, to avoid re-entrancy. So I think we’re on the same page?

From your earlier message, I read it like “You don’t need re-entrancy guards if you follow checks-effects-interactions”.

Mutexes get a bit weird in the private domain.
Normally, you would go from unlocked → locked → unlocked, that is 2 updates → so 2 nullifiers + 2 data entries? When these updates are practically useless for everyone but you in that exact transaction execution, that is a lot of state bloat and wasted calldata :sob:.
Also, if you are using a “global” lock, two transactions cannot update the lock in the same block, as the nullifiers would collide, so you would be back to 1 tx per block for the contract. If you make a mutex per account (say, a mapping), then you have issues with cross-contract re-entrancies, as those might change the sender; so even when re-entering, it is not spotted, because it is not “the same dude re-entering”.

Would there be bad consequences if a contract could read another contract’s storage directly?

This should already be possible; just like you have getters in Solidity etc., you do it here. We outlined some of this in our dog-fooding Noir contracts, which will be discussed in one of the coming days :eyes:

The issue here would be that we might be inserting useless data into the data tree and nullifying when the data never really needed to go on chain because it was transient. Bloating State :frowning:.

@Mike in your description of how read requests would be processed in the private kernel you say:

The root can then be set equal to the private_kernel_public_inputs.constant_data.old_tree_roots.private_data_tree_root.

What exactly do you mean here? What root is being set?

I am implementing read requests here for this issue. I need to wrap some things up and I need to put off computation of the membership witnesses until later on (in the RPC client instead of where I have it now in the simulator), but other than that I am mostly done. Then we can proceed with the pending commitments task which is the crux of this thread.

Any thoughts/updates on this TODO here?

Edit: Please keep in mind that for now I am implementing read requests with membership checks in the kernel without support for spending pending commitments.

I dug into this quite a bit and considered how this ties into whether or not the app-circuit still needs a historic private data root.

Here is a solution I came up with (implemented here - init to 0 here):

  1. The initial kernel initializes public_inputs.constants.historic_private_data_tree_root to 0.
  2. Each kernel iteration checks if there are any read_requests to process
  3. If there are, check if historic_private_data_tree_root is still 0
  4. If it is still 0, set historic_private_data_tree_root to the root computed for this iteration’s first read_request.
    • Note: this read_requests[0] is the first read request in the entire tx so far!
    • We use read_request_membership_witnesses[0] to compute this root
  5. Finally, iterate over the remaining read_requests for the current kernel iteration and ensure their witnesses match the historic root, which is now initialized.

This approach does not require the app circuit to reference/output a historic private data root. It ensures that every read request in the kernel uses the same root, and outputs that root via the kernel’s public_inputs.constants.
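The per-iteration logic of steps 2–5 might be sketched like this (function name and shapes are illustrative; the roots are assumed to have already been computed from each read request's membership witness):

```typescript
type Fr = bigint;

// currentRoot: the historic_private_data_tree_root carried in
// public_inputs.constants (0n until the first read request of the tx).
// iterationRoots: the root computed for each read_request in this iteration.
function processIterationRoots(currentRoot: Fr, iterationRoots: Fr[]): Fr {
  let root = currentRoot;
  for (const r of iterationRoots) {
    if (root === 0n) {
      root = r; // first read request of the whole tx initializes the constant
    } else if (root !== r) {
      throw new Error('read request root mismatch');
    }
  }
  return root; // forwarded to the next kernel iteration
}
```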

One issue with this approach is that if a transaction makes zero read requests, the final kernel iteration will output 0 for the historic private data root in public_inputs.constants. We can modify the rollup circuits to allow this, but it leaks whether or not a private tx contains read requests.

If this is unacceptable, we can consider one of the following options:

  1. Add a historic root to the final kernel iteration’s private inputs which is matched against the root in public_inputs.constants (or used to initialize if it is still 0 at the end of the final iteration).
  2. Or we can keep the historic root in the app circuit’s outputs (public inputs), use it to initialize the kernel’s public_inputs.constants in the initial iteration, check that every read request matches it, and check that every subsequent kernel iteration’s public_inputs.constants.historic_private_data_tree_root matches the app circuit’s outputs.

I think I mean the root of the private data tree can be computed inside the kernel circuit, from the sibling path, index, and value, and then the kernel circuit can set_public that root value at the appropriate element of the kernel circuit’s public inputs (i.e. private_kernel_public_inputs.constant_data.old_tree_roots.private_data_tree_root). This will act as the kernel circuit’s attestation (to the base rollup circuit) of the root that’s been used for reads.

Edit: I’ve just read your more detailed reply. I think what you suggest is a great approach. And for the concern you raise (that a 0 root leaks that a read request hasn’t been made), option 1 sounds good to me! I don’t think we need a separate ‘final’ kernel circuit, though. We can probably use the same ‘subsequent kernel’ circuit for all non-initial iterations – we’ll just need a little bit of branching logic, so that passing 0 can be overwritten by a proper read, and passing some actual root value can be constrained to equal proper reads.

Nope - something to think about (and not relevant for simulation). Naively, with turbo constraints, deleting something from the middle of an array and shifting the other elements left will be a very inefficient algorithm. Maybe with UltraPlonk ROM and RAM tables it can be done more efficiently. That’s something we should consider post sandbox launch.
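For illustration, the naive delete-and-shift looks trivial as straight-line code, but an in-circuit version has to apply a conditional mux at every slot of the fixed-length array for every possible match position, which is where the cost comes from. This is a plain TypeScript sketch, not circuit code:

```typescript
type Fr = bigint;

// Delete the first occurrence of `target` from a fixed-length array,
// shifting later elements left and padding with 0n ("end of array" marker).
function deleteAndShift(arr: Fr[], target: Fr): Fr[] {
  let found = false;
  const out: Fr[] = [];
  for (const v of arr) {
    if (!found && v === target) {
      found = true; // skip the matched element exactly once
      continue;
    }
    out.push(v);
  }
  out.push(0n); // pad back to the fixed length
  return out.slice(0, arr.length);
}
```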


Progress with read requests

Read requests are now output from the App circuit and checked in the kernel (issue & PR w/ follow up tasks).

A read_request may reference a commitment created in a later iteration

@suyash67, @jean and I believe we discovered an issue with the implementation outlined in this post.

Consider this example:

  1. A::foo calls A::bar
  2. A::bar creates commitment C
  3. A::bar returns to A::foo which then reads C

Here, the initial kernel iteration will be for foo, which reads C, but C won’t appear in new_commitments until the following kernel iteration for bar.

This means that a read_request in one kernel may reference a pending commitment that is not present in new_commitments until a later kernel iteration. While new_commitments get forwarded from one kernel iteration to the next, read_requests do not (at least in the current implementation).

Solution 1

Rework the private kernel to have one iteration per “frame”, where a frame ends when a nested call is made or returned from. Originally discussed here.

Solution 2

Modify the current kernel implementation to forward all unmatched read_requests (those that read a “pending” new_commitment but have not been matched to one yet) to the next kernel iteration. Each kernel can then check its own read_requests as well as the previous kernel’s against all new_commitments seen thus far.

This would require us to add read_requests to the private kernel’s public inputs. We would then need to constrain that there are 0 unmatched read requests at the end of the final iteration. Ideally, the last iteration should not even have read_requests in its public inputs; this would require a separate private_kernel_circuit_final with different public inputs.
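Solution 2's forwarding step could be sketched as follows (hypothetical names): each iteration pools its own read requests with those forwarded from the previous kernel, matches them against all new_commitments seen so far, and forwards whatever is still unmatched.

```typescript
type Fr = bigint;

// forwardedReads: unmatched pending reads carried over from earlier iterations.
// thisIterationReads: this call's pending ("optimistic") read requests.
// allNewCommitmentsSoFar: every new_commitment accumulated up to this point.
// Returns the reads to forward; the final iteration must end with none left.
function forwardUnmatched(
  forwardedReads: Fr[],
  thisIterationReads: Fr[],
  allNewCommitmentsSoFar: Fr[],
): Fr[] {
  const pending = [...forwardedReads, ...thisIterationReads];
  return pending.filter((rr) => !allNewCommitmentsSoFar.includes(rr));
}
```

In the A::foo / A::bar example, foo's read of C stays unmatched after the first iteration and is forwarded, then matched (and dropped) once bar's iteration contributes C to new_commitments.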

Wondering if anyone else has ideas or if anyone has considered this?


Doesn’t this suffer from the same issue as ordering public state updates, where A::foo could read C before even calling into A::bar?

Yeah, it’s the same issues.
This is what I was alluding to with this earlier comment: [Proposal] Spending notes which haven't-yet been inserted - #6 by Mike - perhaps I could have been clearer.
We can hopefully solve both the private and public versions in a consistent way.

Doesn’t this suffer from the same issue as ordering public state updates, where A::foo could read C before even calling into A::bar?

Yes it does. I’ll be starting a new thread soon on ordering of private state.

Yeah, it’s the same issues.
This is what I was alluding to with this earlier comment: [Proposal] Spending notes which haven’t-yet been inserted - #6 by Mike - perhaps I could have been clearer.
We can hopefully solve both the private and public versions in a consistent way.

We have been discussing state ordering quite a bit. But @suyash67, @jean and I only recently realized it may require us to forward read requests to later kernel iterations (alongside new commitments and nullifiers which are already forwarded). I don’t think counters are sufficient without this. Or we could have kernel iterations per-frame like mentioned in the other thread.

@Mike and/or @spalladino do you agree that if we take the counter approach, read requests need to be forwarded to later kernel iterations too, to enable reading of pending commitments?

Agree, yep. Also, I’d consider moving the check of read requests being empty to the base rollup circuit, so we don’t need to introduce a new private_kernel_circuit_final.

Agree, yep. Also, I’d consider moving the check of read requests being empty to the base rollup circuit, so we don’t need to introduce a new private_kernel_circuit_final.

Would forwarding read requests to the base rollup circuit like this lead to a more restrictive limit on the number of read requests in a private call?

Right now READ_REQUESTS_LENGTH is very low, but I think we could theoretically bump it much higher than NEW_COMMITMENTS_LENGTH and NEW_NULLIFIERS_LENGTH if it is not a public input. I could be wrong, and also maybe we don’t really care to have a higher RR_LENGTH, not sure.