High level: protstar/galaxy with Honk will increase Prover RAM consumption by over 2x vs non-folded Honk. What to do?

Overview: Protostar/galaxy allows us to fold a Honk instance into an “accumulator” instance.

The accumulator instance represents a relaxed randomised representation of Honk.

A consequence of this is that the accumulator instance polynomials are all random. This risks blowing up memory consumption vs regular instances.

# RAM costs for regular Honk

Many precomputed Honk selector polynomials are *structured* i.e. their values are conssitently small.

e.g. the selectors `q_arith, q_sort, q_plookup, q_ecc, q_aux`

all take Boolean values. The 8 permutation selectors (`S1, ..., S4`

, `\sigma1, ..., \sigma4`

) all take 32-bit values.

The above 13 selectors can be stored in `33n`

bytes of RAM (n = circuit size).

# RAM problems with the randomised accumulator

in a randomised accumulator relation, the above polynomials all take 32-byte values, and requires `416n`

bytes, a 12x increase.

For a size 2^20 circuit this is 416MB.

Combined with the requirement to represent the randomised accumulator instance and the instance being folded both in RAM at the same time, using a folding scheme increases the RAM requirements of our Prover by over 2x.

# Potential Improvements/Solutions

We can define a (quite large) circuit of a fixed depth where all gate selectors are the same for every folded instance. For example, the first 512k gates have `q_arith = 1`

, the next 256k gates have `q_sort = 1`

, the next 128k gates have `q_aux = 1`

etc etc.

The actual circuit description for a given instance is defined entirely via the copy constraints.

This structure would allow us to not need to store the full representation of the 5 arithmetic gate selectors in memory.

This structure would also reduce Prover time of the folding IOP, due to restricting the number rows each relation is active over for the randomised accumulator instance (this requires its own thread, lots of detail to go into)

There is also the painful option of not using folding at all (instead full recursion + goblin plonk). But this would greatly increase the number of gates of our recursive verifier and also greatly increase Prover costs: with folding the Prover does not need to produce an opening proof at each fold = 2.5n scalar muls with ZeroMorph.