A Pathway to More Efficient Proving Jobs
Summary: Builtins optimize the proving process. However, each proof is computed with respect to a layout, and layouts which are inefficient for a particular proving job diminish the benefit of builtins. Currently, there is a small static list of layouts, and each proof is computed with respect to the most suitable layout from that list. The static approach has two main drawbacks. First, the limited variety of layouts is inefficient for most proving jobs, incurring unnecessary costs on users through a convoluted fee mechanism. Second, manual maintenance of the list to accommodate new builtins becomes wildly more difficult for a large set of builtins, practically impeding the proving process from supporting many builtins with any semblance of efficiency. To resolve these issues, we are working on a Dynamic Layout system in which a layout is tailored to each proving job.
The Cairo stack facilitates provable general computation by compiling Cairo code into instructions for a STARK-friendly CPU architecture: the Cairo VM (henceforth CVM). The many advantages of a general purpose CPU come at an inherent cost: the CVM is not optimized for many frequently used operations. Prominent examples of such operations are hash functions e.g Keccak, Pedersen, and Poseidon, with other examples including elliptic curve operations and range checks (i.e checking if a particular number lies in a particular range of values).
To resolve the relative inefficiency of the CVM, the Cairo stack introduced the concept of builtins for key operations: plugins that optimize those operations’ proving complexity. Builtins are analogous to ASICs: ASICs are application-specific integrated circuits while builtins are application-specific algebraic constraints (AIRs). If you don’t know or remember what AIRs are, we’ll briefly touch on them later on; more details appear in this introductory medium post.
In a nutshell, proof complexity is correlated (roughly linearly) to a resource called trace cells, and builtins streamline proofs of particular operations by using far fewer trace cells than the Cairo VM.
Having explained the benefit from builtins, it seems natural to develop them for many frequently used operations. This is easier said than done. The current process of introducing new builtins into Starknet involves several steps:
- Write the AIR
- Integrate into the prover by creating new layouts (explained below)
- Integrate into Starknet, i.e modify its codebase and developer tooling to use the new builtin
Beyond the challenge of writing an AIR there is much room for improvement in the remaining two stages. This high-level post will go into more detail about builtins as application-specific AIRs, the above issues, and our plans for the upcoming future.
Builtins: application-specific AIRs
AIR is an acronym for Algebraic Intermediate Representation. In our context, it is a system of polynomials which represents a virtual machine. For instance, Cairo derives its name from CPU AIR: a system of polynomials which represents a particular CPU architecture. A solution of this polynomial system represents a valid state transition, and is called a valid algebraic execution trace (AET).
STARKs prove the correct operation of a virtual machine by proving that the associated execution trace is valid with respect to the given AIR. Roughly speaking, execution traces are organized as tables of numbers, and the STARK protocol proves these numbers jointly solve a polynomial system.
There are many ways to compute the same thing, with some computations more efficient than others. In our context, proof complexity is essentially determined by the trace size, i.e the number of trace cells in the table. Since traces are generated with respect to an AIR, an application-specific AIR can greatly reduce the execution trace of a particular computation. Builtins are application-specific AIRs aiming for precisely this optimization.
Here’s a table that conveys the efficiency improvements of particular builtins (all in production).
Trace layouts: present and future
Recall an algebraic execution trace (AET) is (roughly speaking) a table of numbers that encodes a sequence of VM steps (i.e an execution of a program). To compute a proof, the prover executes the STARK protocol on an execution trace with respect to an AIR.
Above, we introduced builtins as application-specific AIRs that aim to minimize proof complexity by reducing the number of trace cells required to encode a computation. However, careless integration of builtins into Starknet can waste many trace cells, diminishing the intended benefits. Let’s elaborate.
In a nutshell, a trace layout is an allocation of trace cells to different “components”. In our context, these components are the CVM and the builtins. More specifically, a layout specifies the relative amount of trace cells allocated to each component. (Layouts are always structured to streamline proof verification. For more information, see the “succinctness” section in this medium article.)
Here’s the crucial point: proof complexity depends on the total number of trace cells allocated by the layout, which may be larger than necessary. For example, to prove a sequence of CVM steps, it is roughly doubly efficient to use a layout that only allocates trace cells to the CVM component compared to a layout that allocates half the trace cells to e.g the Poseidon builtin. In summary, the complexity of proving a particular computation can be greatly reduced with a suitable layout.
Currently, there is a manually maintained list of layouts, which gradually grows for two main reasons:
- Builtins can only be used in layouts which allocate them trace cells. Hence the addition of a builtin requires at least one new layout.
- Tailoring a layout to an execution of Cairo code can improve cell allocation. Hence optimizations of cell utilization often require new layouts.
The code for both prover and verifier (the Solidity and Cairo verifiers) is configured according to the list of layouts.
With the addition of the Keccak and Poseidon builtins, we found it increasingly difficult to generate a small list of layouts which include many builtins while remaining efficient for the execution of most Starknet blocks. Furthermore, this efficiency is expected to drop wildly with the introduction of additional builtins, since layouts would have to account for the many possible combinations and ratios amongst them.
We are currently working to improve our system and forgo a prespecified list of layouts in favor of “Dynamic Layouts”: tailored on-the-fly to every execution of Cairo code. Dynamic Layouts will always be optimally proportioned for the proving job at hand. From an engineering perspective, this requires a considerable change in the codebase to support dynamic inputs. However, we expect Dynamic Layouts to streamline Starknet’s proving layer by improving trace cell utilization and consequently making better use of the prover machine.
Dynamic Layouts will dismiss the challenges of manually maintaining efficient layouts for many builtins, consequently streamlining the process of integrating more builtins into Starknet.
Dynamic Layouts and fees
One of the goals of transaction fees is to charge users for the marginal costs incurred on the protocol from their transactions. Since the units of transaction fees are currency, the fee mechanism involves conversion from resources (e.g VM steps, builtins, calldata, Ethereum gas) to tokens (e.g STRK, ETH).
Currently, the wasteful resource utilization of the prover comes at the expense of the users since they are charged fees according to the total trace, and not its utilized portion. Dynamic Layouts will improve cell utilization and consequently reduce the “unnecessary” part of the transaction fee (the part covering resource consumption not directly incurred by the user’s transaction).
Starknet builtin integration
Currently, the final step in builtin integration involves modifying the Starknet codebase to actually employ them. These modifications are orthogonal to layouts: they are required to ensure the Starknet operating system invokes builtins when possible. For instance, we want the Starknet operating system to invoke the Poseidon builtin during execution of Cairo code that invokes the Poseidon hash function.
Starknet builtin support is supported manually now, similarly to layouts. However, unlike the case of layouts, this manual support will remain tame and straightforward even for many builtins. In other words, Starknet builtin support is not a blocker to integration, and Dynamic Layouts will truly pave the way for creation and integration of additional builtins.
In this post we explained what builtins are, their benefits, the challenges involved in reaping these benefits, and our plans. Our current focus is on Dynamic Layouts, which will improve not only the efficiency of the proving process, but also the ease of integrating new builtins.