Polycirc: Differentiable arithmetic circuits for ZKML
I’ve just finished the first release of polycirc: a library for
building differentiable arithmetic circuits for Zero-Knowledge ML.
The basic idea is that you can build an arithmetic circuit c, then polycirc
will transform it into another circuit d computing a single step of gradient
descent.
You can then use d with whatever zero-knowledge backend you like!
This post is a quick rundown of what polycirc can do; I’ll include pointers to more in-depth docs in each section.
Arithmetic Circuits
Arithmetic circuits are built from basic operations like addition,
multiplication, and constants. Here’s a simple example, representing the
expression x₀ * (-x₁):
x₀ ───────────┐   ┌───┐
              └───┤   │
                  │ * ├──── x₀ * (-x₁)
      ┌───┐   ┌───┤   │
x₁ ───┤ - ├───┘   └───┘
      └───┘
Think of each wire as carrying a value in some semiring (e.g., \(\mathbb{N}\)). Circuits are “differentiable”: polycirc can transform the circuit above into an “optic”, which is a circuit of this general shape:
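(A rough sketch: x and y are the forward values, while dx and dy are the corresponding gradients, flowing right to left.)

       ┌───────┐
  x ───┤       ├─── y
       │   c   │
 dx ◄──┤       ├──◄ dy
       └───────┘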
Note that some wires are “bent backwards”: we use this to model the backwards flow of the gradients through the forward circuit. However, this means that optics aren’t functions; polycirc’s learner module lets us turn an optic into a function we can use for gradient descent.
Building a circuit
The main thing that polycirc does is represent circuits. The docs describe how to build circuits, but in short, we can build our original example like this:
from polycirc import ir
circuit = (ir.identity(1) @ ir.negate(1)) >> ir.mul(1)
# Builds this circuit:
#
# x₀ ───────────┐   ┌───┐
#               └───┤   │
#                   │ * ├──── x₀ * (-x₁)
#       ┌───┐   ┌───┤   │
# x₁ ───┤ - ├───┘   └───┘
#       └───┘

Circuits are built by combining smaller circuits with sequential >> and parallel @ composition.
Sequential composition f >> g means “plug the outputs of f into the inputs of g”:
   ┌───┐   ┌───┐
───┤ f ├───┤ g ├───
   └───┘   └───┘
Parallel composition f @ g, on the other hand, runs f and g side by side:
   ┌───┐
───┤ f ├───
   └───┘
   ┌───┐
───┤ g ├───
   └───┘
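For example (an illustrative combination, not one from the docs), we can put two negations in parallel and feed them into a multiplication:

from polycirc import ir

# Computes (-x₀) * (-x₁)
both_negated = ir.negate(1) @ ir.negate(1)  # 2 inputs, 2 outputs
product = both_negated >> ir.mul(1)         # 2 inputs, 1 output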
Now that we’ve built our circuit, we can compile it to Python source code and check that it computes what we expect:
from polycirc.ast import diagram_to_ast
print(diagram_to_ast(circuit, 'multiply_negate'))

This prints the following Python code:
def multiply_negate(x0, x1):
    x2 = - (x1)
    x3 = x0 * x2
    return [x3]
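As a quick sanity check, we can paste the generated source into a REPL and call it:

>>> multiply_negate(3, 5)  # 3 * (-5)
[-15]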
Zero-Knowledge ML

Now that we can build and compile circuits, we’d also like to differentiate them to train machine learning models. For an end-to-end example of training a linear model, see this example; I’ll sketch the details here.
Let’s start by defining our model circuit, then turn it into an “optic” using rdiff:
# assuming rdiff is exported from the top-level polycirc package
from polycirc import rdiff

# mat_mul is like a single (linear) layer neural network
model_circuit = ir.mat_mul(OUTPUT_SIZE, INPUT_SIZE)
r_model = rdiff(model_circuit)

Remember that r_model is an optic whose backwards information flow carries the gradients of the model, so we need to do two things:
- Use the gradients to update our model parameters
- Get a function which does a single step of gradient descent
The make_learner function
does both of these steps for us:
# assuming fixed_gd and cliploss come from the learner module
from polycirc import learner
from polycirc.learner import fixed_gd, cliploss

u = fixed_gd(ONE >> 7)(NPARAM)  # learning rate of 1/2^7
d = cliploss(OUTPUT_SIZE)       # mean squared error loss
step_circuit = learner.make_learner(r_model, u, d, NPARAM, INPUT_SIZE)
Here, u and d are the “update” and “displacement” maps, which play the same roles as the “optimiser” and “loss” function in a typical ML setup.
We can then compile step_circuit and use it for gradient descent:
from polycirc.ast import diagram_to_function  # assuming it lives alongside diagram_to_ast

step = diagram_to_function(step_circuit)
for i in range(NUM_ITER):
    # parameter update step
    p[i+1] = step(*p[i], *x[i], *y[i])[OUTPUT_SIZE:OUTPUT_SIZE+NPARAM]
A quick final note: step returns more than just the new parameters, which is why we take a slice of its outputs. Its type is step : p + x + y → y + p + x, so the first y outputs are the model’s prediction made with the old parameters.
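In case it’s useful, here’s how you might unpack all three parts of the output (a sketch using the same names as above):

out = step(*p[i], *x[i], *y[i])
y_pred = out[:OUTPUT_SIZE]                      # prediction using the old parameters
p_new  = out[OUTPUT_SIZE:OUTPUT_SIZE + NPARAM]  # updated parameters
x_rest = out[OUTPUT_SIZE + NPARAM:]             # remaining outputs of type x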
Summary
Hopefully that gave a quick sense of what polycirc is for. If you’re interested in using it, feel free to drop by the Discord or ping me on Twitter!