Polycirc: Differentiable arithmetic circuits for ZKML
I’ve just finished the first release of polycirc: a library for building differentiable arithmetic circuits for Zero-Knowledge ML. The basic idea is that you can build an arithmetic circuit c, then polycirc will transform it into another circuit d computing a single step of gradient descent. You can then use d with whatever zero-knowledge backend you like!
This post is a quick rundown of what polycirc can do; I’ll include pointers to more in-depth docs in each section.
Arithmetic Circuits
Arithmetic circuits are built from basic operations like addition, multiplication, and constants. Here’s a simple example, representing the expression x₀ * (-x₁):
x₀ ───────────┐   ┌───┐
              └───┤   │
                  │ * ├──── x₀ * (-x₁)
      ┌───┐   ┌───┤   │
x₁ ───┤ - ├───┘   └───┘
      └───┘
Think of each wire as carrying a value in some semiring (e.g., \(\mathbb{N}\)). Circuits are “differentiable”, which means that polycirc can transform the circuit above into an “optic”: a circuit in which some wires are “bent backwards”. We use these backwards wires to model the backwards information flow of the gradients of the forward circuit. However, this means that optics aren’t functions - polycirc’s learner module lets us turn these optics into a function we can use for gradient descent.
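To make the “bent wires” concrete, here’s a minimal sketch in plain Python (my own illustration, not polycirc’s API) of the forward and backward maps that an optic pairs together for the example circuit:

def forward(x0, x1):
    # The forward map: the original circuit, x0 * (-x1)
    return x0 * (-x1)

def reverse(x0, x1, dy):
    # The backward map: given the gradient dy arriving on the output wire,
    # produce gradients for the two input wires.
    # d/dx0 of x0 * (-x1) is -x1, and d/dx1 is -x0.
    return (-x1 * dy, -x0 * dy)

The forward map is the circuit itself; the reverse map is what travels along the bent-backwards wires.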
Building a circuit
The main thing that polycirc does is represent circuits. The docs describe how to build circuits, but in short, we can build our original example like this:
from polycirc import ir

circuit = (ir.identity(1) @ ir.negate(1)) >> ir.mul(1)
# Builds this circuit:
#
# x₀ ───────────┐   ┌───┐
#               └───┤   │
#                   │ * ├──── x₀ * (-x₁)
#       ┌───┐   ┌───┤   │
# x₁ ───┤ - ├───┘   └───┘
#       └───┘
Circuits are built by combining smaller circuits with sequential (>>) and parallel (@) composition. Sequential composition f >> g means “plug the outputs of f into the inputs of g”:
   ┌───┐   ┌───┐
───┤ f ├───┤ g ├───
   └───┘   └───┘
Parallel composition f @ g runs f and g in parallel:
   ┌───┐
───┤ f ├───
   └───┘
   ┌───┐
───┤ g ├───
   └───┘
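These two operations are enough to grow larger circuits out of smaller ones. As a hedged example (it assumes ir.add(1) is a two-input adder, by analogy with ir.mul(1) - I haven’t checked that exact name), here’s a circuit computing x₀ * (-x₁) + x₂:

# Hypothetical: extend our example circuit with an extra input and an adder
bigger = (circuit @ ir.identity(1)) >> ir.add(1)

The parallel composite has three inputs (two for circuit, one passed straight through), and the adder then combines circuit’s output with the passed-through wire.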
Now that we’ve built our circuit, we can compile it to python source code and see that it computes what we expect:
from polycirc.ast import diagram_to_ast
print(diagram_to_ast(circuit, 'multiply_negate'))
Which prints this python code:
def multiply_negate(x0, x1):
    x2 = - (x1)
    x3 = x0 * x2
    return [x3]
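As a quick sanity check, we can exec the generated source and call the resulting function (a hedged sketch; it assumes str() of the AST produces exactly the code printed above):

# Evaluate the generated source, then call the compiled function
namespace = {}
exec(str(diagram_to_ast(circuit, 'multiply_negate')), namespace)
print(namespace['multiply_negate'](2, 3))  # [-6], since 2 * (-3) = -6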
Zero-Knowledge ML
Now that we can build and compile circuits, we’d also like to differentiate them to train machine learning models. For an end-to-end example of training a linear model, see this example; I’ll sketch the details here.
Let’s start by defining our model circuit, then turn it into an “optic” using rdiff:
# mat_mul is like a single (linear) layer neural network
model_circuit = ir.mat_mul(OUTPUT_SIZE, INPUT_SIZE)
r_model = rdiff(model_circuit)
Remember that r_model is an optic whose backwards information flow is the gradients of the model, so we need to do two things:
- Use the gradients to update our model parameters
- Get a function which does a single step of gradient descent
The make_learner function does both of these steps for us:
u = fixed_gd(ONE >> 7)(NPARAM) # learning rate of 1/2^7
d = cliploss(OUTPUT_SIZE) # mean squared error loss
step_circuit = learner.make_learner(r_model, u, d, NPARAM, INPUT_SIZE)
Here, the u and d maps are the “update” and “displacement” maps, which serve the same purpose as the “optimiser” and “loss” function.
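For intuition: with the fixed_gd update chosen above, the update map computes something like \(u(p, \delta p) = p - \frac{1}{2^7}\,\delta p\), i.e. it nudges each parameter against its gradient at the fixed learning rate of 1/2⁷ (my paraphrase of what fixed_gd builds, not its exact definition).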
We can then compile step_circuit and use it for gradient descent:
step = diagram_to_function(step_circuit)

for i in range(0, NUM_ITER):
    # parameter update step
    p[i+1] = step(*p[i], *x[i], *y[i])[OUTPUT_SIZE:OUTPUT_SIZE+NPARAM]
A quick final note: step actually returns more than just the new parameters, which is why we have to take a slice of its outputs. Its type is step : p + x + y → y + p + x: the first y outputs are the model’s prediction using the old parameters.
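Given that type, here’s a hedged sketch of unpacking all of step’s outputs (the variable names are mine):

out = step(*p[i], *x[i], *y[i])
y_hat = out[:OUTPUT_SIZE]                       # prediction made with the old parameters
p[i+1] = out[OUTPUT_SIZE:OUTPUT_SIZE + NPARAM]  # updated parameters
# the remaining INPUT_SIZE outputs correspond to the x part of the type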
Summary
Hopefully that gave a quick sense of what polycirc is for. If you’re interested in using it, feel free to drop by the discord or ping me on twitter!