At last I have gotten round to writing a blog post about the criminally underused machines library written by the terrifyingly productive Edward Kmett.
This is a very simple demonstration of usage, with a focus on machines using the IO monad. I will not cover how the library works (because I don’t know).
Let us begin with imports:
What’s a machine?
The docs speak:
Machines are demand driven input sources like pipes or conduits, but can support multiple inputs. You design a Machine by writing a Plan. You then construct the machine.
Here is a plan, which is constructed into a machine:
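A minimal sketch of such a plan and machine. The name helloMachine is my own label, and the main at the end uses the pure run function only as a quick check; treat it as an illustration rather than the definitive listing.

```haskell
{-# LANGUAGE RankNTypes #-}
module Main where

import Data.Machine

-- A plan that yields two strings and then finishes.
helloPlan :: Plan k String ()
helloPlan = do
  yield "hello"
  yield "world"

-- construct turns the plan into a machine that runs it once and stops.
-- (helloMachine is a hypothetical name.)
helloMachine :: Monad m => SourceT m String
helloMachine = construct helloPlan

-- Run the machine purely and show the values it yields.
main :: IO ()
main = print (run helloMachine)
```

Run purely like this, the machine should produce the list ["hello","world"].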
It outputs “hello”, then “world”, and then it stops.
- SourceT means it is a special case of ‘Machine’ with no inputs.
- m says it has effects in the monad m.
- String says it outputs values of type String.
If it were a unix utility, running it would look like this:
fred@forte~> helloPlan
hello
world
fred@forte~>
We will make it into a unix utility shortly, using…
A machine that takes String inputs, and prints each one it receives to the console. It never outputs anything.
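Here is a sketch of what such a printing machine might look like. The names printPlan and printMachine are assumptions on my part, and the main feeds it from the library's source function just so the example runs on its own.

```haskell
{-# LANGUAGE RankNTypes #-}
module Main where

import Control.Monad.IO.Class (MonadIO, liftIO)
import Data.Machine

-- Wait for one String and print it; yields nothing.
-- (printPlan is a hypothetical name.)
printPlan :: MonadIO m => PlanT (Is String) () m ()
printPlan = do
  s <- await
  liftIO (putStrLn s)

-- repeatedly restarts the plan each time it finishes.
printMachine :: MonadIO m => ProcessT m String ()
printMachine = repeatedly printPlan

-- Feed two strings in and let the machine print them.
main :: IO ()
main = runT_ (source ["hello", "world"] ~> printMachine)
```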
Note its type:
- ProcessT means it can both input and output. Any SourceT is also a ProcessT, accepting any kind of input.
- MonadIO m means that side effects can be in IO.
- String is the input type.
- () is the output type. This could also be any type variable, but we use () to indicate it has no meaning.
Note also the use of repeatedly. This is an alternative to construct: it builds a machine that keeps repeating its plan, so each time the plan finishes it begins again from the start. The machine only halts when the plan itself stops, for example when await finds no more input.
We can compose this program into a machine that prints “hello world” to the console and outputs nothing. Imagine the ~> operator representing data flow.
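A sketch of that composition, assuming the two machines are called helloMachine and printMachine, with helloWorld as a made-up name for the result. The main uses runT_, which is explained next.

```haskell
{-# LANGUAGE RankNTypes #-}
module Main where

import Control.Monad.IO.Class (MonadIO, liftIO)
import Data.Machine

-- Source: yields "hello" and "world", then stops.
helloMachine :: Monad m => SourceT m String
helloMachine = construct (yield "hello" >> yield "world")

-- Sink: prints every String it receives.
printMachine :: MonadIO m => ProcessT m String ()
printMachine = repeatedly $ do
  s <- await
  liftIO (putStrLn s)

-- Data flows left to right: helloMachine's outputs become printMachine's inputs.
-- (helloWorld is a hypothetical name.)
helloWorld :: MonadIO m => MachineT m k ()
helloWorld = helloMachine ~> printMachine

main :: IO ()
main = runT_ helloWorld
```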
Now we will enter the real world, and run our machine using runT_, whose type is:
runT_ :: Monad m => MachineT m k b -> m ()
This runs a machine. It causes side effects to happen in the monad m, and it throws away the output of the Machine. If you want the outputs, you can get them as a list using runT.
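As an aside, here is a small sketch of the difference between the two runners, using a stand-in source (helloMachine is an assumed name). It is separate from the hello world program itself, which only needs runT_.

```haskell
{-# LANGUAGE RankNTypes #-}
module Main where

import Data.Machine

-- Stand-in source that yields two strings.
helloMachine :: Monad m => SourceT m String
helloMachine = construct (yield "hello" >> yield "world")

main :: IO ()
main = do
  -- runT_ drives the machine and discards whatever it yields.
  runT_ helloMachine
  -- runT drives the machine and collects the yielded values in a list.
  outputs <- runT helloMachine
  print outputs
```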
Now, running our program, we see:
hello
world
Let’s build a program. We have a faulty CSV file, and we want to find lines which have the wrong number of commas. Our file represents a list of people, ages, and jobs. Chopin has unfortunately too many jobs, and so is an invalid record. So too is Liu Yang, as her country is listed erroneously.
name,age,job
Frederic Chopin,39,musician, composer
Jane Smith,30,creative accountant
Johann Sebastian Bach,65,composer
Liu Yang,35,Astronaut,China
Our program will do the following:
- Read a CSV file from stdin
- Skip the header
- Find lines with the wrong number of commas (i.e. not 2)
- Print the count of bad lines
Starting in a new file, let’s import some things:
Our first machine reads lines from stdin
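One way this might be written by hand, as a sketch: the plan checks for end of input, yields a line, and loops. The choice of Data.ByteString.Char8 and isEOF is an assumption, and the main simply echoes stdin back so the example can be tried on its own.

```haskell
{-# LANGUAGE RankNTypes #-}
module Main where

import Control.Monad (unless)
import Control.Monad.IO.Class (liftIO)
import qualified Data.ByteString.Char8 as BS
import Data.Machine
import System.IO (isEOF)

-- Yield one line of stdin at a time until end of input.
lineSource :: SourceT IO BS.ByteString
lineSource = construct go
  where
    go = do
      eof <- liftIO isEOF
      unless eof $ do
        line <- liftIO BS.getLine
        yield line
        go

-- Collect every line, then print them back out.
main :: IO ()
main = runT lineSource >>= mapM_ BS.putStrLn
```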
Let’s generalise this
ioSource: Produce values of type a using a program f until a monadic condition k is fulfilled.
Now we can write lineSource as simply:
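A sketch of how ioSource and the shorter lineSource could fit together. The argument order (condition k first, then program f) and the IO-only type are assumptions made for illustration.

```haskell
{-# LANGUAGE RankNTypes #-}
module Main where

import Control.Monad (unless)
import Control.Monad.IO.Class (liftIO)
import qualified Data.ByteString.Char8 as BS
import Data.Machine
import System.IO (isEOF)

-- Run f to produce values until the condition k returns True.
-- (The argument order is an assumption.)
ioSource :: IO Bool -> IO a -> SourceT IO a
ioSource k f = construct go
  where
    go = do
      done <- liftIO k
      unless done $ do
        x <- liftIO f
        yield x
        go

-- Reading lines from stdin is now a one-liner.
lineSource :: SourceT IO BS.ByteString
lineSource = ioSource isEOF BS.getLine

main :: IO ()
main = runT lineSource >>= mapM_ BS.putStrLn
```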
Skip the Header
Machine #2 must skip the first line. We don’t need to write it ourselves; it’s already in the library.
So our skipHeader machine is simply:
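A sketch of skipHeader, assuming the library function in question is dropping from Data.Machine, which discards the first n inputs and passes the rest through. The main runs it over two sample lines as a quick check.

```haskell
{-# LANGUAGE RankNTypes #-}
module Main where

import Data.Machine

-- Drop the first input (the header) and pass everything else along.
skipHeader :: Process a a
skipHeader = dropping 1

main :: IO ()
main =
  print (run (source ["name,age,job", "Jane Smith,30,creative accountant"] ~> skipHeader))
```

Only the second line should survive.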
Next up: count the number of commas in each line. For this we can utilise BS.count, which counts the occurrences of a given character in a ByteString.
Note that instead of writing out this whole machine, we could just map the BS.count function over the previous machine.
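Both approaches, sketched side by side. countCommas and sampleLines are hypothetical names, and the second call relies on machines having a Functor instance, so fmap applies a function to every output.

```haskell
{-# LANGUAGE RankNTypes #-}
module Main where

import qualified Data.ByteString.Char8 as BS
import Data.Machine

-- The long way: await a line, yield its comma count, repeat.
countCommas :: Process BS.ByteString Int
countCommas = repeatedly $ do
  line <- await
  yield (BS.count ',' line)

-- Two records from the sample file, used only for this demonstration.
sampleLines :: Source BS.ByteString
sampleLines =
  source
    (map BS.pack
      [ "Frederic Chopin,39,musician, composer"
      , "Jane Smith,30,creative accountant"
      ])

main :: IO ()
main = do
  -- Compose the hand-written machine onto the source.
  print (run (sampleLines ~> countCommas))
  -- Or map the function over the upstream machine's outputs.
  print (run (fmap (BS.count ',') sampleLines))
```

Both lines should print the same counts: 3 for Chopin's record and 2 for Jane Smith's.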
Filter & Count bad lines
Finally, we keep only the counts not equal to 2 (the bad lines), and then count them. Filtering we can do with library functions:
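A sketch of the filtering step using filtered from Data.Machine, which passes along only the inputs that satisfy a predicate. The name badCounts is made up for this example.

```haskell
{-# LANGUAGE RankNTypes #-}
module Main where

import Data.Machine

-- Keep only the comma counts that are wrong, i.e. not exactly 2.
badCounts :: Process Int Int
badCounts = filtered (/= 2)

main :: IO ()
main = print (run (source [3, 2, 2, 3] ~> badCounts))
```

With the sample counts above, only the two 3s should come out the other side.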
Counting is easy too: the only new thing to note is the use of (<|>). That just means “if there’s nothing left to await, we’ll yield the current count and then stop”.
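A sketch of a counting machine written that way. The name counting is an assumption; the await <|> (yield n *> stop) shape follows the description above, and the main counts the characters of a short string as a check.

```haskell
{-# LANGUAGE RankNTypes #-}
module Main where

import Control.Applicative ((<|>))
import Data.Machine

-- Count inputs; when upstream is exhausted, yield the total and stop.
counting :: Process a Int
counting = construct (go 0)
  where
    go n = do
      -- If await fails (no more input), fall back to yielding n and stopping.
      _ <- await <|> (yield n *> stop)
      go (n + 1)

main :: IO ()
main = print (run (source "abc" ~> counting))
```

The machine yields nothing until its input runs out, so it should emit a single value: the number of inputs it saw.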
Putting it together
We already know how to compose machines, so it’s a simple task of extracting the count:
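Putting the pieces into one program might look like the sketch below. All the names (lineSource, counting, badLineCount) are assumptions, and I have leaned on the library's dropping, filtered and fmap rather than the hand-written versions to keep it short.

```haskell
{-# LANGUAGE RankNTypes #-}
module Main where

import Control.Applicative ((<|>))
import Control.Monad (unless)
import Control.Monad.IO.Class (liftIO)
import qualified Data.ByteString.Char8 as BS
import Data.Machine
import System.IO (isEOF)

-- Yield one line of stdin at a time until end of input.
lineSource :: SourceT IO BS.ByteString
lineSource = construct go
  where
    go = do
      eof <- liftIO isEOF
      unless eof $ do
        line <- liftIO BS.getLine
        yield line
        go

-- Count inputs; yield the total once upstream is exhausted.
counting :: Process a Int
counting = construct (go 0)
  where
    go n = do
      _ <- await <|> (yield n *> stop)
      go (n + 1)

-- Read lines, skip the header, count commas, keep bad counts, count them.
badLineCount :: MachineT IO k Int
badLineCount =
  fmap (BS.count ',') (lineSource ~> dropping 1)
    ~> filtered (/= 2)
    ~> counting

main :: IO ()
main = do
  counts <- runT badLineCount
  mapM_ print counts
```

Fed the sample file above on stdin, one would expect this to report 2 bad lines (Chopin and Liu Yang), provided the file has no stray blank lines.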
Huzzah! Now, we did things a little verbosely here and there, but hopefully it has demonstrated how you can get started using Machines. Hopefully a more real-world use case will be the subject of a future post.