Nanoclaw
I’m running Nanoclaw locally and it’s fun! I can see a lot of potential.
As soon as I deployed it, I realised there were some things that I really wanted to change. Here are some rough thoughts about how I changed it.
Initial design
The initial design for WhatsApp is a socket that responds to incoming messages. This works fine for basic cases.
I added transcription and local media download, but the logic began to sprawl across many functions.
FSM
Moving to finite state machines (FSMs) was a big gain in stability and predictability. The whole socket and message lifecycle is now expressed as transitions.
The logic for each transition is focussed in a smaller function with clearer boundaries.
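As a rough illustration of a lifecycle expressed as transitions, here is a minimal sketch of a table-driven FSM. The state and event names (`SocketState`, `SocketEvent`, `step`) are assumptions made up for this example, not Nanoclaw's actual ones.

```typescript
// Hypothetical states and events for a socket lifecycle (illustrative names only).
type SocketState = "disconnected" | "connecting" | "connected" | "closed";
type SocketEvent = "connect" | "opened" | "dropped" | "shutdown";

// Transition table: any (state, event) pair missing here is treated as illegal.
const socketTransitions: Record<SocketState, Partial<Record<SocketEvent, SocketState>>> = {
  disconnected: { connect: "connecting", shutdown: "closed" },
  connecting: { opened: "connected", dropped: "disconnected", shutdown: "closed" },
  connected: { dropped: "disconnected", shutdown: "closed" },
  closed: {},
};

// Pure transition function: returns the next state or throws on an illegal move.
function step(state: SocketState, event: SocketEvent): SocketState {
  const next = socketTransitions[state][event];
  if (next === undefined) {
    throw new Error(`illegal transition: ${state} on ${event}`);
  }
  return next;
}

// Walk one plausible lifecycle: connect, open, drop, reconnect.
const lifecycle: SocketEvent[] = ["connect", "opened", "dropped", "connect"];
let socketState: SocketState = "disconnected";
for (const ev of lifecycle) {
  socketState = step(socketState, ev);
}
console.log(socketState); // "connecting"
```

Because every legal move lives in one table, an unexpected event fails loudly instead of leaving the socket half-handled.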
FSMs all the way down
Going further, each task (such as transcription or a media download) becomes its own FSM with a smaller interface.
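A sketch of one such task-level FSM, under the assumption that a transcription task moves through download and transcription steps. The names `TaskState`, `TaskMachine` and `createTranscriptionTask` are hypothetical; the point is that the caller only sees `state` and `advance`.

```typescript
// Hypothetical task states; the real task breakdown may differ.
type TaskState = "pending" | "downloading" | "transcribing" | "done" | "failed";

// The small interface a parent FSM would see.
interface TaskMachine {
  readonly state: TaskState;
  advance(ok: boolean): TaskState;
}

// Where each state goes when its step succeeds; terminal states map to themselves.
const onSuccess: Record<TaskState, TaskState> = {
  pending: "downloading",
  downloading: "transcribing",
  transcribing: "done",
  done: "done",
  failed: "failed",
};

function createTranscriptionTask(): TaskMachine {
  let current: TaskState = "pending";
  return {
    get state(): TaskState {
      return current;
    },
    advance(ok: boolean): TaskState {
      // Terminal states never change.
      if (current === "done" || current === "failed") return current;
      // Any failed step sends the task to "failed".
      current = ok ? onSuccess[current] : "failed";
      return current;
    },
  };
}

const task = createTranscriptionTask();
task.advance(true); // pending -> downloading
task.advance(true); // downloading -> transcribing
task.advance(true); // transcribing -> done
console.log(task.state); // "done"
```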
Overall, FSMs are under-used in systems, especially ones that grow from LLMs.
State in the DB
There was a lot of ephemeral state scattered throughout the functions. That state is much better kept in the DB. This has the added benefit of easier state introspection and easier restarts.
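To illustrate the restart benefit, here is a minimal sketch in which a task's state is always read from and written to a store rather than held in a local variable. `StateStore`, `createMemoryStore` and `createStoredTask` are hypothetical names, and the in-memory `Map` is only a stand-in for a real DB table so the example stays self-contained.

```typescript
// Hypothetical store interface; a real version would wrap a DB table.
interface StateStore {
  load(id: string): string | undefined;
  save(id: string, state: string): void;
  all(): Array<[string, string]>;
}

// In-memory stand-in for the DB (an assumption for this sketch only).
function createMemoryStore(): StateStore {
  const rows = new Map<string, string>();
  return {
    load: (id) => rows.get(id),
    save: (id, state) => {
      rows.set(id, state);
    },
    all: () => Array.from(rows.entries()),
  };
}

// The task holds no state of its own; every read and write goes through the store.
function createStoredTask(store: StateStore, id: string) {
  return {
    state: (): string => store.load(id) ?? "pending",
    moveTo: (next: string): void => store.save(id, next),
  };
}

const store = createMemoryStore();
createStoredTask(store, "msg-42").moveTo("downloading");

// Simulated restart: a fresh task object resumes from the stored state.
const resumed = createStoredTask(store, "msg-42");
console.log(resumed.state()); // "downloading"

// Introspection: every task's state is visible in one place.
console.log(store.all());
```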
Move fast
Because I wanted to experiment with how it worked, I split Nanoclaw into two parts.
There is a long-running WhatsApp socket. It just sends and receives messages, so it does not need constant restarting.
The main Nanoclaw logic lives in a separate process that can be killed and restarted to pick up new changes.
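How the two processes hand messages to each other is not described here, so the following is only a sketch under one assumption: the socket process appends inbound messages to a shared queue, and the logic process marks each one done after handling it. `QueuedMessage` and `createQueue` are hypothetical, and the array stands in for whatever durable storage both processes would share.

```typescript
// Hypothetical message shape (illustrative only).
interface QueuedMessage {
  id: number;
  chatId: string;
  text: string;
  done: boolean;
}

function createQueue() {
  // Stand-in for durable storage shared by both processes (an assumption).
  const rows: QueuedMessage[] = [];
  let nextId = 1;
  return {
    // Called by the long-running socket process for every inbound message.
    enqueue(chatId: string, text: string): number {
      const id = nextId++;
      rows.push({ id, chatId, text, done: false });
      return id;
    },
    // Called by the logic process: the oldest message not yet marked done.
    peek(): QueuedMessage | undefined {
      return rows.find((m) => !m.done);
    },
    // Called by the logic process once a message is fully handled.
    ack(id: number): void {
      const row = rows.find((m) => m.id === id);
      if (row) row.done = true;
    },
  };
}

const queue = createQueue();
queue.enqueue("chat-1", "hello");

// The logic process reads the message but is killed before acking it.
const seenBeforeKill = queue.peek();

// The restarted logic process sees the same message, so nothing is lost.
const seenAfterRestart = queue.peek();
if (seenAfterRestart) queue.ack(seenAfterRestart.id);

console.log(seenBeforeKill?.id === seenAfterRestart?.id); // true
```

With this shape, killing the logic process at any point only delays a message; the socket side never has to notice.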
Local-first
For so many of my ideas I wanted to route to local LLMs first. This has been one of the harder challenges: building the right schema for returning useful results.
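One way to picture local-first routing is to ask the local model first, validate its output against a schema, and fall back only when validation fails. This is a sketch under assumptions: the `Intent` shape, the `minConfidence` threshold and the `Model` function type are all invented for illustration, and the models are passed in as plain functions rather than calling any real LLM API.

```typescript
// Hypothetical result schema (illustrative only).
interface Intent {
  action: string;
  confidence: number;
}

// Validate raw model output against the schema; null means "not usable".
function parseIntent(raw: string): Intent | null {
  try {
    const value: unknown = JSON.parse(raw);
    if (typeof value === "object" && value !== null) {
      const obj = value as Record<string, unknown>;
      const action = obj["action"];
      const confidence = obj["confidence"];
      if (typeof action === "string" && typeof confidence === "number") {
        return { action, confidence };
      }
    }
  } catch {
    // Not valid JSON: fall through and report failure.
  }
  return null;
}

// A model is any function from prompt to raw text (stand-in for a real client).
type Model = (prompt: string) => Promise<string>;

async function routeLocalFirst(
  prompt: string,
  local: Model,
  remote: Model,
  minConfidence = 0.7,
): Promise<{ intent: Intent | null; source: "local" | "remote" }> {
  // A local error is treated the same as unusable output.
  const localRaw = await local(prompt).catch(() => "");
  const localIntent = parseIntent(localRaw);
  if (localIntent !== null && localIntent.confidence >= minConfidence) {
    return { intent: localIntent, source: "local" };
  }
  // Fall back only when the local answer fails the schema or the threshold.
  return { intent: parseIntent(await remote(prompt)), source: "remote" };
}

// Usage with fake models: the local one returns prose, so the remote one is used.
const fakeLocal: Model = async () => "sure, I can help with that";
const fakeRemote: Model = async () => '{"action":"reply","confidence":0.9}';
routeLocalFirst("hi", fakeLocal, fakeRemote).then((r) => console.log(r.source));
```

The stricter the schema, the more often the local model is rejected, which is why the schema design ends up being the hard part.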
Testing
There were some tests for the system, but not enough at the right level to give me confidence. The WhatsApp protocol is opaque (that’s by design; it’s not officially supported) so the only way to verify behaviour is to run an experiment.