Making an Onchain Organism
I came across an AI x Crypto hackathon held by ETHGlobal called Agentic Ethereum. The goal was to build AI agents that interacted with blockchains. I prefer doing in-person hackathons but thought that participating in this online one was a good opportunity to build something with AI. When I applied (around three weeks before the start) I felt like I had good ideas, and of course when it started I blanked a bit, realizing I couldn't find a single useful use case for an AI agent (let alone one I could build in a week), with or without blockchain integration.
I decided to focus on something I would find fun and probably not useful, but that would let me learn and hack away for a few days. Then I remembered a sci-fi idea: autonomous organisms, with their own personalities, properties and traits, living in their universes according to those universes' rules. Then it made sense: their universe could be onchain. When you think about it, a (decentralized) blockchain is really only useful for one thing: letting programs run without interruption (technical or censorship-related). There's an argument to be made that a smart contract (a program running onchain) is the first kind of property that can truly be owned by no one. What better environment, then, to create an organism that will evolve by itself according to its artificial genes? Once deployed, nothing can stop it, and it has to find a way to survive.
With maybe the possibility of ending up with a self-sustaining simulation of a bunch of virtual beings interacting with each other.
The Project
Given that I spent the first few days of the hackathon actually trying to find an idea (and, to be honest, forgetting that I was participating), I only started working on the project mid-way through the coding week. My goal was simple and divided into three parts: the agent (brain), the smart contract (body) and the frontend (to interact with the organism). For the "brain" I wanted an AI that would make the organism look alive by updating the smart contract unpredictably; I settled on a reinforcement learning (RL) agent whose objective was to keep "itself" (the organism) alive by deciding how to update its body. I chose RL because it was the perfect fit for maximizing a "survival" score, and from the outside it would make the organism seem unpredictable.
The smart contract is a simple contract holding three things:
- properties: what is affected by the environment
- e.g. health (I used homeoscore in my demo to highlight that the organism's goal is to stay in balance)
- rules: how the environment affects the properties
- e.g. ETH price drops have a negative effect on the organism
- e.g. users feeding the organism have a positive effect on it
- traits: regulating how much the rules affect the properties
- e.g. market sensitivity is how much the price drop rule affects the organism
- e.g. feed sensitivity is how much food affects the organism
These three concepts together define our onchain living organism and how it reacts to its environment (the chain itself).
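The actual contract runs on the EVM, but the way the three concepts fit together can be sketched in Python. This is an illustrative model only, not the contract's code; all names (`market_sensitivity`, `feed_sensitivity`, the rule methods) and the numeric values are assumptions:

```python
# Illustrative sketch of the onchain organism (not the real contract):
# rules modify properties, and traits scale how strongly rules apply.
# All names and numbers here are assumptions for illustration.

class Organism:
    def __init__(self):
        # property: state affected by the environment
        self.homeoscore = 50.0  # healthy range is 40-60

        # traits: how much each rule affects the properties
        self.market_sensitivity = 1.0
        self.feed_sensitivity = 1.0

    def apply_market_rule(self, eth_price_change_pct: float) -> None:
        # rule: ETH price drops have a negative effect on the organism
        self.homeoscore += self.market_sensitivity * eth_price_change_pct

    def apply_feed_rule(self, food_amount: float) -> None:
        # rule: users feeding the organism have a positive effect on it
        self.homeoscore += self.feed_sensitivity * food_amount


organism = Organism()
organism.apply_market_rule(-5.0)  # a 5% ETH price drop
organism.apply_feed_rule(3.0)     # a user feeds the organism
print(organism.homeoscore)        # 50 - 5 + 3 = 48.0
```

The traits are the only knobs the agent is allowed to turn: by raising or lowering a sensitivity, it changes how hard the environment hits the organism without touching the rules themselves.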
The Agent
The AI agent is a Python script that has the power to update the traits. It works much like a classic RL agent: it observes how the market and feeding affect the homeoscore and updates the traits (at a fixed frequency, e.g. every hour) to keep the organism in a healthy range of 40-60. Below 40 we consider the organism tired, while above 60 we consider it overfed; both are bad and incur a negative reward for the RL agent.
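The reward signal can be sketched as a simple function of the homeoscore. The exact shape is an assumption on my part; the only constraint from the design above is that staying in the 40-60 band is rewarded and leaving it is penalized:

```python
# Hedged sketch of a reward function for the organism's RL agent.
# The band boundaries come from the design (40-60 is healthy); the
# linear penalty outside the band is an illustrative assumption.

def reward(homeoscore: float, low: float = 40.0, high: float = 60.0) -> float:
    if low <= homeoscore <= high:
        return 1.0                    # healthy: small positive reward
    if homeoscore < low:
        return -(low - homeoscore)    # tired: penalty grows with drift
    return -(homeoscore - high)       # overfed: penalty grows with drift

print(reward(50.0))  # 1.0   (healthy)
print(reward(35.0))  # -5.0  (tired)
print(reward(70.0))  # -10.0 (overfed)
```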
We can hypothesize that the RL agent will, over time, learn when to increase or decrease the two traits (market and feed sensitivities). The only difference from a standard RL setup is that we are not iterating over many episodes quickly; instead, each episode is a real-time interval happening onchain. The goal is to emulate organic life onchain.
On the technical side, the off-chain RL agent continuously observes the organism's discretized homeoscore, selects trait adjustments (for market and feed sensitivities) using a simple epsilon-greedy Q-learning policy, and updates its Q-values with a reward function that measures deviation from the healthy range. It thereby learns an adaptive strategy to maximize long-term survival under unpredictable onchain conditions (price drops and feeding actions).
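A minimal version of that loop might look like the following. This is a sketch under stated assumptions, not the project's actual code: the state buckets, the action set of small trait nudges, and the hyperparameters are all illustrative:

```python
# Sketch of an epsilon-greedy Q-learning loop for the organism's agent.
# Buckets, actions and hyperparameters are illustrative assumptions.
import random
from collections import defaultdict

ACTIONS = [  # (delta market_sensitivity, delta feed_sensitivity)
    (0.0, 0.0), (0.1, 0.0), (-0.1, 0.0), (0.0, 0.1), (0.0, -0.1),
]
ALPHA, GAMMA, EPSILON = 0.1, 0.9, 0.1  # learning rate, discount, exploration

q_table = defaultdict(lambda: [0.0] * len(ACTIONS))

def discretize(homeoscore: float) -> int:
    # Bucket the continuous score into coarse states (0-9, 10-19, ...).
    return int(min(max(homeoscore, 0.0), 99.0)) // 10

def choose_action(state: int) -> int:
    # Epsilon-greedy: explore with probability EPSILON, otherwise exploit.
    if random.random() < EPSILON:
        return random.randrange(len(ACTIONS))
    values = q_table[state]
    return values.index(max(values))

def learn(state: int, action: int, reward: float, next_state: int) -> None:
    # Standard Q-learning update toward the bootstrapped target.
    target = reward + GAMMA * max(q_table[next_state])
    q_table[state][action] += ALPHA * (target - q_table[state][action])

# One real-time "episode": observe the onchain state, act, observe, learn.
state = discretize(48.0)
action = choose_action(state)
# ...here the agent would apply ACTIONS[action] to the onchain traits,
# then wait one interval (e.g. an hour) before observing again...
learn(state, action, reward=1.0, next_state=discretize(52.0))
```

Because each episode is a real onchain interval, the agent learns far more slowly than in a simulated environment; that slowness is the point, since it is what makes the organism's adaptation feel organic.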
Why
I've always been fascinated by virtual worlds and simulations. I love the idea of building new universes with their own customs, to experiment with and test designs that could be useful in our own world, but also the idea of exploring new universes, not unlike past explorers who ventured across the planet. That kind of exploration doesn't really exist anymore, but technology has reached a point where we can generate our own virtual reality and explore it, perhaps finding emergent behavior and understanding new things about our actual reality.
For now I am working on deploying a first onchain organism with a set of rules and traits that would make it interesting to watch evolve. I'd love to see some unexpected behavior and maybe add game-like interactions.