Chapter 4: How Bots Work
The word "Bot" has significance.
When AI began spamming our chat rooms, we called them bots.
When they instead were moderators, and kicked spammers off the server, they were still bots.
When you were playing an online game and realized that your opponent was actually controlled by a program, he was a dirty, cheating bot.
When you wanted a relaxed game without the social obligation to play well, you played a bot game: bot opponents and even bot teammates.
When I developed battlefield robotics for the Army, we were building bots. "Why didn't this bot transmit its position?", "Load that bot into the truck."
It's a neutral term with historical precedent. I think it aptly names the modern AI-enabled systems that can take leadership. Let's define the term.
Bot: A machine that functions like a person.
Not All Robots Are Bots
I was asked once [1] whether a washing machine is a robot.
I said "yes," and defended it this way: It is a machine with sensors and actuators that adjusts its behavior to the world around it. I hold to that.
But a washing machine's processing of its inner world is not close enough to thinking that we would anthropomorphize it. And it's not an entity in our social order. Not a coworker, but a tool.
Bots fill roles similar to humans. They're there because they're useful in that position, and when they operate, it feels to us (in some way) like a person is doing it.
The Mind of a Bot
A Bot's World Model
Although a washing machine isn't a Bot, it can give us a simple example of a world model:
graph TD
RW((Real World)) -- "Input (e.g., Temp)" --> Sensors[Sensors]
subgraph Machine [The Machine]
Sensors -- "Builds/Updates" --> WM[(World Model)]
WM -- "Decisions based ONLY<br/>on internal model" --> Action[Output Actions]
end
Action -- "Acts on World<br/>(e.g., Apply more heat, keep spinning)" --> RW
The takeaway is that the world model exists inside the machine. When the machine is responding to its environment, it determines what to do based entirely on its internal world model.
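The loop in the diagram can be sketched in a few lines of Python. This is a toy illustration, not how any real washing machine is programmed: the class name, the `target_temp_c` setting, and the action strings are all invented for the example. What it tries to show is that `decide` never looks at the real world, only at the stored model.

```python
from dataclasses import dataclass, field


@dataclass
class WashingMachine:
    """Toy machine; every name and threshold here is hypothetical."""
    target_temp_c: float = 40.0
    # The world model: the machine's internal beliefs about the world.
    world_model: dict = field(default_factory=dict)

    def sense(self, measured_temp_c: float) -> None:
        """Sensors build/update the world model from a real-world reading."""
        self.world_model["water_temp_c"] = measured_temp_c

    def decide(self) -> str:
        """Choose an action using ONLY the internal world model."""
        believed_temp = self.world_model.get("water_temp_c")
        if believed_temp is None:
            return "wait for sensor data"
        if believed_temp < self.target_temp_c:
            return "apply more heat"
        return "keep spinning"


machine = WashingMachine()
machine.sense(measured_temp_c=25.0)
print(machine.decide())  # apply more heat

# If the real water warms up but the sensor never reports it,
# the machine keeps acting on its stale belief.
real_water_temp_c = 55.0
print(machine.decide())  # still "apply more heat"
```

If the sensor is wrong or silent, the machine is wrong too, because the model is all it has.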
Building a Bot Mind
We can lead bots because we can talk with them like people. The technical breakthroughs that make this possible are already a few years old, but it's still heady to come to terms with. After decades of clever algorithms and trees that aimed to structure and recreate language, we have arrived at the promised land of good conversation. Productive conversation. Not the stilted "robot" dialog still portrayed in science fiction.
We lead bots with natural language because:
- It's easy for us, for communication to and from the bot.
- Bots are smart enough to make sense of it.
Synthetic Neurons
I opened this book talking about a curious spirit in me that lay dormant for decades. That period of slumber roughly maps to the "AI winter" that ran from the 1980s to around 2010.
I first studied Artificial Neural Networks, or (A)NNs, as a senior in high school while writing a paper on AI. (I'm not a sentimental person, but I actually still have a printout of an article I sourced.) The concept struck me deeply because it had an elemental propriety to it, like a match between form and function. After all, the neuron is nature's thought-enabler...
But alas, NNs didn't seem that smart. We collectively decided that we didn't quite have it figured out and that NNs were just a cargo cult of cognition, bamboo runways reaching for nature's elegance, imitating the shape of something wondrous without grasping what made it fly. Bolstering the case for NNs being a dead end, there remain theories that the magic in our brain tissue really happens in glial cells, or quantum effects, or cytoskeletons. There were enough headwinds that neural net study was largely neglected.
But a few breakthroughs later, the bio-inspired approach started demonstrating its original promise of machinery that thinks like we do. With neurons. So that's where we must begin our understanding.
Here's how a neuron is working in your brain right now
Here's how a neuron works in a neural network
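As a rough sketch of the synthetic side: an artificial neuron multiplies each input by a weight, adds a bias, and passes the total through an activation function. The sigmoid activation and the particular numbers below are assumptions chosen for illustration. Real networks use a variety of activation functions and learn their weights during training.

```python
import math


def neuron(inputs, weights, bias):
    """One artificial neuron: weighted sum of inputs, then an activation."""
    total = sum(x * w for x, w in zip(inputs, weights)) + bias
    # Sigmoid squashes the total into the range (0, 1).
    return 1.0 / (1.0 + math.exp(-total))


# Illustrative values only.
output = neuron(inputs=[1.0, 0.0], weights=[2.0, -1.0], bias=-2.0)
print(output)  # 0.5, because the weighted sum is exactly 0
```

A network is many of these wired together, with the outputs of some neurons serving as the inputs of others.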
In summary: Neurons are how our minds think. We tried to get machines to think by mimicking the structure. It didn't work for a while, and we mostly figured the mimicry was insufficient. After further development they are demonstrably intelligent.
To condense further: we made machinery that looks like brainstuff and it's smart now.
Training: making adjustments to the neural network to make it smarter.
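To make "adjustments" concrete, here is a minimal sketch of one common approach, gradient descent, applied to a single weight. It is a toy: the data, the learning rate, and the `train` function are invented for the example, and real training adjusts billions of weights at once.

```python
def train(examples, weight=0.0, learning_rate=0.1, steps=100):
    """Fit output = weight * x by repeatedly nudging the weight."""
    for _ in range(steps):
        for x, target in examples:
            prediction = weight * x
            error = prediction - target
            # Gradient of the squared error with respect to the weight
            # is 2 * error * x; step against it to reduce the error.
            weight -= learning_rate * 2 * error * x
    return weight


# Hypothetical data that follows the rule target = 3 * x.
examples = [(1.0, 3.0), (2.0, 6.0), (3.0, 9.0)]
learned = train(examples)
print(round(learned, 3))  # 3.0
```

Nobody told the program that the rule was "multiply by 3". It found a weight that makes its predictions match the examples.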
So understanding neural networks at the level of detail I've provided is not required for you to use them. But it's required for you to lead them. You have to reach some acceptance that they are "thoughtful" entities. They have thoughts.
Further, they are minds. Or rather, for the bots we lead, a neural network is a structural primitive of a mind: a fundamental object that performs the required math to serve as a component in a machine mind.
The NN components enable learning, and the trained neural network (or, for the bots that matter to us, the complex composition of networks) is the model.
Technical Detour: Mindless AI
Before I expand on how modern bot minds work, let's fill in a missing element.
Yes, you can have AI that doesn't use NNs.
During the aforementioned AI winter there was a great deal of development in expert systems, natural language processing, behavior trees, and other methods that I will not get into. It's useful stuff for engineers, and indeed was the tech behind most of my own prior work that I've shared in this book, but it is not all that relevant for the kind of bots you will act with in a leadership role.
Besides the historical AI techniques there are modern techniques that are more tools than minds, and so are not conducive to leading. There are some gray areas, like writing prompts to generate images, but the experience of the leader is not sufficiently different to warrant a separate deep discussion of the underlying technologies.
It is difficult to conceptualize the sheer number of brain cells in our own skulls, and the same applies to a modern AI. There are typically billions, if not trillions, of connections between synthetic neurons. These connections, as we've discussed, have "weights" that collectively dictate the behavior of the entire system.
Adjusting these weights in phases is what we call "Training". In fact, one of the primary ways frontier AI labs set themselves apart is through their unique training processes and the massive hardware needed to execute them. I've shown a generalized process in the diagram below, where hexagon-shaped boxes represent heavy processes.
graph TD
A[Raw Data] --> B{{Pre-training}}
B --> C[Base Model]
C --> D{{SFT / Fine-Tuning}}
D --> E[Instruct Model]
E --> F{{RLHF / Alignment}}
F --> G[Preference Feedback]
G --> F
F --> H[Safety & Guardrails]
H --> I[Final Product]
I'll draw your attention to a few elements in the diagram.
Raw Data : Pretty much all the facts and "worldly knowledge" the synthetic mind will hold. Note that there's a directional arrow pointing from Raw Data, meaning that after this point, the model ceases to absorb new worldly facts. The calendar date at which we cross that threshold is called a "cutoff" ("knowledge cutoff", "cutoff date", etc.).
RLHF / Alignment : Forcefully redirects the behavior of the LLM. It is used to correct errors in logic, or (more often) to instill cultural values and human preferences. For example, we might banish uncomfortable conclusions derived from data, even after efforts were made to scrub such things from the Raw Data! Humans and, often, other LLMs carry out this step.
Safety & Guardrails : In most settings, there are behaviors the system has learned that you as LLM facilitator want to avoid. It could be breaking the law, or company policy (e.g., divulging proprietary data).
Final Product : The finished mind you chat with. Whenever you send a prompt and it generates a reply, you are running what engineers call inference.
For the most part, LLMs are at the core of today's synthetic minds. It's anyone's guess what form or combination tomorrow's AI will take, but it's a safe bet that it will continue to evolve. To believe that we have reached the apex of machine intelligence is as fallacious as believing that Homo sapiens is the endpoint of biological evolution. We, like AI, have proven to be a winning design, but revisions are always underway, and higher fitness will reveal itself as it always has.
LLMs and the Transformer
LLM: Large Language Model. The "model" is the trained configuration of the neural network. "Large" means we threw a massive amount of computation at it. "Language" tells you what it processes.
The LLM is the mind you talk to, and the transformer is the mechanism behind that mind. It is a layered configuration of self-attention and feed-forward neural networks (a.k.a. multilayer perceptrons). This structural arrangement of chunked-up neurons is analogous to the cortical structure in our own brains. It's the intermediate anatomical structure for today's thinking machinery.
graph TD
Input((Input)) --> Attn1[Self-Attention]
Attn1 --> NN1[Neural Network]
NN1 -. "Repeats many times" .-> AttnN[Self-Attention]
AttnN --> NNN[Neural Network]
NNN --> Output((Output))
Self-attention, by the way, is a process that evaluates the relationships between all the words in a text at the same time (which is great for hardware) instead of reading them sequentially. It was introduced by researchers working on the problem of translating between human languages (say, English and Japanese), who needed a way to deal properly with words whose meaning depends on context (like bat for baseball and bat for flying while furry).
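For the curious, here is a stripped-down sketch of the scaled dot-product form of self-attention in plain Python. It makes simplifying assumptions: the two-number word vectors are made up, and the learned projection matrices that real transformers apply to produce queries, keys, and values are left out, so each word vector plays all three roles. The loop handles one word at a time for readability, but each word's result is independent of the others, which is why hardware can compute them all at once.

```python
import math


def softmax(scores):
    """Turn raw scores into positive weights that sum to 1."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def self_attention(vectors):
    """Scaled dot-product self-attention without learned projections."""
    dim = len(vectors[0])
    outputs = []
    for query in vectors:
        # Score this word against every word in the text, itself included.
        scores = [
            sum(q * k for q, k in zip(query, key)) / math.sqrt(dim)
            for key in vectors
        ]
        weights = softmax(scores)
        # The new representation is a weighted blend of all the words.
        blended = [
            sum(w * vec[i] for w, vec in zip(weights, vectors))
            for i in range(dim)
        ]
        outputs.append(blended)
    return outputs


# Made-up 2-number "embeddings" for the words: bat, baseball, game.
words = [[1.0, 0.0], [0.9, 0.1], [0.8, 0.3]]
for row in self_attention(words):
    print([round(v, 3) for v in row])
```

Each output row is the original word nudged toward the words it relates to most, which is how "bat" next to "baseball" ends up represented differently from "bat" next to "cave".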
If you build a neural network in the shape of a transformer and train it on vast amounts of human writing, it stops just parroting words and starts demonstrating genuine reasoning.
I think this is one of the most remarkable revelations in all of science. To have a machine that gets words, we built a brain capable of mastering any concept that can be expressed in language. And there appears to be no outer wall for the domains into which this reasoning can stretch.
Suffice it to say that these technical frontiers, followed by their adoption in culture, are why we find ourselves in the AI Revolution today.
Context for LLM (prompts, files)
With the ChatGPT moment, we had machines that we could talk to in a natural language, and they could hold an intelligent, reasonable conversation. That was a breakthrough for a lot of us, but massive advancement from there occurred pretty much under everyone's radar. At some point, the attitude was "yeah, it can search the web and stuff." From an ordinary user's perspective, yes, it's not a huge deal that the LLM's knowledge no longer suffers from the "cutoff" (i.e., the most recent data in the training set), or that it is no longer limited to what was available in the training set. But the fact that LLMs can do something besides have a conversation with you opens the world up to them, and is why LLMs (or something with similar capabilities) are positioned to be the mind of all Bots in the foreseeable future.
I'm going to give you a brief conversation. And it'll seem a little quaint and detached from technical matters, but it's an accurate model of how LLM-based bots are doing all these amazing things beyond talking. One actor in the conversation is Program, which is the thing that sort of manages the LLM.
Program : Hi, LLM.
LLM : Hi, Program.
Program : Here is some info about what you are.
LLM : Ok.
Program : You can surf the net. Just say "web search:" followed by your terms.
LLM : Ok.
That's it.
The LLM produces the right sequence of tokens to let Program know that it wants to run a routine (in this case, searching the web).
That routine, which is technically called a tool, is executed by Program, either directly or by calling out to another program via an API. There is a specific term for this complete setup:
Agent : An AI with tools.
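The conversation above can be sketched as a small loop. Everything here is a stand-in: `fake_llm` returns scripted replies rather than calling a real model, `web_search` returns pretend results rather than calling a real search API, and the names are invented for the example. The shape of the loop is the point: Program watches the LLM's text for the agreed phrase, runs the tool, and feeds the result back.

```python
def fake_llm(conversation):
    """Stand-in for a real LLM; its replies are scripted for the demo."""
    last_message = conversation[-1]
    if last_message.startswith("search results:"):
        return "Based on what I found, the bakery opens at 7am."
    return "web search: bakery opening hours"


def web_search(terms):
    """Stand-in tool; a real Program might call a search API here."""
    return f"search results: (pretend results for '{terms}')"


def run_agent(user_prompt, llm=fake_llm, max_turns=5):
    """Program: passes messages to the LLM and runs tools it asks for."""
    conversation = [
        'You can surf the net. Just say "web search:" followed by your terms.',
        user_prompt,
    ]
    for _ in range(max_turns):
        reply = llm(conversation)
        conversation.append(reply)
        prefix = "web search:"
        if reply.startswith(prefix):
            # The LLM asked for the tool; Program executes it and
            # feeds the result back into the conversation.
            terms = reply[len(prefix):].strip()
            conversation.append(web_search(terms))
        else:
            return reply  # No tool request, so this is the final answer.
    return "Stopped: too many tool calls."


print(run_agent("When does the bakery open?"))
# Based on what I found, the bakery opens at 7am.
```

The `max_turns` limit is a design choice for the sketch: without it, an LLM that keeps asking for searches would keep Program looping forever.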
Tools
intermediate formats and other extended minds (part of context)
Guardrails, mental ruts, and sycophancy. agendas, marketing, and other ulterior motives. why truth is important and why it will remain imperiled.
Near Future Machine Minds
Fine-tuning exists, but proper training will probably be done more outside of AI labs soon.
There could be advantages to continuous training. (whilst avoiding catastrophic forgetting)
AGI & ASI
Bots are Useful
Bots fill roles people do or did
The Automation Equation
The great utility of Bots is to take over all or some of a task that used to require humans. That's Automation.
Whether you should automate something comes down to simple math. I call this the automation equation. [2]
\[C < b \times n\]
\(C\) is the cost of putting the automation in place.
\(b\) is the benefit you get each time the automated system runs.
\(n\) is the number of times it runs.
If the left side is less than the right side, automate.
Imagine you own a bakery. A machine that automates making cake sheets costs $100. Each cake it makes saves you $4 in labor; that's your \(b\). You expect 50 orders; that's your \(n\). Is \(100 < 4 \times 50\)? Yes. You buy the machine.
You might say "That's too simple! What about maintenance, training, downtime?" Well, do account for those things. Fold them into \(C\) or subtract them from \(b\). The tool works best if you give elements their proper weight. But don't lose the forest for the trees. The fundamental question for deploying an automation is always: does the cost of building it justify the benefit of running it?
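The check is simple enough to write as a one-line function. This is only a sketch: the function name and the upkeep figures in the second call are hypothetical, added to show what folding extra costs into \(C\) and \(b\) might look like.

```python
def should_automate(cost, benefit_per_run, runs):
    """Automation equation: automate when C < b * n."""
    return cost < benefit_per_run * runs


# Bakery example from the text: C = 100, b = 4, n = 50.
print(should_automate(cost=100, benefit_per_run=4, runs=50))  # True

# Hypothetical refinement: fold $150 of upkeep into C and subtract
# $1 of per-cake machine cost from b.
print(should_automate(cost=100 + 150, benefit_per_run=4 - 1, runs=50))  # False
```

With the invented upkeep numbers the answer flips, which is why giving each element its proper weight matters.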
Now here's why this matters so much right now, and why Bots are increasingly the answer.
AI is pushing every parameter in the equation toward more automation.
\(C\) goes down. We have AI tools that build more capable systems more easily. What used to require a team of engineers and months of development can now be prototyped in an afternoon.
\(b\) goes up. Consider customer service. No one used to be satisfied talking to a robot. Now people are choosing to speak with LLMs. A smarter bot handles more situations per interaction, so the value per cycle increases.
\(n\) goes up. A smarter system can do more kinds of things. Each new capability adds cycles. Let's consider humanoid robots. Their human form factor means they slot into tasks that actual human bodies do. That's a lot of tasks! We'll give each task a separate \(b \times n\) term:
\[C < b_1 \times n_1 + b_2 \times n_2 + b_3 \times n_3 + \cdots\]
That's the economics of generalism. A single-purpose robot has one \(b \times n\) to justify its cost. A humanoid sums the value of every task it can perform. For a purely digital bot, a generalist AI, it's the same deal. The benefit side of the equation becomes massive.
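Here is the same comparison with more than one task, as a rough sketch. The cost and the per-task numbers are invented for illustration, and the function names are hypothetical.

```python
def total_benefit(tasks):
    """Sum b * n over every task the bot can perform."""
    return sum(benefit * runs for benefit, runs in tasks)


def should_automate_generalist(cost, tasks):
    """Generalist form: automate when C < sum of each task's b * n."""
    return cost < total_benefit(tasks)


# Hypothetical numbers: (benefit per run, number of runs) for each task.
single_purpose = [(4, 50)]
generalist = [(4, 50), (2, 300), (10, 40)]

cost = 1000
print(should_automate_generalist(cost, single_purpose))  # False: 200 does not cover 1000
print(should_automate_generalist(cost, generalist))      # True: 200 + 600 + 400 = 1200
```

The same price tag that sinks the single-purpose machine is justified once the extra tasks are counted.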
Tasks that didn't make economic sense to automate last year probably do today. And the bots that carry out those tasks need someone in charge.
Bots get Anthropomorphized
could be because they do stuff people do
we're social. we want to work with people / cooperate
[1] This was on stage at a robotics conference where I was presenting on robot aesthetics. I think the implication was, "Wouldn't you say my washing machine is a robot? Surely it doesn't need to look like a character robot." And I'd agree with that assertion, too. I was only discussing bots.
[2] It's an inequality, but that's not catchy, especially for the title of a video; viewable here: https://youtu.be/Xf460Z_IQhM