Adversarial free will hypothesis

Sep 06, 2021

Epistemic status: another highly-speculative piece on a highly-speculative topic.

One common way modern people think about their self-awareness goes a bit like this. We tend to acknowledge that there’s a lot of subconscious processing happening in us that we’re not aware of and cannot influence (a basic example would be a bodily function like pumping blood), but then there’s this self-aware part that can make conscious decisions and perform actions based on those decisions. Some who are familiar with a bit of neuroscience research concede more ground to the subconscious part: they acknowledge that a lot of what we would consider intellectual decision making actually happens unconsciously, and that we don’t have much control over it.

Here’s an interesting hypothesis to think about (and maybe try to find a way to falsify). What if the self-aware part of you has exactly zero influence on decision making and action taking? What if the self-aware part of you only has read-only access to all (or some) of the sensors available to the actual decision-producing and action-taking part, and its job is to try to predict what the agent is going to do (as opposed to telling it what to do)?

In the interest of clarity, I’ll call the self-aware part of the agent “monkey” and the subconscious part “elephant”. You can probably see why from the accompanying picture. Monkey and elephant both perceive the same environment, yet they don’t speak the same language, and monkey does not have any influence on where the elephant goes and what it does. That said, the analogy is not perfect: a real monkey riding a real elephant would have some influence, would not identify as the elephant-monkey combination, could hop onto other elephants or trees, etc. But I need to assign names to these things, and so here we are.

Let’s get back to the hypothesis. Monkey and elephant both receive the same inputs. In addition, monkey can feel the muscles shift on the elephant’s back, and so can reasonably predict what the elephant might do. Monkey has become so good at predicting that it now fully identifies as the elephant-monkey combination (it has never been anywhere but on the elephant’s back, it feels the elephant’s pain, etc). It turns out that this is the best way to predict what the elephant will do, even though monkey cannot exert any control over the elephant. Of course, the elephant’s actions keep evolving in the ever-changing environment of the world, and so monkey’s predictions are never perfect. But most of the time they are ridiculously close, even though the monkey is, in effect, holding the controller while the game runs in demo mode.

The “free will” part of the hypothesis is that feeling you get: I could do this or I could do that. Or when, upon reflecting on your past actions, you feel like you could have acted otherwise. You couldn’t have. It’s just the monkey providing probability distributions over the different sorts of actions the elephant could have taken, according to its model.

This idea is loosely like a GAN (generative adversarial network), which is why I’m calling this the adversarial free will hypothesis: the monkey is like a student network, and the elephant is like a teacher network. (Strictly speaking, a GAN pairs a generator with a discriminator; a student learning to predict a teacher is closer to knowledge distillation or imitation learning, so the analogy is a loose one.) The teacher network goes about its business doing things, and the student network tries to predict what it’s going to do at every moment. The difference is that, because the world keeps changing, the teacher network and the student network never converge: there’s always something new to learn from the teacher.
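To make the setup a bit more concrete, here is a toy sketch in Python. It is only an illustration under my own assumptions, not a model of any real brain: the `Elephant` and `Monkey` classes, the action list, the two contexts, and the `decay` and `drift_every` parameters are all made up for the example. The elephant picks actions from hidden, slowly drifting preferences; the monkey only observes, and keeps a running probability distribution over what the elephant will do next.

```python
import random

ACTIONS = ["graze", "walk", "rest"]   # hypothetical action set
CONTEXTS = ["day", "night"]           # hypothetical shared sensor input


class Elephant:
    """Hidden decision-maker. It never consults the monkey."""

    def __init__(self, rng):
        self.rng = rng
        # Hidden preference weights per context (positive, not normalised).
        self.weights = {c: [rng.random() + 0.1 for _ in ACTIONS] for c in CONTEXTS}

    def act(self, context):
        return self.rng.choices(ACTIONS, weights=self.weights[context])[0]

    def drift(self):
        # The world changes, so one habit in each context changes too.
        for w in self.weights.values():
            w[self.rng.randrange(len(ACTIONS))] = self.rng.random() + 0.1


class Monkey:
    """Read-only observer that estimates P(action | context)."""

    def __init__(self, decay=0.99):
        self.decay = decay    # forgetting: older observations count for less
        self.counts = {}

    def predict(self, context):
        counts = self.counts.get(context, [1.0] * len(ACTIONS))
        total = sum(counts)
        return [c / total for c in counts]

    def observe(self, context, action):
        counts = self.counts.setdefault(context, [1.0] * len(ACTIONS))
        for i in range(len(counts)):
            counts[i] *= self.decay
        counts[ACTIONS.index(action)] += 1.0


def simulate(steps=5000, drift_every=1000, seed=0):
    """Return the fraction of steps where the monkey's top guess was right."""
    rng = random.Random(seed)
    elephant, monkey = Elephant(rng), Monkey()
    hits = 0
    for t in range(steps):
        context = rng.choice(CONTEXTS)
        probs = monkey.predict(context)           # "I could do this or that"
        guess = ACTIONS[probs.index(max(probs))]
        action = elephant.act(context)            # decided without the monkey
        hits += guess == action
        monkey.observe(context, action)           # the monkey only learns
        if (t + 1) % drift_every == 0:
            elephant.drift()
    return hits / steps


if __name__ == "__main__":
    print(f"monkey's hit rate: {simulate():.2f}")
```

Note that information flows one way only: `Monkey.predict` is never passed to `Elephant.act`. The `decay` factor is a crude stand-in for forgetting, and the periodic `drift` is what keeps the two from ever fully converging in this toy version.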

Here are some of the things this hypothesis puts into a somewhat different light:

The apparent lack of self-awareness in young children (under 18 months): their monkey has not yet figured out that associating itself with the monkey-elephant ensemble is the best way to predict the elephant’s actions.
The fact that you can, at a mere request, explain your actions, even the unconscious ones. This holds even when your two brain halves are disconnected and one half of the body is doing something the other half has no idea about. You can do that because that’s the whole job of the monkey: to build as good a model as possible of the rest of the acting agent, so as to predict its actions.
The state of flow: when you’re at the limit of some skill, the monkey is paying extra attention and is not trying to predict much. It is observing and learning from the elephant.
At the opposite end of the spectrum from flow, how some of your actions are totally automated, like a morning commute or a coffee routine. The monkey knows that these parts just never change, and so does not bother to pay attention to these situations because they are too predictable. When the monkey does not pay attention, you run on autopilot.
Addictions: the monkey knows it’s bad, but the elephant still does it. The monkey still finds an excuse for the elephant.

Some other interesting questions:

Can it be that, if we lived forever, our monkeys would eventually converge on the elephant’s actions and stop paying attention, leaving us to live on autopilot until the next novel experience (which would become less and less frequent)? Is that one of the functions of forgetting: to make sure we don’t turn into mindless philosophical zombies?
Is meditation essentially a way of making sure the monkey keeps paying attention even during the most boring and predictable tasks, to fine-tune its prediction capabilities?

What’s the point of having the monkey? Why did this prediction mechanism develop? What is the benefit of having it? Evolution seems to have developed it in a lot of species, so there must be benefits to having an extra neural network expending energy in an organism. Otherwise, why have it?

One speculation is that the monkey is a general-purpose prediction mechanism. Its actual task is to build accurate models of the objects in the world, and the fact that its model of the elephant is so good is just an artefact of it spending its whole time on the elephant’s back. The elephant itself often uses the prediction machine as well (which is another place where this analogy is stretched too thin), e.g. it asks: what is this thing I’m looking at likely to do?

The hard problem of consciousness still remains, though: why does it feel like anything to be the monkey?
