Comments on the Friston “free energy” stuff that @slatestarscratchpad has been talking about:
Friston’s papers are badly written in a way that I find very recognizable. He’s not incomprehensible in some exciting, esoteric way, he’s just a certain type of bullshitter.
He does a thing which is common to many physics crackpots and (alas) quite a few real scientists: making a big deal out of his ability to relate his ideas to various famous and positive-affect-haloed (“canonical,” “celebrated,” etc.) equations/theorems/subject-areas. The problem is, you can always do this, because most ideas in math and physics are connected, sometimes for ~deep~ reasons but usually for ordinary, obvious and not very interesting reasons.
An (exaggerated) non-mathematical version of this would be to breathlessly exclaim that your theory applies simultaneously to Europe and France, due to the (“time-honored”? “celebrated”? pick your adjective) fact that France is a part of Europe. If you’re like Friston you can go the extra mile and claim that your theory has unified Europe and France.
For example, as far as I can tell, the entire section called “Optimal control theory and game theory” in Friston (2010) boils down to the observation that if you use prior probability as your utility function, then all of the usual results about how to make decisions with a utility function (optimal control) still apply, with prior probability as the utility being optimized. Friston’s proposal has no more (or less) connection to optimal control theory than any other proposal about which utility function to use.
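To make that concrete, here is a minimal sketch with made-up numbers. The outcomes, actions, and probabilities are all hypothetical (nothing here comes from Friston 2010). The decision machinery is plain expected-utility maximization; plugging in the prior as the utility doesn't change the machinery, only which action wins.

```python
# Hypothetical toy numbers for illustration only.
prior = {"upright": 0.9, "fallen": 0.1}  # prior belief over outcomes

# P(outcome | action) for two invented actions.
outcome_given_action = {
    "brace": {"upright": 0.8, "fallen": 0.2},
    "go_limp": {"upright": 0.1, "fallen": 0.9},
}


def expected_utility(action, utility):
    """Average the utility of each outcome, weighted by P(outcome | action)."""
    return sum(p * utility[o] for o, p in outcome_given_action[action].items())


def best_action(utility):
    """Ordinary expected-utility maximization; knows nothing about priors."""
    return max(outcome_given_action, key=lambda a: expected_utility(a, utility))


# Use the prior itself as the utility function.
chosen_with_prior = best_action(prior)

# Any other utility function goes through exactly the same code path.
other_utility = {"upright": 0.0, "fallen": 1.0}
chosen_with_other = best_action(other_utility)
```

Swapping `prior` for `other_utility` changes the answer but not a single line of `best_action`, which is about as deep as the connection to optimal control gets in this toy.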
More substantially, insofar as I understand the non-vacuous part of Friston’s proposal, it strikes me as a non-starter.
Here is what I think he is proposing. First, he has this novel theory of motor planning which says something like the following: at every moment, we have some prior belief about what we expect to be happening, and we also have sensory input about what is really happening, and our muscles decide how to move by using the rule, “move to decrease the gap between prior belief and reality.”
And in some sense, that is also the rule followed by perception: immediate sensory input is ambiguous and can be parsed equally well into more than one possible “reality,” and we break this tie by choosing to perceive the reality that most closely matches our prior beliefs. So Friston wants to say that perception and motor planning (really, perception and action in general) are executing the same exact rule, “minimize the gap between prior belief and reality.” (He calls this gap “free energy.”)
Before I go into the problems with this, here is a concrete example where I can see the appeal of this perspective. Suppose I’m walking along, and I suddenly hit an unseen bump in the road and trip on it. I lurch forward – now my head is pointing downward, I’m seeing the ground, my body is at maybe a 45° angle instead of a 90° angle. My visual and proprioceptive perceptions immediately shift to account for this: I know that I’m seeing the ground and that I’m in this new position. At the same time, though, some very quick reflexive muscle movements are taking place, trying to bring me back to my original upright position, which is where I predicted I would be.
In this example, perception and action are both – I guess – trying to close the gap between prior belief and reality. Perception tries to close the gap by proposing a “conservative” interpretation of the sudden shift: lots of sensory input has changed, but only because of a slight shift in body position (while the rest of my prior beliefs remain true), not because I’ve been suddenly teleported to Mars or something. Action tries to close the gap by making my sensory input closer to predicted sensory input: faced with a sudden acceleration of my head (signaled to me by my inner ear), it reacts by accelerating my head in the opposite direction, and so forth.
There are various things that I’ve finessed here, most importantly the distinction between predictions about sensory input and prior beliefs about reality. Friston makes this distinction, but claims that these two terms should be bundled together in a single expression (“free energy”) which is optimized by perception (as it alters perceived reality, keeping sensory input fixed) and by action (as it tries to alter sensory input, keeping perceived reality fixed).
I am a bit worried that this is a trick you could use to “unify” any two unrelated processes, but I can see the intuitive justification: given some sensory input and some perceived reality, there is some single quantity expressing how “surprised” we are by this state of affairs, and in principle we can reduce that number in two ways, by altering perceived reality or by altering sensory input.
But here is the problem. We can alter perceived reality more-or-less instantaneously. But we can’t instantaneously change sensory input. What we can do is send motor signals, and those set the rate of change of sensory input.
Revisit the unseen-bump-in-road example. My perceptions jump immediately to fit the sensory input perfectly. There isn’t some gradual process where my perceived position shifts from the predicted one (upright) to the actual one (lurched over) – or if there is, it happens in a tiny, tiny fraction of a second. But my motion to right myself occurs perceptibly over time.
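A toy version of that asymmetry, as a sketch rather than a model of anything physiological: tilt is measured in degrees away from upright, perception copies the sensed tilt in one step, and action can only pick a rate of change. The cap `max_rate`, the time step `dt`, and the function name are all assumptions I made up for the illustration.

```python
def simulate_stumble(initial_tilt=45.0, max_rate=60.0, dt=0.25, steps=5):
    """Toy stumble: tilt is degrees away from upright (0.0 means upright).

    Perception jumps to the sensed tilt in a single step.
    Action only chooses a rate of change, capped at max_rate degrees/second.
    """
    tilt = initial_tilt
    perceived = 0.0  # before the stumble, I believed I was upright
    trace = []
    for _ in range(steps):
        perceived = tilt                  # perception: instantaneous jump
        rate = -min(max_rate, tilt / dt)  # action: righting rate (assumes tilt >= 0)
        tilt = tilt + rate * dt           # the world integrates that rate
        trace.append((perceived, tilt))
    return trace


trace = simulate_stumble()
```

With these numbers, perception has caught up to 45 degrees in the very first step, while the body needs three steps (0.75 simulated seconds) to get back to upright.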
Indeed, this separation of timescales seems like a good thing. If perception could only make sense of new inputs as fast as action could change those inputs, then the two would be stuck chasing each other in an endless loop. First I pitch forward (in real life). Then, over the course of (let’s say) two seconds, my perception slowly alters my world model so that it represents me as pitched forward, so that this input is no longer surprising. But in those same two seconds, my actions have responded by making the input unsurprising in their way – that is, by bringing me upright. Now, my brain notices that sensory input is surprising again: my world model says I’m pitched over, but input says I’m upright. So perception starts to adjust my world model towards “I’m upright,” over the course of two seconds … in which time, action has closed the gap by making me pitch over again. I’m now in the same place I started: world model upright, but input says I’m pitched over. And so I bob up and down in place, endlessly, like one of those bird toys you put on your desk.
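Here is the bobbing loop as a deliberately cartoonish sketch. It assumes each "round" is one of those two-second windows, and that in the same-timescale case each side spends the round moving to where the other one was when the round started. The function name and the numbers are hypothetical.

```python
def chase(rounds, perception_is_fast):
    """Toy bobbing loop; values are degrees away from upright."""
    belief, sensed = 0.0, 45.0  # I believe I'm upright; input says pitched forward
    history = []
    for _ in range(rounds):
        if perception_is_fast:
            belief = sensed  # perception finishes before action gets going
            sensed = belief  # action then finds no gap to close (a no-op)
        else:
            # Same timescale: each side moves to where the other one
            # was at the start of the round, so they just trade places.
            belief, sensed = sensed, belief
        history.append((belief, sensed))
    return history


slow = chase(4, perception_is_fast=False)
fast = chase(4, perception_is_fast=True)
```

In `slow`, belief and input trade places every round and the gap stays at 45 degrees forever. In `fast`, the gap is gone after the first round. Note that the fast branch only shows the loop disappearing; it makes no attempt to model the reflexive righting from the stumble example.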
One engineering solution to this problem is to have perception adjust the world model so fast that action can’t keep up. (It seems like this is generally the case.) Another is to send a copy of the motor instructions back to perception so it can “subtract them out” when deciding whether it needs to shift anything to reduce surprise. I think this also happens; at least, I remember learning that the nervous systems of (some) electric fish do this, so they don’t sense the electric field changes they themselves produce (or plan to produce). But this ruins the picture where actions are directed to decrease surprise, since it subtracts action effects out of the surprise signal.
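The "subtract them out" idea can be sketched in a few lines. This is an assumed, linear toy (the `gain`, the function names, and the numbers are all invented), not a claim about how any real nervous system computes it. The thing to notice is what happens to the surprise signal once the copy of the motor command is subtracted.

```python
def sensed_change(external_change, motor_command, gain=1.0):
    """What the sensors report: world-caused change plus self-caused change."""
    return external_change + gain * motor_command


def surprise(sensed, motor_command, gain=1.0, use_copy=True):
    """Surprise signal, optionally subtracting the predicted self-caused part."""
    predicted_self_caused = gain * motor_command if use_copy else 0.0
    return sensed - predicted_self_caused


external = 0.5
commands = [0.0, 1.0, 2.0]

with_copy = [surprise(sensed_change(external, a), a, use_copy=True) for a in commands]
without_copy = [surprise(sensed_change(external, a), a, use_copy=False) for a in commands]
```

With the copy subtracted, the surprise is the same for every motor command, so there is nothing left in that signal for action to minimize. Without the copy, the signal does depend on the command, but then perception is back to being surprised by its own movements.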
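You can actually see this problem right there in Friston’s equations:
[Figure not reproduced: a box diagram of Friston’s equations, with a green “External states” box on the left, an “Internal states” box, and “arg max” rules for action (lower left) and perception (lower right).]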
There are a lot of variables here, but to get a basic sense: “s” is sensory input, “mu” is perceived reality, “a” is action, and “x” and “theta” are the external world state. (Elsewhere he writes that x and some other things [including a non-curly-script theta, not shown here] are a subset of theta, which makes no sense to me since he writes things as functions of theta and x; who fucking knows. The tildes above x and s appear to indicate that they are functions of time, even though everything here is a function of time.)
Anyway, if you squint at the green “External states” box on the left, you can see that there’s a tiny dot above the x, meaning “the time derivative of x.” This makes sense: the right-hand side includes action, and our actions can’t immediately change the external state (and hence sensory input) to whatever we desire; they can only move it over time in some direction.
But there’s no such time dependence in the “Internal states” box. Here, perceived reality mu is updated instantaneously at all times to track sensory input s. This is the meaning of the “arg max” in the lower right box.
But we also have an “arg max” to compute action in the lower left box. What’s up with that? If we look at the line above, we see that we’re maximizing a quantity that depends on “a” via “s(a)”. That is, sensory input (s) is supposed to be a function of action (a), so that we can set it to whatever we want instantaneously by choosing the right action.
But, as I said a moment ago, it isn’t. Again, we can see this by following the arrows to the left and up: s = g(x, theta) + w, and the only term on the right affected by “a” is “x,” and “a” sets only the time derivative of “x,” not “x” itself.
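Spelled out, as a sketch under my own notation: the name f for the right-hand side of the state equation is a placeholder I'm introducing, and I'm dropping any noise term in that equation for simplicity. Only s = g(x, theta) + w is taken from the discussion above.

```latex
\begin{aligned}
s(t) &= g\big(x(t), \theta\big) + w(t), \\
\dot{x}(t) &= f\big(x(t), a(t), \theta\big), \\
\Rightarrow\quad x(t) &= x(0) + \int_0^t f\big(x(\tau), a(\tau), \theta\big)\, d\tau .
\end{aligned}
```

So s at time t depends on the whole history of a up to t, through the integral. There is no function s(a) of the current action alone that an “arg max” over “a” could use to pick the sensory input it wants right now.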
tl;dr: Friston is full of shit.
