Gentleman Johnny

On June 13th, 1777, “Gentleman Johnny” Burgoyne and Major General Guy Carleton inspected the forces of Great Britain assembled at Saint-Jean-sur-Richelieu, about to embark upon an invasion of the American colonies from Canada. The force consisted of approximately 7,000 soldiers and 130 artillery pieces and was to travel southward through New York, by water and through the wilderness, to meet up with a second force moving north from New York City. Capturing the Hudson River Valley and, in particular, the city of Albany would divide New England from the mid-Atlantic colonies, facilitating the defeat of the rebel forces in detail.

Poor communication may have doomed the plan from the start. The army which Burgoyne counted on moving up from New York City, under the command of General Howe, was committed to an attack on Philadelphia, to be executed via a southern approach. Thus, when it needed to be available to move north, Howe’s army would be separated from the upstate New York theater not only by distance, but also by George Washington. Burgoyne did not receive this important information and set out on his expedition unaware of this flaw.

Nonetheless, Burgoyne began well enough. As he moved southward, the colonial forces were unaware of his intent and strength, and friendly native forces screened his army from observation. He captured Crown Point without opposition and successfully besieged and occupied Fort Ticonderoga. Following these successes, he embarked on an overland route from Skenesboro (at the southern reaches of Lake Champlain) to Fort Edward, where the Hudson River makes its turn south. This decision seems to have been taken to avoid moving back northward, the retrograde movement necessary to use Lake George’s waterway. It may also indicate Burgoyne’s lack of appreciation for the upstate New York terrain and its potential to allow a smaller colonial force to impede his movements.

Live Free or Die

In order to deal with the enemy blocking his path, Burgoyne sent forth his allied Indian forces to engage and run off the colonials. Having done so, they proceeded to loot and pillage the scattering of colonial settlements in the area. This had the perverse effect of driving otherwise-neutral locals into the rebel camp. As the fighting portion of his army made the trek to Fort Edward rather rapidly and uneventfully, Burgoyne discovered he had two serious issues. First, he finally received communication from Howe informing him that the bulk of the New York army, the force with which Burgoyne was planning to rendezvous, was on its way by sea to south of Philadelphia. Second, the movement through the wilderness had delayed his supply train, unsuited as it was to movement through primal woodland.

Burgoyne’s solution was to again pause and to attempt to “live off the land” – requisitioning supplies and draft animals from the nearby settlers. Burgoyne also identified a supply depot at Bennington (Vermont) and directed a detachment to seize its bounty. What he didn’t know was that the settlers of Vermont had already successfully appealed to the government of New Hampshire for relief. New Hampshire’s General John Stark had, in less than a week’s time, assembled roughly 10% of New Hampshire’s fighting-age population to field a militia force of approximately 1,500.

When Burgoyne’s detachment arrived at Bennington, they found waiting for them a rebel militia more than twice their number. After some weather-induced delay, Stark’s force executed an envelopment of the British position, capturing many and killing the opposing commander. Meanwhile, reinforcements were arriving from both sides. The royal force arrived first and set upon the disarrayed colonial forces, who were busy taking prisoners and gathering up supplies. As Stark’s forces neared collapse, the Green Mountain Boys, under the command of Seth Warner, arrived and shored up the lines. The bloody engagement continued until nightfall, after which the royalists fell back to their main force, abandoning all their artillery.

Stark’s dramatic victory had several effects. First, it provided a shot in the arm for American morale, once again showing that the American militia forces were capable of standing up to the regular armies of Europe (Germans, in this case). Second, it had a deleterious impact on the Indian tribes’ morale, causing the large native force that Burgoyne had used for screening purposes to abandon him. Third, it created folklore that persists in northern New England to this day. Stark became a hero, with his various pre- and post-battle utterances preserved for the ages. Not the least of these was from a letter penned well after independence. Stark regretted that his ill health would prevent him from attending a Battle of Bennington veterans’ gathering. He closed his apology with the phrase, “Live free or die: Death is not the worst of evils,” which has been adopted as the official motto of the State of New Hampshire.

Saratoga

The delays put in the path of Burgoyne’s march gave the Colonials time to organize an opposition. New England’s rebellion found itself in a complex political environment, pitting the shock at the loss of Ticonderoga against the unexpected victory at Bennington. The result was a call-to-arms of the colonial militias, which were assembled into a force of some 6,000 in the vicinity of Saratoga, New York. General Horatio Gates was dispatched by the Continental Congress to take charge of this force, which he did as of August 19th. His personality clashed with some of the other colonial generals including, perhaps most significantly, Philip Schuyler. Among the politicians in Philadelphia, Schuyler had taken much of the blame for the loss of Ticonderoga. Some even whispered accusations of treason. Schuyler’s departure deprived Gates of knowledge of the area he was preparing to defend, hindering his efforts. Burgoyne focused his will to the south and was determined to capture Albany before winter set in. Going all-in, he abandoned the defense of his supply lines leading back northward and advanced his army toward Albany.

On September 7th, Gates moved northwards to establish his defense. He occupied terrain known as Bemis Heights, which commanded the road southward to Albany, and began fortifying the position. By September 18th, skirmishers from Burgoyne’s advancing army began to encounter those of the colonists.

Having scouted the rebel lines, Burgoyne’s plan was to turn the rebel left. That left wing was under the command of one of Washington’s stars, General Benedict Arnold. Arnold and Gates were ill-suited to each other, leaving Arnold to seek allies from among Schuyler’s old command structure, thus provoking even further conflict. Arnold’s keen eye and aggressive personality saw the weakness of the American left, and he realized how Burgoyne might exploit it. He proposed to Gates that they advance from their position on the heights and meet Burgoyne in the thickly-wooded terrain, a move that would give an advantage to the militia. Gates, on the other hand, felt his best option was to fight from the entrenchments that he had prepared. After much debate, Gates authorized Arnold to reconnoiter the forward position, where he encountered, and ultimately halted, the British advance in an engagement at Freeman’s Farm.

Game Time

In my previous article, I talked about some new stuff I’d stumbled across in the realm of AI for chess. The reason I was stumbling around in the first place was a new game in which I’ve taken a keen interest. That game is Freeman’s Farm, from Worthington Games, and I find myself enamored with it. Unfortunately it is currently out of print (although they do appear to be gearing up for a second edition run).

How do I love thee? Let me count the ways. There are three different angles from which I view this game. In writing here today, I want to briefly address all three. To me, the game is interesting as a historical simulation, as a pastime/hobby, and as an exercise in “game theory.” These factors don’t always work in tandem, but I feel like they’ve all come together here – which is why I find myself so obsessed (particularly with the third aspect).

The downside of this game, piling on with some of the online reviews, is its documentation. Even after the developer put out two separate rule clarifications, there remain ambiguities aplenty. The developer has explained that the manual was the product of the publisher, and it seems like Worthington values brevity in their rule sets. In this case, Worthington may have taken it a bit too far. To me, though, this isn’t a deal-breaker. The combination of posted clarifications, online discussion, and the possibility of making house rules leaves the game quite playable, and one hopes that much improvement will be found in the second edition. Still, it is easy to see how a customer would be frustrated with a rule book that leaves so many questions unanswered.

Historical War Gaming

This product is part of the niche category that is historical wargaming. Games of this ilk invite very different measures of evaluation than other (and sometimes similar) board games. I suppose it goes without saying that a top metric for a historical wargame is how well it reflects the history. Does it accurately simulate the battle or war that is being portrayed? Does it effectively reproduce the command decisions that the generals and presidents may have made during the war in question? Alternatively, or maybe even more importantly, does it provide insight to the player about the event that is being modeled?

On this subject, I am not well placed to grade Freeman’s Farm. What I will say is that the designer created this game as an attempt to address these issues of realism and historicity. In his design notes, he explains how the game came about. He was writing a piece on historical games for which he was focusing on the Saratoga Campaign. As research, he studied “all the published games” addressing the topic, and found them to be lacking something(s).

I’ll not bother to restate the designer’s own words (which can be accessed directly via the publisher’s website). What is worth noting is that he has used a number of non-conventional mechanisms. The variable number of dice and the re-rolling option are not exactly unique, but they do tend to be associated with “wargame lite” designs or other “non-serious” games. Likewise, the heavy reliance on cards is a feature that does not cry out “simulation.” That said, I am not going to be too quick to judge. Probability curves are probability curves, and all the different ways to roll the dice have their own pros and their own cons. Freeman’s Farm’s method allows players to easily understand which factors are important but makes it very difficult to calculate the optimal tactics. Compare and contrast, for example, with the gamey moves required to get into the next higher odds column on a traditional combat results table.
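As an aside on those probability curves: here is a quick sketch (in Python, with made-up hit numbers – not Freeman’s Farm’s actual dice rules) of how adding dice shifts the entire distribution of results, not just the average:

```python
from fractions import Fraction
from math import comb

def hit_distribution(n_dice, hit_faces=2, sides=6):
    """Exact distribution of hits when rolling n dice, where
    `hit_faces` of the faces count as a hit. (Illustrative only --
    these are not the game's actual hit numbers.)"""
    p = Fraction(hit_faces, sides)
    # Binomial distribution: P(exactly k hits out of n dice)
    return {k: comb(n_dice, k) * p**k * (1 - p)**(n_dice - k)
            for k in range(n_dice + 1)}

def expected_hits(dist):
    return sum(k * prob for k, prob in dist.items())

# Each extra die shifts the whole curve, not just the mean:
for n in (2, 4, 6):
    print(n, float(expected_hits(hit_distribution(n))))
```

It is easy to see which factors matter (more dice, better odds per die); it is much harder to see, at a glance, when a re-roll is worth spending a resource on – which is exactly the property described above.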

Playing for Fun

All the above aside, a game still needs to be playable and fun to succeed. We seem to be living through a renaissance in the board gaming world, at least among the hobbyists that have the requisite depth of appreciation. A vast number of well-designed and sophisticated board games are produced every year covering a huge expanse of themes. More importantly, ideas about what makes a game “fun” and “playable” have evolved such that today’s games are almost universally better than the games of a generation or two ago. Gone are the days (such as when I was growing up) when slapping a hot franchise onto a roll-the-dice-and-move-your-piece game was considered effective and successful game design. You can still buy such a thing, I suppose, but you can also indulge yourself with dozens and dozens of games based on design concepts invented and refined over the last couple of decades.

From this standpoint, Freeman’s Farm also seems to have hit the mark. It is a nice-looking game using wooden blocks and a period-evoking, high-quality game board. I’ve read some complaints online about boxes with missing (or wrong) pieces. This would definitely be a problem if you don’t notice it and try to play with, for example, the wrong mix of cards. The publisher seems to be responsive and quick to get people the material they need.

Game Theory

The real reason I’m writing about this now is that the game has a form that seems, at least to my eyes, very interesting from a theoretical standpoint. Compare it to, say, chess, and you’ll notice there are very few “spaces” on this game’s board. Furthermore, the movement of pieces between spaces is quite restricted. This all suggests to me that the model of the decision making in this game (e.g. a decision tree) would have a simplicity not typical for what we might call “serious” wargames.

Given the focus of that last post, I think this game would get some interesting results from the kind of modeling that AlphaZero used so successfully for chess. Of course, it is also wildly different from the games upon which that project has focused. The two most obvious deviations are that Freeman’s Farm uses extensive random elements (drawn cards, rolled dice) and that the game is asymmetric and non-alternating. My theory, here, is that the complex elements of the game will still adhere to the behavior of that core, and simple, tree. To make an analogy to industrial control, the game might just behave like a process with extensive noise obscuring its much-simpler function. If true, this is well within the strengths of the AI elements of AlphaZero – namely the neural-net-enhanced tree search.
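To make that “simple tree plus noise” notion concrete, the underlying structure could be sketched as an expectimax tree – a decision tree in which chance nodes stand in for the cards and dice. All the node values here are invented for illustration:

```python
def expectimax(node):
    """node is ('leaf', value), ('max', [children]) for a player
    choice, or ('chance', [(prob, child), ...]) for dice and cards."""
    kind = node[0]
    if kind == 'leaf':
        return node[1]
    if kind == 'max':
        return max(expectimax(child) for child in node[1])
    if kind == 'chance':
        return sum(p * expectimax(child) for p, child in node[1])
    raise ValueError(kind)

# A player choosing between a safe line and a die-roll gamble:
tree = ('max', [
    ('leaf', 1.0),                        # safe option
    ('chance', [(0.5, ('leaf', 3.0)),     # gamble: good roll
                (0.5, ('leaf', -2.0))]),  # gamble: bad roll
])
print(expectimax(tree))  # the safe 1.0 beats the gamble's 0.5 expectation
```

If the game really does behave like a small tree with noise layered on top, then a search that averages over the noise by sampling – which is essentially what Monte Carlo tree search does – should handle it well.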

Momentum and Morale

A key element of all three of these aspects revolves around rethinking how to model command and control. It is novel enough that I’ll wrap up this piece by considering this mechanism in detail. In place of some tried-and-true method for representing command, this game uses blocks as a form of currency – blocks that players accumulate and then spend over the course of the game. Freeman’s Farm calls this momentum, a term that helps illustrate its use. From a battlefield modeling and simulation standpoint, though, I’m not sure the term quite captures all that it does. Rather, the blocks are a sort of catch-all for the elements that contribute to successful management of your army during the battle, looking at it from the perspective of a supreme commander. They are battlefield intelligence, they are focus and intent, and they are other phrases you’ve heard used to describe the art of command. Maybe the process of accumulating blocks represents getting inside your enemy’s OODA loop, and spending blocks is the exploitation of that advantage.

Most other elements in Freeman’s Farm only drain away as time goes by. For example, your units can lose strength and they can lose morale, but they can’t regain it (a special rule here and there aside). You have a finite number of activations, after which a unit is spent – done for the day, at least in an offensive capacity. Momentum, by contrast, builds as the game goes on – either to be saved up for a final push towards victory or dribbled out here and there to improve your game.
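As a sketch of that asymmetry (and only a sketch – the field names and numbers below are my own abstraction, not the published rules), the two kinds of state might be modeled like this:

```python
from dataclasses import dataclass

@dataclass
class Unit:
    strength: int      # only ever decreases
    morale: int        # only ever decreases (special rules aside)
    activations: int   # finite; when exhausted, the unit is spent

    @property
    def spent(self) -> bool:
        return self.activations <= 0

@dataclass
class Player:
    momentum: int = 0  # the one resource that builds as the game goes on

    def bank(self, blocks: int) -> None:
        self.momentum += blocks

    def spend(self, cost: int) -> bool:
        """Spend momentum if we have it; hoard-or-dribble is the choice."""
        if cost > self.momentum:
            return False
        self.momentum -= cost
        return True
```

Everything in `Unit` drains monotonically while `Player.momentum` only accumulates – that one-way flow in opposite directions is what gives the mechanism its tension.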

Now, I don’t want to go down a rabbit hole of trying to impose meaning where there probably isn’t any. What does it mean to decide that you don’t like where things are going and you’re going to sacrifice a bit of morale to achieve a better kill ratio? Although one can construct a narrative as to what that might mean (and maybe by doing so, you’ll enjoy the game more), that doesn’t mean it is all simulation. The point is, from a game standpoint, the designer has created a neat mechanism to engage the players in the process of rolling for combat results. It also allows a player to become increasingly invested in their game, even as it takes away decisions they can make because their units have become engaged, weakened, and demoralized.

I’m going to want to come back and address this idea of modeling the game as a decision tree. How well can we apply decision trees and neural networks to the evaluation of game play? Is this, indeed, a simple yet practical application of these techniques? Or does the game’s apparent simplicity obscure a far-more-complex reality that prevents these machine-learning techniques from being applied by anyone who doesn’t have DeepMind/Google’s computing resources? Maybe I’ll be able to answer some of these questions for myself.

Chess Piece Face

A couple of months after DeepMind announced their achievement with the AlphaZero chess-playing algorithm, I wrote an entry giving my own thinking on the matter. To a large extent, I wrote of my own self-adjustment from the previous generation of neural network technology to the then cutting-edge techniques which were used to create (arguably) the world’s best chess player. Some of the concepts were new to me, and I was trying to understand them even as I explained them in my own words. Other things were left vague in the summary paper, leaving me to fill in the blanks with guesses about how the programming team might have gone about their work.

It took about another year, but the team eventually released a fully-detailed paper explaining the AlphaZero architecture, even as tech-savvy amateurs (albeit folks far more competent than I) dissected the available information. Notably, this included the creation of open-source reimplementations of AlphaZero’s proprietary technology. Now, more than three years on, there are many basic-level explanations, using diagrams and examples, explaining to non-experts what AlphaZero has done and how (for just one example, with diagram, see here).

I made a couple of serious mistakes in my previous analysis. Most importantly, I misunderstood the significance of AlphaZero’s “general-purpose Monte-Carlo tree search (MCTS) algorithm.” When I first read that, I interpreted it as simply the ability to randomly generate valid moves and thus to compile the list of possibilities which the neural network was to rank. An untrained neural net would rank its choices randomly and probably lose*, but could begin to learn how to win as better and better training sets were developed. That isn’t quite accurate. The better way to think about the AlphaZero solution is that it is fundamentally an MCTS algorithm (all caps), but one which uses a neural net as a shortcut for estimating parameters that have yet to be calculated. I think I’ll want to come back to this later, but in the meantime an article using tic-tac-toe to explain MCTS might be educational.
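For anyone who wants to see the select/expand/simulate/backpropagate loop in miniature, here it is applied to a trivial take-away game (take 1 or 2 stones; whoever takes the last stone wins) rather than chess. The key point from above: AlphaZero replaces the random-rollout step with a neural-net estimate, but the surrounding machinery looks like this:

```python
import math
import random

class Node:
    def __init__(self, stones, parent=None):
        self.stones, self.parent = stones, parent
        self.children, self.visits, self.wins = {}, 0, 0.0

def moves(stones):
    return [m for m in (1, 2) if m <= stones]

def rollout(stones):
    """Random play-out; returns 1 if the player to move now wins."""
    turn = 0
    while stones > 0:
        stones -= random.choice(moves(stones))
        if stones == 0:
            return 1 if turn == 0 else 0
        turn ^= 1

def mcts(root_stones, iters=2000):
    root = Node(root_stones)
    for _ in range(iters):
        node = root
        # 1. Select: walk down fully-expanded nodes by the UCT formula.
        while node.stones > 0 and len(node.children) == len(moves(node.stones)):
            node = max(node.children.values(),
                       key=lambda ch: (1 - ch.wins / ch.visits) +
                           1.4 * math.sqrt(math.log(node.visits) / ch.visits))
        # 2. Expand one untried move, if the node isn't terminal.
        if node.stones > 0:
            m = next(m for m in moves(node.stones) if m not in node.children)
            node.children[m] = Node(node.stones - m, parent=node)
            node = node.children[m]
        # 3. Simulate (AlphaZero would ask the value network here instead).
        result = rollout(node.stones) if node.stones > 0 else 0
        # 4. Backpropagate, flipping the perspective at each level.
        while node is not None:
            node.visits += 1
            node.wins += result
            result = 1 - result
            node = node.parent
    # Recommend the most-visited move.
    return max(root.children, key=lambda m: root.children[m].visits)

print(mcts(4))  # from 4 stones, the search should settle on taking 1
```

Even with purely random rollouts, the statistics steer the search toward the winning line; a trained network just steers it much faster and much deeper.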

My next error was in my understanding of convolutional networks. If you read my explanation, you might surmise that the source of my error was in my outdated understanding of neural network technology in general. I imagined the convolutional layer as being a set of small neural networks. Further reading (a well-illustrated example is here) about the use of convolution in image processing instructed me on the use of filters. This convolution strategy uses transformations that have been well defined through years of use in image manipulation and, in fact, often produce results familiar to those of us who play with photographs through the likes of Photoshop or GIMP. That a filter which helps sharpen images for the human eye would also help a machine intelligence with edge detection makes sense. In the broader sense, it is a simple method of encapsulating concepts like proximity in a way that does not require a neural net training process to learn about it on its own.
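To see a filter in action without any library magic, here is a hand-rolled 2-D convolution applying the classic 3×3 Laplacian edge-detection kernel (a textbook kernel, chosen for illustration – not one of AlphaZero’s learned filters):

```python
# The classic Laplacian edge-detection kernel: flat regions cancel to
# zero, sharp changes in value light up.
EDGE_KERNEL = [[-1, -1, -1],
               [-1,  8, -1],
               [-1, -1, -1]]

def convolve2d(image, kernel):
    """Valid-mode convolution (no padding) of a 2-D list of numbers."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    out = [[0] * out_w for _ in range(out_h)]
    for i in range(out_h):
        for j in range(out_w):
            out[i][j] = sum(kernel[a][b] * image[i + a][j + b]
                            for a in range(kh) for b in range(kw))
    return out

# A featureless region produces all zeros -- no edges detected:
flat = [[5] * 4 for _ in range(4)]
print(convolve2d(flat, EDGE_KERNEL))
```

A convolutional layer performs exactly this operation; the only difference is that the nine kernel weights are learned during training rather than hand-picked.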

I continue to believe that there is quite a bit of method, borne from extensive trial-and-error, in arranging the game state so that it makes sense, not only to the neural network, but also to those image processing filters. When the problem is identifying elements of an image, those real-life objects occupy distinct regions within the grid of pixels that makes up that image. For the “concepts” about a chess board during a match, what is it that causes information to be co-located within a game-turn’s representation? It seems to me that the structure of the input arrays would have to be conceived as if you were turning chess information into something that resembles a photograph. While the paper clearly states that the researchers DIDN’T find the results to depend upon data structure, it would seem this is a lot easier to get wrong than to get right.
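To illustrate what “turning chess information into something that resembles a photograph” might mean – and this is my guess at the flavor of the encoding, not AlphaZero’s actual input layout – each piece type can be given its own 8×8 binary plane, so that spatial relationships survive into a pixel-like structure:

```python
PIECES = "PNBRQKpnbrqk"  # white then black, one plane per piece type

def encode(board):
    """board: list of 8 strings, '.' for empty squares.
    Returns 12 one-hot 8x8 planes, stacked like image channels."""
    planes = [[[0] * 8 for _ in range(8)] for _ in PIECES]
    for r, row in enumerate(board):
        for c, sq in enumerate(row):
            if sq != '.':
                planes[PIECES.index(sq)][r][c] = 1
    return planes

start = ["rnbqkbnr", "pppppppp", "........", "........",
         "........", "........", "PPPPPPPP", "RNBQKBNR"]
planes = encode(start)
print(sum(sum(row) for plane in planes for row in plane))  # 32 pieces in total
```

In this form, a filter sliding over a plane “sees” nearby squares together, which is exactly the proximity property the image-processing machinery was built to exploit.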

I also note, assuming the illustration of AlphaZero’s architecture in the above paper is correct, that the neural network is extraordinarily complex. At the same time, it seems arbitrarily selected in advance. The dimensions of the architecture are in nice round numbers – 256 convolution filters, 40 residual layers, etc. – which suggests to me one (or both) of two things. First, that these are probably borrowed architectural decisions drawn from long experience with image-processing neural networks. Second, that it may not have been necessary to “optimize” the architecture itself if the results were insensitive to the details. Of course, it’s also possible that when you’re drowning in excess computing power, you’d be more than happy to vastly oversize your neural network and then rely on the training process to trim the fat rather than invest the manpower to improve the architecture manually. Further, humans may make the right architectural decisions or the wrong ones. The network training will, at all times, attempt to make the objectively-best decision AND, perhaps more importantly, continually re-evaluate those decisions with each new run. For example, if you have enough computing power and training data, the fact that 252 out of those 256 filters are excluded every time may not be particularly concerning. Especially if, at some point, a new generation of the network decides that, after all, one of those excluded 252 really is important.

I update my previous entry on this today, not because I think I finally understand it all but, instead, because I don’t. The new bits of understanding that I’ve come by, or maybe just think I’ve come by, suggest new and interesting ways to solve problems other than chess. Hopefully, over the next few weeks, I’ll be able to follow through with this line of thought and continue to explore some more interesting, and far less speculative, thoughts on using machine learning for solutions that are less academic and more accessible to the non-Google-funded hobbyist.

*Except that the networks train against an equally incompetent opponent. Which, if you think about it, is a fundamental flaw with my approach. The network would have no way to bootstrap itself into the realm of a mildly-competent player.

Pols/Polls

When I was just out of college, Sunday night was my time for laundry. After the sun went down, I would run the washer and dryer (we had machines in the apartment I was sharing) and then iron my dress shirts. At the time, I lived in the greater Los Angeles area (right on the Orange County/Los Angeles County line). As I ironed, I listened to Rodney on the ROQ and Loveline. The former I considered to be, as a self-appointed music connoisseur, an important part of keeping on top of the avant-garde of rock culture. The latter was a secret, guilty pleasure.

Loveline’s format was live, call-in radio where teens and young adults could have their relationship questions answered. An up-to-date Dr. Ruth for Generation X. I remember, now, one particular type of call. The caller set forth on a long, rambling tale of various problems in his life. Sometimes the problems were mundane and sometimes a bit absurd. Such calls seemed to happen a couple times a month, so this isn’t a particular caller that I am remembering, but the pattern. The “funny man” host, at that time Riki Rachtman, would toy with the caller, usually making vulgar jokes, especially if the story was particularly amusing. In some of the more serious narratives, the caller was left simply to spin his yarn. At some point, though, the “straight man” host – Drew Pinsky aka Dr. Drew – would cut the caller off and, seemingly unrelated to the story being told, ask “how often do you smoke pot every day?”

The caller would pause. Then, after a short but awkward silence, he would answer, “Not that much.”

“How much is ‘not that much’?” queries Dr. Drew.

“Oh, nine or ten times a day, I guess.”

The first time or two, it amazed me. It was like a magic trick. In a story that had absolutely nothing to do with drug use, or impairment, or anything implying marijuana, the Doctor would unerringly pull a substance abuse problem from out of the caller’s misfortune. The marijuana issues were obvious because they so often repeated, but Dr. Pinsky’s ability to diagnose serious issues from seemingly scant facts ranged more broadly. After much, much ironing, I realized that as important as our individuality and uniqueness are to our identities, we humans are dreadfully predictable creatures, especially when we come under stress. Few of us get the chance to see it and few of us would be willing to admit it, but you get us in the right context and our behavior becomes disturbingly predictable.

Eventually, my compensation improved and I invested the early returns into getting my work shirts professionally pressed. Rachtman would fall out with Loveline co-host Adam Carolla, and he subsequently found a home in professional wrestling. Carolla and Pinsky took Loveline to MTV so that non-Angelenos could benefit from their wisdom. I never saw the television version of the show, nor listened to any version of it since they left KROQ. In fact, I don’t believe I’ve heard anything from Rachtman, Carolla, or Dr. Drew since they were all together on Sunday night on my stereo.

So why do I dredge up old memories now?

I read three articles, all in about a 12 hour period, and together they got me thinking.

Article #1 is from a political blogger. He predicts a solid Trump win come November. He cites data on absentee ballot receipts relative to demographics and early-but-dramatic deviations from the predictions. The patterns, he explains, show that Trump voters are more active than expected while Biden voters are less involved. Because of the correlation between early voting and support for the Democrats, this early data might prove to be decisive.

Article #2 is a column from Peggy Noonan in the Wall St. Journal. Overall, the article is a self-congratulatory piece where she sees her predictions for a Biden win coming to fruition. Noonan is a lifetime Republican (she was a speechwriter for Ronald Reagan) but she has been anti-Trump from the get-go. Once it became clear that Trump would be nominated by Republicans for a second term, she focused her support on Joe Biden. Alone among those vying for the Presidency, at least to her, he represented the politics that she was used to – before Bush-Gore, before the Tea Party, before Donald Trump. She predicted that the vast middle of the American political body would gravitate to the old and the known, and she now sees that she was proven right. As evidence, she cites polling data among college-educated women. The data say that this demographic has shifted so dramatically against President Trump that the result will be not just a Biden win, but a Biden landslide. A one-sided massacre, the likes of which should be entirely impossible in this hopelessly divided nation.

#3 is about State races. The bottom half of the ticket gets mostly ignored by the media and yet, if you’re a voter in American elections, this is where you have your best chance to influence policy. For most of us, our vote for President is already cast, whether we’ve voted early or not. We live in States where the outcome has been known since before the conventions and so, whatever our individual preference, we see which way our electors’ votes are going to fall. Even in the “battleground states” each voter is but one check mark in a sea of millions. The odds that your vote could decide the outcome are astronomically small. Contrast that with the election of State Representatives. There, vote totals are in the thousands, not the millions, and elections can be decided by a handful of votes. Add in a little pre-election advocacy, and the average citizen can have a real influence on the outcome. The lower house of State Government might seem like small potatoes compared to U.S. Senate, but States do have power and small elections occasionally produce big outcomes.

Article #3 presented polling data on State House races, making it one of the few that has been or will be written. The polling outfits aren’t particularly interested in these low level races because the public doesn’t show much interest. Furthermore, the calculus is considerably different than that which drives the national races and, often, it takes a politically-savvy local person to understand the nuance. In the media’s defense, the biggest factor in many of these smaller races is what happens “at the top of the ticket.” Pro-Trump/anti-Trump sentiment is going to determine far more elections than the unique issues that impact Backwoods County in some smaller state. In fact, the data cited in this write-up was about the disparity in down-ballot voting between parties. Based on responses, Republicans look to be considerably less likely to vote for “their” State Representative candidates than Democrats.

The reason I saw this article was that it was being heavily criticized on social media – and criticized unfairly, in my opinion. First of all, in the macro sense, the article identified the top two predictors of State races, albeit obliquely (it was just poll data, no predictions of electoral results). What’s going to decide the close races at the State level is the relative turnout of pro- and anti- Trump voters, plus the motivation of those voters to consider all the other races that are on their ballots. However, the main complaint from the critics was of the polling methodology. The poll sample was just over 1000 respondents. How crazy is it, said the critics, trying to predict dozens and dozens of races from a poll which spoke to so few voters – maybe a handful from any given district containing thousands who will be voting next month?

It is this last bit, especially, that made me think of Dr. Drew.

Before I get to that, though, let us ponder statistical methods for a second. When I first encountered some real-world applications of sampling and prediction, I was shocked by the rather small amount of collection that is necessary to model large amounts of data. If you know you have a normal distribution (bell curve), but you need to determine its parameters, you need only a handful of points to figure it all out. Another few points will raise your confidence in your predictions to very high levels. The key, of course, is to know the right model for your data. If your data are bell-curve-like, but not strictly normal, your ability to predict is going to be lower and your margin of error is going to be higher, even after collecting many extra samples. If your data are not at all normally distributed, but you choose, anyway, to model them so (maybe not the worst idea, really), you might see large errors and decidedly wrong predictions. This is all well understood. Much science and engineering has gone into designing processes such as the sampling of product for quality assurance purposes. We know how to minimize sampling and maximize production at a consistent quality.
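A quick demonstration of the point, using synthetic data with an invented mean and spread: a handful of points gets you close, and additional points mostly just tighten the error bars.

```python
import random
import statistics

random.seed(42)
true_mean, true_sd = 100.0, 15.0  # invented "population" parameters

for n in (10, 100, 1000):
    sample = [random.gauss(true_mean, true_sd) for _ in range(n)]
    est_mean = statistics.mean(sample)
    # The standard error of the mean shrinks like 1/sqrt(n).
    sem = statistics.stdev(sample) / n ** 0.5
    print(f"n={n:5d}  estimated mean {est_mean:6.2f}  +/- {1.96 * sem:.2f}")
```

Going from 10 samples to 1,000 buys only a ten-fold reduction in the error bar – and none of it helps if the bell-curve assumption itself is wrong, which is exactly the modeling risk described above.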

But what about people? They are complex, unpredictable, and difficult to model, aren’t they? Can you really ask 1000 people what they think and use it to guess how millions of people are going to vote? Well, if you’re Dr. Drew, you’d know that people are a lot more predictable than we think we are. Behaviors tend to correlate, and that allows a psychiatrist, a family physician, or maybe even a pollster to know what you are going to do before you do yourself. Furthermore, we are talking about aggregate outcomes here. I may have a hard time predicting whom you would vote for but, give me your Zip Code, and I can probably get a pretty accurate model of how you plus all your neighbors will vote.
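The arithmetic behind “ask 1000 people” is worth seeing once. For a proportion near 50%, the textbook 95% margin of error is 1.96 · sqrt(p(1−p)/n), which is where the familiar “plus or minus three points” comes from – assuming, crucially, a representative sample:

```python
from math import sqrt

def margin_of_error(p, n, z=1.96):
    """95% margin of error for an estimated proportion p from n respondents."""
    return z * sqrt(p * (1 - p) / n)

for n in (100, 1000, 10000):
    print(f"n={n:6d}  margin of error {100 * margin_of_error(0.5, n):.1f} points")
```

Note the diminishing returns: tripling the precision costs roughly ten times the sample. And all of it presumes the 1,000 respondents actually mirror the electorate, which is the modeling problem the rest of this post worries about.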

That model, the underlying assumptions that we make about the data, is the key to accuracy and even validity. Is my sample random enough? Should it be random, or should it match a demographic that corresponds to voter turnout? If the latter, how do I model voter turnout? The questions go on and on and help explain why polling is done by a handful of organizations with long experience at what they do. If you really, really understand the underlying data, though, a very small sample will very accurately predict the full outcome. Maybe I only have to talk to married, college-educated women, because I know that the variation in their preferences will determine the election. Maybe all I need is the Zip Codes from absentee ballot returns. Or maybe, after I produce poll-after-poll with a margin-of-error of a percent or two, I’ll wind up getting the election outcome spectacularly wrong.

This is a fascinating time for those in the business of polling. Almost nobody was even close when it came to predicting the 2016 Presidential Election. Some of that was the personal bias of those who do the polling. I’d like to think that, more often than not, it was bad modeling of voters which led to honest, albeit rather large, mistakes. Part of me would really, really like to see inside these models. Not, as one might imagine, so I could try to predict the election results myself. Rather, I’d like to see how the industry is dealing with its failure last time around and how it has adjusted its processes (amidst very little basis for doing so) to try to get this election right. When I see simultaneous touting of both a Trump landslide and a Biden landslide, I know that somebody has got to be wrong. Is anybody about to get it right? If they are, how are they doing it?

This is something I’d like to understand.

Case/Point

My writing isn’t always as clear as it could be. Fortunately, the news of the day has provided me with an illustration of what I’m talking about.

I’ve seen other writers address the same point that I was making in my previous post. Many of them have done it far better than I did – or could. I saw a graphic on-line that compared the “curve” for COVID-19 with a much bigger curve labeled Famine. Yesterday morning, the Wall St. Journal printed an editorial about why inflation might be a problem this time around (unlike 2008). This echoes, albeit distantly, the on-line dread of “inevitable” hyper-inflation that must result from our current massive infusion of borrowed/printed money. I’ve seen other on-line discussion analyzing the rate of closures among restaurants and projecting that out to estimate economic damage. I should welcome all these opinions as those of fellow travelers, because the majority of the articles that I read are focused on how things are already turning around and how quickly the economy will recover. Yet I feel like even those with whom I agree the most are missing the real issue at play here.

Some, any, or all of the economic red flags might be at play now, soon, or maybe not until later. Or maybe not. My point was that there is simply no way to know.

So, here’s what happened instead.

“But,” you must be thinking, “this has nothing to do with viruses or restaurant shutdowns.” It does, though. Everything we do as a society is interrelated and nothing happens in isolation. Would a police officer have killed a man in broad daylight, on camera, absent the environment that “quarantine” and “lock-down” engendered? I have a hard time believing that he would have.

Folks have torched Targets before*, and life goes on. But could it, this time, be the straw that breaks Target’s humped back, when 28 stores shut down amidst government-imposed 50% capacity reductions, supply chain disruptions, and inflationary pressures?

It could be. Or maybe not. There is simply no way to know.

*OK. I don’t actually know whether Target has been the, eh, “target” of looting during past periods of civil unrest. A quick search doesn’t help me. In any case, I mean this metaphorically.

Lead/Lag

In control engineering, the concepts of stability and instability precede the design of the control algorithms themselves. For an engineering student, studying and understanding the behavior of dynamic systems is a prerequisite to controlling them. Elementary as it is as an engineering concept, it often seems to be beyond our collective comprehension as a society, even if it is well within our individual grasp. While most of us may scratch our heads when looking at a Nyquist Plot, we do understand that you’ve got to start backing off the accelerator even before you get up to your desired speed.
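
The accelerator intuition can be sketched in a few lines (a toy of my own construction, not a real vehicle model): a proportional controller acting on slightly delayed information behaves itself at low gain, while a heavy foot turns into overshoot and growing oscillation.

```python
def simulate(gain, steps=60, target=60.0, delay=3):
    """Proportional speed control with a measurement delay (reaction time).
    Returns the peak speed reached; too much gain means overshoot."""
    speed, history = 0.0, [0.0] * delay
    peak = 0.0
    for _ in range(steps):
        measured = history[-delay]           # we act on old information
        speed += gain * (target - measured)  # throttle proportional to error
        history.append(speed)
        peak = max(peak, speed)
    return peak

print(simulate(0.1))  # gentle driver: eases up to 60 with little overshoot
print(simulate(0.5))  # heavy foot: blows past 60 and the swings keep growing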

When I was a younger man, I had a roommate who drove a Datsun 280Z. This was a nifty little car. Price-wise, it was accessible to the young and the working class without breaking the bank. Not only was it affordable to buy, it was affordable to own. It was a pleasure to drive and, to top it all off, it was fairly impressive performance-wise. A nice example of the “roadster” style.

We took a few summer road trips together and, on more than one occasion, he asked me to drive so he could take a nap, sober up, or just monkey with his cassette deck. Before letting me drive for the first time, he gave me a bit of a rundown on his car’s performance. “Don’t go over 90 mph,” he warned me, as the car would shake violently shortly after crossing the 90 mark on the speedometer. Needless to say, the car never shook itself apart while I was driving. Perhaps that was because, as a conscientious and responsible youth, I would never exceed the speed limit. Perhaps it was because he wasn’t entirely correct about the circumstances under which his car would have vibration issues.

My point is, most of us have a gut understanding of frequency response and stability and the struggles of controlling them. The seriousness of the problem is exposed in the design of mechanical systems and, in particular, those that incorporate high-frequency rotation of components. Deeper understanding and mathematical analyses are necessary prerequisites to assembling a piece of machinery that will hurtle through the night at speeds approaching 100 mph. In the case of my friend’s Datsun, as cyclic energy is induced into a system, it is possible for those inputs to resonate in the spring-like elements of the system’s passive structure. Without proper analysis and design, a vehicle’s suspension system might well start to exhibit extreme vibration at high speeds. The same applies to any dynamic system. We are all familiar with the violently shaking washing machine, whether we have one in our own home or not.
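
The Datsun’s shaking can be sketched with the textbook formula for a driven mass-spring-damper: the steady-state amplitude spikes when the driving frequency approaches the system’s natural frequency. The parameter values below are arbitrary, chosen only to make the spike visible.

```python
import math

def amplitude(omega, m=1.0, k=100.0, c=1.0, f0=1.0):
    """Steady-state amplitude of a driven mass-spring-damper:
    f0 / sqrt((k - m*w^2)^2 + (c*w)^2)."""
    return f0 / math.sqrt((k - m * omega**2) ** 2 + (c * omega) ** 2)

natural = math.sqrt(100.0 / 1.0)  # natural frequency = sqrt(k/m) = 10 rad/s
for w in (2.0, 5.0, 10.0, 20.0):
    print(w, amplitude(w))  # the response at 10 rad/s dwarfs the others
```

Tie the driving frequency to wheel rotation and the resonance peak becomes a speed: cross it, and the shaking starts.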

Naturally, the mathematics apply to non-mechanical systems as well. Often the effects are far more serious than a shaking car or a jumping washing machine. In electric circuits, resonance can produce seemingly impossibly high voltages and currents. Water hammer in a hydraulic system can crush equipment and cause explosions. The analyses that help us understand these physical phenomena, I’ll argue today, would also help us understand interactions in social systems and the effect of a “black swan event,” if we allow them to.

It’s the Stupid Economy

The sometimes-sizeable gap between “gut feel” and mathematical certainty is particularly common with complex systems. Coincidentally, our body politic is eager to tackle the most complex of systems, attempting to control them through taxation and regulation. The global climate and national economies seem to be a recent, and often interconnected, favorite. I shall leave the arguments of climate science and engineering to others and, today, focus on the economy. When it comes to the intersection of economics and politics, I have noticed a pattern: the thinking is shockingly short-term. Shocking, because the economic environment may be the number one predictor for the outcome of an election. A good economy strongly favors the incumbents whereas economic misery almost guarantees a changing of the guard. You would think that if economic conditions are what matter most to us, when it comes to our one contribution to the governance of society, we’d be eager to get it right. Yet what seems to matter most are the economic conditions on the day of the polling. Four years of economic growth doesn’t mean much if the economy tanks on the 30th of October.

In something of a mixed blessing, the recent political free-for-all has challenged this shortsightedness, at least somewhat. President Obama, for years, blamed his predecessor for the recession and deficit spending, even as a negative economic climate persisted for years into his own term. He even famously took credit for the positive economic indicators during his successor’s term. His opponents, of course, sought to do the opposite. The truth is far more nuanced than either side cares to admit, but at least popular culture is broaching the subject. Most of us know that if “the economy” is looking up the day after the President signs off on a new initiative, it wasn’t his signature that did it. Or, more accurately, his signature can’t possibly account for the entirety of the impact, which may take months or years to reveal its full effect.

Going Viral

We have a further advantage when it comes to talking about the interaction between the economy, the novel coronavirus, and the resultant economic shutdown. The media has inundated us with bell curves and two-week lags. Most of us can appreciate the math that says if a Governor closes bars and restaurants today, we shouldn’t yet be looking for the results in our statistics tomorrow. Nonetheless, our collective grasp of dynamic systems and probabilities is tenuous under the best of circumstances. Mix in high levels of fear and incessant media hype, and even things that should be obvious become lost in the surrounding clamor.
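
That two-week lag can be made concrete with a toy model (all numbers invented): actual infections respond to the policy immediately, but the reported statistics, which trail by a fixed lag, keep climbing for nearly two weeks afterward.

```python
GROWTH_BEFORE, GROWTH_AFTER = 1.10, 0.95
LAG = 14  # days from infection to appearing in the statistics

# Daily infections: growing until the policy change on day 20, shrinking after.
infections = [100.0]
for day in range(1, 60):
    rate = GROWTH_BEFORE if day < 20 else GROWTH_AFTER
    infections.append(infections[-1] * rate)

# Reported numbers simply trail actual infections by LAG days.
reported = [infections[max(0, d - LAG)] for d in range(60)]

# Reported cases peak well after the policy takes effect.
peak_day = max(range(60), key=lambda d: reported[d])
print(peak_day)
```

In this sketch the policy bites on day 20, yet the reported curve doesn’t crest until day 33: anyone judging the Governor by tomorrow’s numbers is grading the wrong fortnight.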

Shift the playing field to economics and the conversation gets even murkier.

“The economy” is at the high end of complex and chaotic systems. It is, after all, not an external system that we can observe and interact with, nor is it subject to the laws of physics. Rather, it is the collective behavior of all of us, each and every individual, and how we interact with each other to produce, consume, and exchange. Indeed, one might speculate on where the boundaries really lie. It seems a bit insensitive to label everything as “economic activity” during a health crisis, but what is it that we can exclude? Waking and sleeping, each of us is in the process of consuming food, water, clothing, and shelter. Most social interactions also involve some aspect of contract, production, or consumption. Even if we can isolate an activity that seems immune to all that, all human activity still occurs within the structure that “society,” and thus “the economy,” provides.

Within that framework, anyone who claims to “understand” the economy is almost certainly talking about a simplified model and/or a restricted subset of economic activity. Either that, or they are delusional. Real economic activity cannot be understood. Even if the human mind were vastly more capable, the interaction of every human being on the planet is, quite simply, unpredictable. Because of this, we use proxies for economic activity as a way to measure health and the effects of policy. GDP and GDP growth are very common. Stock market performance substitutes for economic health in most of our minds and in the daily media. Business starts, unemployment numbers, average wages – each of these is used to gauge what is going on with the economy. However, every one of these metrics is incomplete at best and, more often than not, downright inaccurate in absolute terms.

Of course, it isn’t quite as bad as I make it out to be. GDP growth may contain plenty of spurious data, but if we seek to understand what is included and not included, and apply it consistently, we can obtain feedback that guides our policymaking. For example, we could assume that noisy prices associated with volatile commodities are not relevant to overall inflation numbers, or we can exclude certain categories when calculating GDP for the purpose of determining inflation. As long as we’re comparing apples to apples, our policy will be consistent.
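
The apples-to-apples idea can be sketched with an invented four-category basket: a “core” measure simply drops the volatile category and reweights what remains. The weights and price changes below are made up for illustration.

```python
# Hypothetical basket: category -> (weight, year-over-year price change)
basket = {
    "food":    (0.15, 0.08),
    "energy":  (0.10, 0.30),   # the volatile commodity category
    "housing": (0.40, 0.03),
    "other":   (0.35, 0.02),
}

def inflation(categories):
    """Weighted average price change, reweighted so weights sum to 1."""
    total_weight = sum(basket[c][0] for c in categories)
    return sum(basket[c][0] * basket[c][1] for c in categories) / total_weight

headline = inflation(basket)                             # all categories
core = inflation([c for c in basket if c != "energy"])   # exclude the volatile one
print(round(headline, 4), round(core, 4))
```

Neither number is “the truth”; the point is only that whichever convention we pick, applying it consistently lets successive readings be compared.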

What happens, though, when we get the economic equivalent of a hydraulic shock? In this case, our models of the economic world no longer apply and the world enters an entirely unpredicted and unpredictable realm. We know this. What I want to explore, however, is what happens to our ability to “control” that system. My guess is it fails. It fails because we, again collectively, don’t appreciate the characteristics of dynamic systems. Yes, we understand it in terms of the heuristics we’ve traditionally used: interest rates have to be raised before inflation kicks in to keep it from spiraling out of control. But what inflation will result from a $5 trillion stimulus at a time of 30% unemployment? Do we need higher or lower interest rates? In other words, when our traditional metrics fail us, will we truly appreciate the complex nature of the system?

In Control

During our imposed down-time, I re-watched an excellent film about the now-10-plus-year-old financial crisis induced by the housing market. The film The Big Short was made in 2015 based on Michael Lewis’ 2010 book of the same name. It dramatizes the subprime housing market collapse as seen by a handful of investors who saw it coming. As much as the story seems, today, in our distant past, there are those among us who feel that what we witnessed in 2008 was just the opening chapters in a longer tale. Whether a housing crisis is our past or our future, there are lessons to be applied to the present day.

The film’s story opens in 2005. Investor Michael Burry, reading the details of mortgage-backed security prospectuses, determines that the housing market is unstable and the financial instruments built upon it are doomed to fail. Unable to take a contrarian financial position using existing instruments, he commissions the creation of the credit default swap to allow him to bet that the mortgage market will fail. The film concludes when Burry, and several others who bet against the housing market, liquidate their positions at a profit, sometime after Spring, 2008. The real-life Burry had actually been analyzing data from 2003 and 2004 before making his predictions and his commitment. Burry later wrote a piece for the New York Times saying that the housing market failure was predictable as much as four or five years out.

Putting this another way: by 2004 or 2005, the massive financial crisis of 2008-2010 had already happened; we just didn’t realize it yet. One might argue that sometime in those intervening four years, sanity might have come over America’s banks and the prospective home-owners to whom they were lending, but of course it didn’t. The reasons are many why it didn’t; why perhaps it couldn’t. Thus the events that all-but-inevitably put us on the road to global financial collapse happened [four, five, more?] years in advance of what we would consider the start of the crisis. Unemployment numbers didn’t recover until 2014. That implies that for the individual, perhaps someone becoming unemployed and being unable to find a new position circa 2014, the impact of the collapse may have taken more than a decade to manifest itself.

Again, let’s look at it from a different angle. Suppose I wanted to avoid the tragedy to that individual who, in 2014, became unemployed. Let’s imagine that, as a result of his lack of employment, he died. Maybe it was suicide or opioid addiction. Maybe the job loss turned into a home loss and his whole family suffered. Suppose, as a policy maker, I wanted to take macro-economic action to avoid that unnecessary death. How soon would I have had to act? 2003? 2000? Sometime in the 1990s?

Next Time

All of this comes to mind today as a result of the talk I am seeing among my fellow citizens. People are angry, although that isn’t entirely new. Some are angry because their livelihoods have been shut down while others are angry that folks would risk lives and health merely to return to those livelihoods. In the vast majority of cases, however, the fear is about near term effects. Will my restaurant go bankrupt given the next few weeks or months of cash-flow? What will the virus do two weeks after the end of lockdown? Will there be a “second wave” next fall? A recent on-line comment remarked that, although the recovery phase would see bumps along the road, “We’ll figure it out. We always do.”

Statistically, that sentiment is broadly reflected in the population at large. A summary of poll data through the end of March (http://www.apnorc.org/projects/Pages/Personal-impacts-of-the-coronavirus-outbreak-.aspx) suggested similar thinking. A majority of those currently out-of-work see no problems with returning “once it’s over.” In fact, a majority figure that by next year they’ll be as good as or better off financially than they are now. Statements like “we’ll get through this and come out stronger than ever” can be very motivational, but extending that to all aspects of economic and financial health seems a bit blind.

We’re losing track of the macro-economic forest for the personally experienced trees. We’ve all seen the arguments. Is it better to let grandpa die so that the corner burger shack can open back up a few weeks earlier? The counter-argument cites the financial impact of keeping the economy mostly closed down for a few more weeks. This isn’t the point, though, is it? On all sides of the argument it seems that the assumption is that we can just flip everything back on and get back to business. We are oblivious to the admittedly unanswerable question – how much damage has already been done?

Unprecedented

Words like “historic” and “unprecedented” are tossed around like confetti, but not without reason. In many ways our government and our society have done things – already done things, mind you – that have never happened before in the history of man. At first, the “destruction” seemed purely financial. Restaurants being shut down meant a loss in economic activity; a destruction of GDP. But is that even a real thing? Can’t we just use a stimulus bill to replace what is lost and call it even? But as April turns into May, we’re starting to see stories of real and literal destruction, not just lost opportunity. Milk is dumped because it can’t be processed. Vegetables are plowed under. Beef cattle and chickens are killed without processing. This is actual destruction of real goods. Necessary goods. How can this go away with a reopening and some forgivable loans?

None of the experience gained through the financial crises of my lifetime would seem to apply. Even the Great Depression, while comparable in magnitude, seems to miss the mark in terms of methodology. We’re simultaneously looking at a supply shock, a consumer depression, and inflationary fiscal policy. It’s all the different flavors of financial crisis, but all at the same time. Imagine a hydraulic shock in some rotating equipment where the control system itself has encountered a critical failure. I’ve decided that, for me, the best comparison is the Second World War. Global warfare pulled a significant fraction of young men out of the workforce, many never to return. Shortages ravaged the economy, both through the disruption of commerce as well as the effects of rationing. A sizeable percentage of the American economic output was shipped overseas and blown up; gone.

Yet we got through it. We always do.

But we did so because we were willing to make sacrifices for the good of the nation and the good of the free world. We also lost a lot of lives and a lot of materiel. If “we” includes the citizens of Germany or Ukraine, the devastation to society and culture was close to total, depending on where they called home. So, yes, civilization came through the Second World War and, as of a year or so ago, we were arguably better off than ever, but for far too many that “return to normalcy” took more than a generation. Will that be the price we have to pay to “flatten the curve?”

Related

In the fall of 1915, after ten years of analysis, Albert Einstein presented his gravitational field equations of general relativity in a series of lectures at the Royal Prussian Academy of Sciences. The final lecture was delivered on November 25th, 104 years ago.

Yet it wasn’t until a month or so ago that I got a bug up my butt about general relativity. I was focused on some of the paradox-like results of the special theory of relativity and was given to understand, without actually understanding, that the general theory of relativity would solve them. Not to dwell in detail on my own psychological shortcomings, but I was starting to obsess about the matter a bit.

Merciful it was that I came across The Perfect Theory: A Century Of Geniuses And The Battle Over General Relativity when I did. In its prologue, author Pedro G. Ferreira explains how he himself (and others he knows in the field) can get bitten by the Einstein bug and how one can feel compelled to spend the remainder of one’s life investigating and exploring general relativity. His book explains the allure and the promise of ongoing research into the fundamental nature of the universe.

The Perfect Theory tells its story through the personalities who formulated, defended, and/or opposed the various theories, starting with Einstein’s work on general relativity. Einstein’s conception of special relativity came, for the most part, while sitting at his desk during his day job and performing thought experiments. He was dismissive of mathematics, colorfully explaining “[O]nce you start calculating you shit yourself up before you know it” and more eloquently dubbing the math “superfluous erudition.” His special relativity was incomplete in that it excluded the effects of gravity and acceleration. Groundbreaking though his formulation of special relativity was, he felt there had to be more to it. Further thought experiments told him that gravity and acceleration were related (perhaps even identical), but his intuition failed to close the gap between what he felt had to be true and what worked. The solution came from representing space and time as a non-Euclidean continuum, a very complex mathematical proposition. The equations are a thing of beauty but are also beyond the mathematical capabilities of most of us. They have also been incredibly capable of predicting physical phenomena that even Albert Einstein himself didn’t think were possible.

From Einstein, the book walks us through the ensuing century, looking at the greatest minds who worked with the implications of Einstein’s field equations. The Perfect Theory reads much like a techno-thriller as it sets up and then resolves conflicts within the scientific world. The science and math themselves obviously play a role, and Ferreira has a gift for explaining concepts at an elementary level without trivializing them.

Stephen Hawking famously was told that every formula he included in A Brief History of Time would cut his sales in half. Hawking compromised by including only Einstein’s most famous formula, E = mc². Ferreira does Hawking one better, including only the notation, not the full formula, of the Einstein Tensor in an elaboration on Richard Feynman’s story about efforts to find a cab to a relativity conference, as told in Surely You’re Joking, Mr. Feynman. The left side of that equation can be written as Gμν. This is included, not in an attempt to use the mathematics to explain the theory, but to illustrate Feynman’s punch line. Feynman described fellow relativity-conference goers as people “with heads in the air, not reacting to the environment, and mumbling things like gee-mu-nu gee-mu-nu”. Thus, the world of relativity enthusiasts is succinctly summarized.

The most tantalizing tidbit in The Perfect Theory is offered up in the prologue and then returned to at the end. Ferreira predicts that this century will be the century of general relativity, in the same way the last century was dominated by quantum theory. It is his belief we are on the verge of major new discoveries about the nature of gravity and that some of these discoveries will fundamentally change how we look at and interact with the universe. Some additional enthusiasm shines through in his epilogue where he notes the process of identifying and debunking a measurement of gravitational waves that occurred around the time the book was published.

By the end of the book, his exposition begins to lean toward the personal. Ferreira has an academic interest in modified theories of gravity, a focus that is outside the mainstream. He references, as he has elsewhere in the book, the systematic hostility toward unpopular theories and unpopular researchers. In some cases, this resistance means a non-mainstream researcher will be unable to get published or unable to get funding. In the case of modified gravity, he hints that this niche field potentially threatens the livelihood of physicists who have built their careers on Einstein’s theory of gravity. In fact, it wasn’t so long ago that certain aspects of Einstein’s theory were themselves shunned by academia. As a case in point, the term “Big Bang” was actually coined as a pejorative for an idea that, while mathematically sound, was too absurd to be taken as serious science. Today, we recognize it as a factual and scientific description of the origin of our universe. Ferreira shows us a disturbing facet of the machinery that determines what we, as a society and a culture, understand as fundamental truth. I’m quite sure this bias isn’t restricted to his field. In fact, my guess would be that other, more openly-politicized fields exhibit this trend to an even greater degree.

Ferreira’s optimism is infectious. In my personal opinion, if there is to be an explosion of science, it may come from a different direction than the one Ferreira implies. One of his anecdotes involves the decision of the United States to defund the Laser Interferometer Space Antenna (LISA), a multi-billion dollar project to use a trio of satellites to measure gravitational waves. To the LISA advocates, we could be buying a “gravitational telescope,” as revolutionary relative to current technologies as radio telescopes were to optical telescopes. The ability to see farther away and further back in time would then produce new insights into the origins of the universe. But will the taxpayer spend billions on such a thing? Should he?

Rather than in the abstract, I’d say the key to the impending relativity revolution is found in Ferreira’s own description of the quantum revolution of the past century. It was the engineering applications of quantum theory, primarily to the development of atomic weapons, that brought to it much of the initial focus of interest and funding. By the end of the century, new and practical applications for quantum technology were well within our grasp. My belief is that a true, um, quantum leap forward in general relativity will come from the promise of practical benefit rather than fundamental research.

In one of the last chapters, Ferreira mentions that he has two textbooks on relativity in his office. In part, he is making a point about a changing-of-the-guard in both relativity science and scientists, but I assume he also keeps them because they are informative. I’ve ordered one, and perhaps I can return to my philosophical meanderings once I’m capable of doing some simple math. Before I found The Perfect Theory, I had been searching online for a layman’s tutorial on relativity. Along the way, I stumbled across a simple assertion; one that seems plausible although I don’t know if it really has any merit. The statement was something to the effect that there is no “gravitational force.” An object whose velocity vector is bent (accelerated) by gravitational effects is, in fact, simply traveling a straight line within the curvature of spacetime. If I could smarten myself up to the point where I could determine the legitimacy of such a statement, I think I could call that an accomplishment.
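
For what it’s worth, that assertion matches the standard formulation found in the textbooks: a freely falling object follows the geodesic equation,

```latex
\frac{d^2 x^\mu}{d\tau^2}
  + \Gamma^\mu_{\alpha\beta}\,
    \frac{dx^\alpha}{d\tau}\frac{dx^\beta}{d\tau} = 0
```

where the connection coefficients Γ encode the curvature of spacetime. In flat spacetime the Γ terms vanish and the equation reduces to unaccelerated straight-line motion; “gravity” enters only through the curvature terms, never as a force term on the right-hand side.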

The not-so-Friendly Skies

This past weekend, the Wall St. Journal published a front-page article detailing the investigation into the recent, deadly crashes of Boeing 737 MAX aircraft. It is a pretty extensive combination of information that I had seen before, new insights, and interviews with insiders. Cutting to the chase, they placed a large chunk of the blame on a regulatory structure that puts too much weight on the shoulders of the pilots. They showed, with a timeline, how many conflicting alarms the pilots received within a four-second period. If the pilots could have figured out the problem in those four seconds and taken the prescribed action, they could have saved the plane. The fact that the pilots had a procedure that they should have followed means the system fits within the safety guidelines for aircraft systems design.

Reading the article, I couldn’t help but think of another article that I read a few months back. I was directed to the older article by a friend, a software professional, on social media. His link was to an IEEE article that is now locked behind their members-only portal. The IEEE article, however, was a version of a blog post by the author and that original post remains available on Medium.

This detailed analysis is even longer than the newspaper version, but also very informative. Like the Wall St. Journal, the blog post traces the history behind the design of the hardware and software systems that went into the MAX’s upgrade. Informed speculation describes how the systems of those aircraft caused the crash and, furthermore, how those systems came to be in the first place. As long as it is, I found it well worth the time to read in its entirety.

On my friend’s social media share, there was a comment to the effect that software developers should understand the underlying systems for which they are writing software. My immediate reaction was a “no,” and it’s that reaction I want to talk about here. I’ll also point out that Mr. Travis, the blog-post author and a programmer, is not blaming programmers or even programming methodology per se. His criticism is at the highest level: for the corporate culture and processes and for the regulatory environment which governs these corporations. In this I generally agree with him, although I could probably nitpick some of his points. But first, the question of the software developer and what they should, can, and sometimes don’t understand.

There was a time, in my own career and (I would assume) in the career of the author, that statements about the requisite knowledge of programmers made sense. It was probably even industry practice to ensure that developers of control system software understood the controls engineering aspects of what they were supposed to be doing. Avionics software was probably an exception, rather than the rule, in that the industry was an early adopter of formal processes. For much of the software-elements-of-engineering-systems industry, programmers came from a wide mix of backgrounds and a key component of that background was what programmers might call “domain experience.” Fortran could be taught in a classroom but ten years worth of industry experience had to come the hard way.

Since we’ve been reminiscing about the artificial intelligence industry of the 80s and 90s, I’ll go back there again. I’ve discussed the neural network state-of-the-art, such as it was, of that time. Neural networks were intended to allow the machines to extract information about the system which the programmers didn’t have. Another solution to the same category of problems, again one that seemed to hold promise, was a category called expert systems, which was to directly make use of those experts who did have the knowledge that the programmers lacked. Typically, expert systems were programs built around a data set of “rules.” The rules contained descriptions of the action of the software relative to the physical system in a way that would be intuitive to a non-programmer. The goal was a division of labor. Software developers, experts in the programming, would develop a system to collect, synthesize, and execute the rules in an optimized way. Engineers or scientists, experts in the system being controlled, would create those rules without having to worry about the software engineering.
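
That division of labor can be sketched in miniature (a toy of my own, not any real engine): the rules are plain data that a domain expert could write, and the generic “inference engine” applies them by forward chaining until nothing new can be concluded.

```python
# Rules as data, in the spirit of 80s expert systems: each rule is a
# (name, set-of-premises, conclusion) triple an engineer could author.
RULES = [
    ("r1", {"nose_rising", "high_altitude"}, "stall_risk"),
    ("r2", {"stall_risk"}, "push_nose_down"),
]

def forward_chain(facts, rules):
    """Generic inference engine: fire any rule whose premises are all
    known facts, adding its conclusion, until a fixed point is reached."""
    facts = set(facts)
    changed = True
    while changed:
        changed = False
        for _name, premises, conclusion in rules:
            if premises <= facts and conclusion not in facts:
                facts.add(conclusion)
                changed = True
    return facts

print(sorted(forward_chain({"nose_rising", "high_altitude"}, RULES)))
```

The engine knows nothing about aircraft; all the domain knowledge lives in RULES, which is exactly the separation the paragraph above describes.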

Was this a good idea? In retrospect, maybe not. While neural networks have found a new niche in today’s software world, expert systems remain an odd artifact found on the fringes of software design. So if it isn’t a good idea, why not? One question I remember being asked way back when gets at the why. Is there anything you can do with a rule-based system that you couldn’t also implement with standard software techniques? To put it another way, is my implemented expert system engine capable of doing anything that I couldn’t have my C++ team code up? The answer was, I think obviously, “no,” followed, perhaps, by a justification about the improved development efficiencies that might come from the expert systems approach.

Why take this particular trip down memory lane? Boeing’s system is not what we’d classify as AI. However, I want to focus on a particular software flaw implicated as a proximate cause of the crashes: the one that uses the angle-of-attack sensors to avert stalls. Aboard the Boeing MAX, this is the “Maneuvering Characteristics Augmentation System” (MCAS). It is intended to enhance the pilot’s operation of the plane by automatically correcting for, and thereby eliminating, rare and non-intuitive flight conditions. Explaining the purpose of the system with more pedestrian terminology, Mr. Travis’ blog calls it the “cheap way to prevent a stall when the pilots punch it” system. It was made a part of the Boeing MAX as a way to keep the airplane’s operation the same as it has always been, using feedback about the angle-of-attack to avoid a condition that could occur only on a Boeing MAX.

On a large aircraft, the angle-of-attack sensors are redundant. There is one on each side of the plane, and both the pilot and the co-pilot have indicators for their side’s sensor. Thus, if the pilot’s sensor fails and he sees a faulty reading, his co-pilot will still be seeing a good reading and can offer a different explanation for what he is seeing. As implemented, the software is part of the pilot’s control loop. MCAS is quickly, silently, and automatically doing what the pilot would be doing, were he to have noticed that the nose of the plane was rising toward a stall condition at high altitude. What it misses is the human interaction between the pilot and his co-pilot that might occur if the nose-up condition were falsely indicated by a faulty sensor. The pilot might say, “I see the nose rising too high. I’m pushing the nose down, but it doesn’t seem to be responding correctly.” At this point, the co-pilot might respond, “I don’t see that. My angle-of-attack reads normal.” This should lead them to deciding the pilot should not, in fact, be responding to the warning produced by his own sensor.

Now, according to the Wall St. Journal article, Boeing wasn’t so blind as to simply ignore the possibility of a sensor failure. This wasn’t explained in the Medium article, but there are (and were) other systems that should have alerted a stricken flight crew to an incompatible difference in values between the two angle-of-attack sensors. Further, there was a procedure, called the “runaway stabilizer checklist,” that was to be enacted under that condition. Proper following of that checklist (within the 4-second window, mind you) would have resulted in the deactivation of the MCAS system in reaction to the sensor failure. But why not, instead, design the MCAS system to either a) take all available, relevant sensors as input before assuming corrective action is necessary or b) take as input the warning about conflicting sensor readings? I won’t pretend to understand all that goes into this system. There are probably any number of reasons (some good, some not-so-good, and some entirely compelling) that drove Boeing to this particular solution. It is for that reason that I led off with my expert system analogy; since I’m making the analogies, I can claim to understand, entirely, the problem that I’m defining.
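Option a) can be sketched in a few lines of illustrative logic. To be clear, this is a hypothetical cross-check with invented threshold values, not a description of Boeing's actual implementation:

```python
def aoa_command(left_aoa, right_aoa, stall_threshold=14.0, max_disagree=5.0):
    """Decide on a nose-down correction only if the redundant
    angle-of-attack sensors agree. Threshold values are illustrative,
    not real avionics parameters."""
    if abs(left_aoa - right_aoa) > max_disagree:
        return "alert_crew"          # sensors conflict: don't act silently
    if max(left_aoa, right_aoa) > stall_threshold:
        return "nose_down"           # both plausibly indicate a high AoA
    return "no_action"

aoa_command(22.0, 3.0)   # disagreeing sensors: alert rather than act
aoa_command(16.0, 15.0)  # agreeing, high: apply the correction
```

The design point is simply that the disagreement check happens before any automatic correction, mirroring the pilot/co-pilot cross-check described above.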

Back then, a fellow engineer and enthusiast for technologies like expert systems and fuzzy logic (a promising technique to use rules for non-binary control) explained it to me with a textbook example. Imagine, as we’ve done before, you have a self-driving car. In this case, its self-driving intelligence uses a rule-based expert system for high-level decision making. While out and about, the car comes to an unexpected fork in the road. In computing how to react, one rule says to swerve left and one says to swerve right. In a fuzzy controller, the solution to conflicting conclusions might be to weight and average the two rule outputs. As a result, our intelligent car would elect to drive on, straight ahead, crashing into the tree that had just appeared in the middle of the road. The example is oversimplified to the point of absurdity, but it does point out a particular, albeit potential, flaw with rule-based systems. I also think it helps explain, by analogy, the danger lurking in the control of complex systems when your analysis is focused on discrete functions.
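The fork-in-the-road failure is easy to reproduce in arithmetic. Assuming the simplest possible defuzzification step, a confidence-weighted average of the rule outputs:

```python
# The fork-in-the-road example in arithmetic form. Two equally confident
# rules produce opposing steering commands; a naive defuzzification step
# takes their confidence-weighted average.
def defuzzify(rule_outputs):
    """rule_outputs: list of (steering_angle, confidence) pairs.
    Negative angles steer left, positive steer right."""
    total_weight = sum(w for _, w in rule_outputs)
    return sum(angle * w for angle, w in rule_outputs) / total_weight

# "Swerve left" and "swerve right", each with confidence 1.0:
command = defuzzify([(-30.0, 1.0), (+30.0, 1.0)])
# command == 0.0: drive straight ahead, into the tree.
```

Two individually sensible rules, combined naively, produce a command neither of them recommended.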

With the logic for your system being made up of independent components, the overall system behavior becomes “emergent” – a combination of the rule base and the environment in which it operates. In the above case, each piece of the component logic dictated swerving away from the obstacle. It was only when the higher-level system did its stuff that the non-intuitive “don’t swerve” emerged. In contrast with more traditional code design, in rule-based development the number of possible states may be indeterminate by design. Your expert input might be intended to be partial, completed only when synthesized with the operational environment. Or look at it by way of the quality assurance problem it creates. While you may be creating the control system logic without understanding the entire environment within which it will operate, wouldn’t you still be required to understand, exhaustively, that entire environment when testing? Otherwise, how could you guarantee what the addition of one more expert rule would or wouldn’t do to your operation?

Modern software engineering processes have been built, to a large extent, based on an understanding that the earlier you find a software issue, the cheaper it is to solve. A problem identified in the preliminary, architectural stage may be trivial. Finding and fixing something during implementation is more expensive, but not as expensive as creating a piece of buggy software that has to be fixed either during the full QA testing or, worse yet, after release. Good design methodologies also eliminate, as much as possible, the influence that lone coders and their variable styles and personalities might have upon the generation of large code bases. We now feel that integrated teams are superior to a few eccentric coding geniuses. This goes many times over when it comes to critical control systems upon which people’s lives may depend. Even back when, say, an accounting system might have been cobbled together by a brilliant hacker-turned-straight, avionics software development followed rigid processes meant to tightly control the quality of the final product. This all seems to be for the best, right?

Yes, but part of what I see here is a systematization that eliminates not just the bad influences of the individual, but their creative and corrective influence as well. If one person had complete creative control over the Boeing MAX software, that person likely would never have shipped something like the MCAS reliance on only one of a pair of sensors. The way we write code today, however, there may be no individual in charge. In this case, the decision to make the MCAS a link between the pilot’s control stick and the tail stabilizer rather than an automated response at a higher level isn’t a software decision; it’s a cockpit design decision. As such it’s not only outside of the purview of software design, but perhaps outside of the control of Boeing itself if it evolved as a reaction to a part of the regulatory structure. In a more general sense, though, will the modern emphasis on team-based, structured coding methodology have the effect of siloing the coders? A small programming team that has been assigned a discrete piece of the puzzle not only doesn’t have responsibility for seeing the big-picture issues; those issues won’t even be visible to them.

In other words (cycling back to that comment on my friend’s posting many months ago), shouldn’t the software developers understand the underlying systems for which they are writing software? Likely, the design/implementation structure for this part of the system would mean that it wouldn’t be possible for a programmer to see that either a) the sensor they are using as input is one of a redundant pair of sensors and/or b) there is a separate indication that might tell them whether the sensor they are using as input is reliable. Likewise, the large team-based development methodologies probably don’t attract to the avionics software team the programmer who is also a controls engineer who also has experience piloting aircraft – that ideal combination of programmer and domain expert that we talked about in the expert system days. I really don’t know whether this is an inevitable direction for software development or if this is something that is done for better or for worse as we look at different companies. If the latter, the solutions may simply lie with culture and management within software development companies.

So far, I’ve mostly been explaining why we shouldn’t point the finger at the programmers, and neither of the articles does. In both cases, blame seems to be reserved for the highest levels of aircraft development: at the business level and the regulatory level. The Medium article criticizes the use of engineered workarounds for awkward physics as solutions to business problems (increasing the capacity of an existing plane rather than, expensively, creating a new one). The Wall St. Journal focuses on the philosophy that pilots will respond unerringly to warning indicators, often under very tight time constraints and under ambiguous and conflicting conditions. Both articles would tend to fault under-regulation by the FAA, but heavy-handed regulation may be just as much to blame as light oversight. Particularly, I’m thinking of the extent to which Boeing hesitated to pass information to the customers for fear of triggering expensive regulatory requirements. When regulations encourage a reduction in safety, is the problem under-regulation or over-regulation?

Another point that jumped out at me in the Journal article is that at least one of the redesigns that went into the Boeing MAX was driven by FAA high-level design requirements for today’s human-machine interfaces for aircraft control. From the WSJ:

[Boeing and FAA test pilots] suggested MCAS be expanded to work at lower speeds so the MAX could meet FAA regulations, which require a plane’s controls to operate smoothly, with steadily increasing amounts of pressure as pilots pull back on the yoke.

To adjust MCAS for lower speeds, engineers quadrupled the amount the system could repeatedly move the stabilizer, to increments of 2.5 degrees. The changes ended up playing a major role in the Lion Air and Ethiopian crashes.

To put this in context, the MCAS system was created to prevent an instability in some high-altitude conditions, conditions which came about as a result of larger engines that had been moved to a suboptimal position. Boeing decided that this instability could be corrected with software. But if I’m reading the above correctly, there are FAA regulations focused on making sure a fly-by-wire system still feels like the mechanically-linked controls of yore, and MCAS seemed perfectly suited to help satisfy that requirement as well. Pushing this little corner of the philosophy too far may have been a proximate cause of the Boeing crashes. Doesn’t this also, however, point to a larger issue? Is there a fundamental flaw in requiring that control systems artificially inject physical feedback as a way to communicate with the pilots?

In some ways, it’s a similar concern to what I talked about with the automated systems in cars. In addition to the question of whether over-automation is removing the connection between the driver/pilot and the operational environment, there is, for aircraft, an additional layer. An aircraft yoke’s design came about because it directly linked to the control surfaces. In a modern plane, the controls do not. Today’s passenger jet could just as well use a steering wheel or a touch screen interface or voice-recognition commands. The designs are how they are to maintain a continuity between the old and the new, not necessarily to provide the easiest or most intuitive control of the aircraft as it exists today. In addition, and by regulatory fiat apparently, controls are required to mimic that non-existent physical feedback. That continuity and feedback may also be obscuring logical linkages between different control surfaces that could never have existed when the interface was mechanically linked to the controlled components.

I foresee two areas where danger could creep in. First, the pilot responds to the artificially-induced control feel under the assumption that it is telling him something about the physical forces on the aircraft. But what if there is a difference? Could the pilot be getting the wrong information? It sure seems like a possibility when feedback is being generated internally by the control system software. Second, the control component (in this case, the MCAS system) is doing two things at once: stabilizing the aircraft AND providing realistic feedback to the pilots by feel through the control yoke. Like the car that can’t decide whether to swerve right or left, such a system risks, in trying to do both, getting neither right.

I’ll sum up by saying I’m not questioning the design of modern fly-by-wire controls and cockpit layouts; I’m not qualified to do so. My questions are about the extent to which both regulatory requirements and software design orthodoxy box in the range of solutions available to aircraft-control designers in a way that limits the possibilities of creating safer, more efficient, and more effective aircraft for the future.

Artificial, but Intelligent? Part 2

I just finished reading Practical Game AI Programming: Unleash the power of Artificial Intelligence to your game. My conclusion is there is a lot less to this book than meets the eye.

For someone thinking of purchasing this book, it would be difficult to weigh that decision before committing. The above link to Amazon has (as of this writing) no reviews. I’ve not found any other, independent evaluations of this work. Perhaps you could make a decision simply by studying the synopsis of this book before you buy it. Having done that, it is possible that you’d be prepared for what it offered. Having read the book, and then going back and reading the Amazon summary (which is taken from the publisher’s website), I find that it more or less describes the book’s content. In my case, I picked this book up as part of a Humble Book Bundle, so it was something of an impulse buy. I didn’t dig too hard into the description and instead worked my way through the chapters with my only expectations being based on the title.

Even applying the highest level of pre-purchase scrutiny only gets you so far. The description may indicate that the subject matter is of interest, but it is still a marketing pitch. It gives you no idea of the quality of either the information or the presentation. Furthermore, I think someone got a little carried away with their marketing hype. The description also tosses out some technical terms (e.g. the rete algorithm, forward chaining, pruning strategies) perhaps meant to dazzle the potential buyer with AI jargon. The problem is, these terms don’t even appear in the book, much less get demonstrated as a foundation for game programming. I feel that no matter how much upfront research you did before you bought, you’d come away feeling you got less than you bargained for.

What this book is not is an exploration of artificial intelligence as I have discussed that term previously on this website. This is not about machine learning or generic decision-making algorithms or (despite the buzz words) rule engines. The book mentions applications like Chess only in passing. Instead, the term “AI” is used as a gamer might use it. It discusses a few tricks that a game programmer can use to make the supposedly-intelligent entities within a game appear to have, well, intelligence when encountered by the player.

The topic that it does cover does, in fact, have some merit. The focus is mostly on simple algorithms and the minimal code required to create the impression of intelligent characters within a game. Some of the topics I found genuinely enlightening. The overarching emphasis on simplicity is also something that makes sense for programmers to aspire to. There is no need to program a character to have a complex motivation if you can, with only a few lines of code, program him to appear to have such complex motivation. It is just that I’m not sure these lessons qualify as “unleashing the power of Artificial Intelligence” by anyone’s definition.

But even before I got that far, my impression started off very bad. The writing in this book is rather poor, in terms of grammar, word usage, and content. In some cases, misused words become so jarring as to make it difficult to read through a page. Elsewhere, there will be several absolutely meaningless sentences strung together, perhaps acknowledging that a broader context is required but not knowing how to express it. At first, I didn’t think I was going to get very far into the book. After a chapter or so, however, reading became easier. Part of it may be my getting used to the “style,” if one can call it that. Part of it may also be that there is more “reaching” in the introductory and concluding sections but less when writing about concrete points.

I can’t say for sure but it is my guess, based on reading through the book, that the author does not use English as his primary language. I sometimes wondered if the text was written first in another language and then translated (or mistranslated, as the case may be) into English. Beyond that, the book also does not seem to have benefited from the work of a competent editor.

The structure of the chapters, for the most part, follows a pattern. A concept is introduced by showcasing some “classic” games with the desired behavior. Then some discussion of the principle is followed by a coding example, almost always in Unity‘s C# development environment. This is often accompanied by screenshots of Unity’s graphics, either in development mode or at run-time. Most of the chapters, however, feel “padded.” Screenshots are sometimes repetitious. Presentation of the code is done incrementally, with each new addition requiring the re-printing of all of the sample code shown so far along with the new line or lines added in. By the end of the chapter, adding a concept might consist of two explanatory sentences, 3 screenshots, and two pages of sample code, 90% of which is identical to the sample code several pages earlier in the book. This is not efficient and I don’t think it is useful. It does drive the page count way up.

I want to offer a caveat for my review. This is the first book I’ve read from this publisher. When reading about some of their other titles, it was explained that the books come with sample source code. If you buy the book directly from the publisher’s website (which I did not), the sample code is supposed to be downloaded along with the book text. If you buy from a third party, they provide a way to register your purchase on the publisher’s site to get access to the downloads. I did not try this. If this book does have downloaded samples that can be loaded into Unity, and those samples are well-done, that has a potential for adding significant value over the book on its own.

Back to the chapters. When I go back through the chapters, it again feels like there is some “padding” going on to make the subject matter seem more extensive than it is. The book starts with two chapters on Finite State Machines (FSMs) and how that logic can be used to drive an “AI” character’s reactions to the player. Then the book takes a detour into Unity’s support for a Finite State Machine implementation of animations, which has its own chapter. This is mostly irrelevant to the subject of game AI and also, likely, of little value if you’re not using Unity.
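The FSM idea itself is simple enough to sketch in a few lines. The states and events here are my own illustrative names, not the book's examples:

```python
# A minimal finite state machine for an "AI" character: a table of
# (state, event) -> next_state transitions, driven by events the game reports.
TRANSITIONS = {
    ("patrol", "player_seen"): "chase",
    ("chase", "player_close"): "attack",
    ("chase", "player_lost"): "patrol",
    ("attack", "player_fled"): "chase",
}

def step(state, event):
    """Return the next state; unknown events leave the state unchanged."""
    return TRANSITIONS.get((state, event), state)

state = "patrol"
state = step(state, "player_seen")   # the guard spots the player
state = step(state, "player_close")  # and closes to attack range
```

The appeal for game AI is that a designer can read and extend the transition table without touching the code that drives it.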

After the animation chapter, we head back into the AI world with a discussion of the A* pathfinding algorithm and its Theta* variant. This discussion is accompanied by a manual solution of a simple, square-grid-based 2D environment, describing each calculation and illustrating each step. I do appreciate the concrete example of the algorithm in action. Many explanations of this topic I’ve found on-line simply show code or pseudo-code and leave it to the “student” to figure it all out. In this case, though, I think he managed to drive the page count up by an order of magnitude over what would have been sufficient to explain it clearly.
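For reference, the algorithm the chapter spends so many pages on fits comfortably in one short function. This is a generic A* on a 4-connected grid with a Manhattan heuristic, not the book's code:

```python
import heapq

def astar(grid, start, goal):
    """A* on a 4-connected square grid. grid[r][c] == 1 marks a wall.
    Returns the length of the shortest path, or None if unreachable."""
    def h(cell):  # Manhattan-distance heuristic (admissible on this grid)
        return abs(cell[0] - goal[0]) + abs(cell[1] - goal[1])

    open_heap = [(h(start), 0, start)]
    best_g = {start: 0}
    while open_heap:
        f, g, cell = heapq.heappop(open_heap)
        if cell == goal:
            return g
        if g > best_g.get(cell, float("inf")):
            continue  # stale heap entry; a shorter route was found already
        r, c = cell
        for nr, nc in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
            if 0 <= nr < len(grid) and 0 <= nc < len(grid[0]) and grid[nr][nc] == 0:
                ng = g + 1
                if ng < best_g.get((nr, nc), float("inf")):
                    best_g[(nr, nc)] = ng
                    heapq.heappush(open_heap, (ng + h((nr, nc)), ng, (nr, nc)))
    return None

grid = [[0, 0, 0],
        [1, 1, 0],
        [0, 0, 0]]
astar(grid, (0, 0), (2, 0))  # the path must detour around the wall row
```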

The final chapters show how Unity’s colliders and raycasting can be used to implement both collision avoidance and vision/detection systems. These are two very similar problems involving reacting to other objects in the environment that, themselves, can move around. As I said earlier, there are some useful concepts here, particularly in emphasizing a “keep it simple” design philosophy. If you can use configurable attributes on your development tool’s existing physics system to do something, that’s much preferable to generating your own code base. That goes double if the perception for the end user is indistinguishable, one method from the other. However, I also get the feeling that I’m just being shown some pictures of simple Unity capabilities, rather than “unleashing the power of AI” in any meaningful sense.
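Stripped of the Unity machinery, the core of such a vision/detection check is a range test plus a field-of-view angle test. This is an engine-neutral 2D sketch with made-up parameters; a real implementation would also raycast against occluding obstacles, which is exactly what Unity's physics system supplies:

```python
import math

def can_see(observer_pos, facing_deg, target_pos, fov_deg=90.0, max_range=10.0):
    """Is the target within range and inside the observer's field-of-view
    cone? (Obstacle occlusion is deliberately omitted here.)"""
    dx = target_pos[0] - observer_pos[0]
    dy = target_pos[1] - observer_pos[1]
    if math.hypot(dx, dy) > max_range:
        return False
    angle_to_target = math.degrees(math.atan2(dy, dx))
    # Wrap the angular difference into [-180, 180) before comparing:
    diff = (angle_to_target - facing_deg + 180.0) % 360.0 - 180.0
    return abs(diff) <= fov_deg / 2.0

can_see((0, 0), 0.0, (5, 1))    # slightly off dead-ahead, in range
can_see((0, 0), 0.0, (-5, 0))   # directly behind the observer
```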

A few years back, I was trying to solve a similar problem, but trying to be predictive about the intent of the other object. For example, if I want to plot an intercept vector to a moving target, but that target is not, itself, moving at a constant rate or direction, I need a good bit more math than the raycasting and colliders provide out of the box. Given the promise of this book’s subject matter, that might be a problem I’d expect to find, perhaps in the next chapter.
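Even for the simpler case where the target does move at a constant velocity, the "good bit more math" is a quadratic in time. This is a standard intercept calculation, sketched here for illustration rather than taken from the book:

```python
import math

def intercept_time(shooter, target, target_vel, projectile_speed):
    """Time until a projectile fired now (at constant speed, any heading)
    can meet a target moving at constant velocity; None if unreachable.
    Solves |target + v*t - shooter| = s*t, a quadratic in t."""
    rx = target[0] - shooter[0]
    ry = target[1] - shooter[1]
    vx, vy = target_vel
    a = vx * vx + vy * vy - projectile_speed ** 2
    b = 2.0 * (rx * vx + ry * vy)
    c = rx * rx + ry * ry
    if abs(a) < 1e-9:  # projectile and target speeds are equal
        return -c / b if b < 0 else None
    disc = b * b - 4 * a * c
    if disc < 0:
        return None
    t1 = (-b - math.sqrt(disc)) / (2 * a)
    t2 = (-b + math.sqrt(disc)) / (2 * a)
    times = [t for t in (t1, t2) if t > 0]
    return min(times) if times else None

# Target 10 units ahead, fleeing at 1 unit/s; projectile at 2 units/s:
intercept_time((0, 0), (10, 0), (1, 0), 2.0)  # closure rate 1 unit/s
```

When the target's motion isn't constant, there is no closed form at all, which is where the real difficulty the author describes begins.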

Alas, after discussing the problem of visual detection involving both direction and obstacles, the book calls an end to its journey. With the exception of the A* algorithm, the AI solutions consist almost entirely of Unity 3D geometry calls.

Although the book claims to be written in a way such that each chapter can be applied to a wide range of games, I feel like it narrows its focus as it progresses. The targeted game is, and I struggle with how to describe it so I’ll just pick an example, one of the heirs to the DOOM legacy. By this, I mean games where the player progresses through a series of “levels” in order to complete the game. What the player encounters through those levels is imagined and created by the designer so as to construct the story of the game. The term AI, then, distinguishes between different kinds of encounters, at least as far as the player perceives them. For example, the player might find herself rushing across a bridge, which starts to collapse when she reaches the middle. This requires no “AI.” It is simply programmed in that, when the player reaches a certain point on the bridge, the game calls the “collapseBridge” routine. If she makes it past the bridge and into the next chamber, where there are a bunch of gremlins that want to do her in, the player starts considering the “AI” of those gremlins. Do they react to what the player does, adopting different tactics depending on hers? If so, she might praise the “AI.” By the book’s end, the focus is entirely on awareness of, and reaction between, mobile elements of a game, which, by defining the problem as such, narrows the scope to a subset of the games in this category.

My harping on the narrow focus of this book goes to the determination of its value. If this book were free or very low cost, you would have to decide whether the poor use of English and the style detract from whatever useful information is presented. The problem with that is the price this book asks. The hardcopy (paperback) of the book is $50.00. The ebook is $31.19 on Amazon, discounted to $28 if you buy directly from the publisher’s site. All of those seem like a lot of money, per my budget. Now, my own price I figure to have been $7. I bought the $8 bundle package over the $1 package purely based on interest in this title. This is the first book in that set I’ve read, so if some of the others are good, I might consider the cost to be even lower. Still, even at $5, I feel like I’ve been cheated a bit by the content of this book.

The bundle contained other books from this same publisher, so I’ll plan to read at least one other before drawing any conclusions about their whole library. Assuming that the quality of this book is, in fact, an outlier, it is still a risk to the publisher’s reputation. When one of your books is overpriced and oversold, the cautious buyer should assume that they are all overpriced and oversold. Looking at the publisher’s site, this book has nothing but positive reviews, which is really a blemish on the publisher as a whole.

Although I won’t go so far as to say “I wish I hadn’t wasted the time I spent reading this,” I can’t imagine any purchaser for whom this title would be worth the money.

The Pride and Disgrace

Handful of Senators don’t pass legislation.

On August 10th, 1964, the Gulf of Tonkin Resolution was enacted. It was the quick response from America’s politicians to the attack upon a U.S. naval vessel off the coast of Vietnam by torpedo boats from the communist regime of North Vietnam. That attack had occurred on August 2nd with an additional “incident” taking place on August 4th. The resolution (in part) authorized the President “to take all necessary steps, including the use of armed force, to assist any member or protocol state of the Southeast Asia Collective Defense Treaty requesting assistance in defense of its freedom.” This would come to justify the deployment of U.S. troops to directly engage the enemies of South Vietnam.

Before this time, South Vietnam was fighting to eliminate a communist insurgency driven by remnants of the Việt Minh. The communist guerillas had agreed to resettle in the North as part of the Geneva Agreement which ended France’s war in Vietnam in 1954. The United States saw their continued attacks on the government of South Vietnam as a violation of that peace. In particular, their support obtained from across national borders was considered to be a part of a strategic plan by the Soviet Union and China to spread communism throughout all of the countries of Southeast Asia.

Even before the withdrawal of France, the United States had supported the anti-communist fight with money, matériel, and military personnel (in the form of advisors). After the French exit, a steady increase in commitment from the U.S. was evident. Nevertheless, the signing of the Gulf of Tonkin resolution marks a milestone in America’s involvement, making it arguably the start of the U.S. War in Vietnam. Although it would take almost another year for U.S. forces to become clearly engaged, President Johnson’s reaction to the Gulf of Tonkin incidents seems to have set the course inevitably toward that end.

I’m sittin’ here.
Just contemplatin’.
I can’t twist the truth.
It knows no regulation.

The anniversary and the nature of it got me to thinking about how one might portray the Vietnam War as a game. Given the context, I’m thinking purely along the lines of the game focused at the strategic level, taking into account the political and international considerations that drove the course of the conflict.

From a high-level perspective, one might divide America’s war in Vietnam into 4 distinct phases. In the first, the U.S. supported Vietnam’s government with financial and military aid, and with its advisors. While U.S. soldiers were, in fact, engaging in combat and being killed, it wasn’t as part of American combat units, allowing the U.S. to convince itself that this was a conflict purely internal to South Vietnam. Through the presidencies of Eisenhower, Kennedy, and Johnson, the amount of aid to, and the number of Americans in, Vietnam increased. However, the big change, and the transition to the second phase, can be located after the passage of the Gulf of Tonkin Resolution and the subsequent deployment of the U.S. Marines directly and as a unit.

At first, U.S. direct involvement carried with it a measure of popular support and was, from a purely military standpoint, overwhelmingly successful. Johnson and the military were wary of pushing that involvement in ways that would turn public opinion against them. The U.S. feared casualties as well as signs of escalation that might be interpreted as increasing military commitment (for example, extending the service time for draftees beyond twelve months), but this was a period of an increasing U.S. buildup and, generally, successful operations. Nevertheless, progress in the war defied a clear path toward resolution.

The third phase is probably delineated by the 1968 Tet Offensive. While still, ultimately, a military success from the U.S. standpoint, the imagery of Viet Cong forces encroaching on what were assumed by all to be U.S.-controlled cities turned opinion inexorably against continuing engagement in Vietnam. The next phase, then, was what Nixon called “Vietnamization,” the draw-down of American direct involvement to be replaced with support for the Army of the Republic of Vietnam (ARVN). Support was again in the form of money, equipment, and training, as well as combat support: for example, a transition to operations where ARVN ground units would be backed by U.S. air power.

The final phase is where that withdrawal is complete, or at least close to that point; where joint operations were no longer in the cards. Clearly this phase would describe the post-Paris accords situation, after Nixon’s resignation, as well as encompassing the final North Vietnamese operation that rolled up South Vietnam and Saigon.

From a gaming perspective, and a strategic-level gaming perspective at that, the question becomes what decisions are there for a player to make within these phases and, perhaps more importantly, what decisions would prompt the transition from one phase to another.

The decision to initially deploy U.S. troops, made by Johnson in early 1965, seems to have been largely driven by events. Having Johnson as president was probably a strong precondition. Although he ran against Goldwater on a “peace” platform, the fact that he saw his legacy as being tied into domestic policy probably set up the preconditions for escalation. A focus on Vietnam was never to be part of his legacy, but given the various triggers in late 1964 and early 1965, his desire to avoid a loss to communism in Vietnam propelled his decision to commit ground troops. You might say his desire to keep it from being a big deal resulted in it being a big deal.

Where this all seems to point is that any strategic Vietnam game beginning much before Spring of 1965 must restrict the player from making the most interesting decision; if and when to commit U.S. ground troops and launch into an “American” war.

Amusingly, if you subscribe to the right set of conspiracy theories, the pivotal events might really be under control of a grand-strategic player after all. Could it be that the real driver behind Kennedy’s assassination was to put a President in office who would be willing to escalate in Vietnam? Was the deployment of the USS Maddox on the DESOTO electronic warfare mission meant to provoke a North Vietnamese response? How about the siting of aviation units in Vietnam at places like Camp Holloway, which would become targets for the Viet Cong? Where actual aggression by the North wasn’t sensational enough, were details fabricated? This rather far-flung theorizing would not only make the resulting game that much harder to swallow, but it is also difficult to see how any fully-engineered attempt to insert America into Vietnam could have moved up the timetable.

So it would only make sense to start our game with our second phase, which must come after the Gulf of Tonkin incident and the 1964 presidential election, at a minimum.

The remaining game will still be an unconventional one, although over the years we have seen some nice examples of how it could be done. Essentially, the U.S. will always be able to draw upon more military power and, ultimately, sufficient military power to prevail in any particular engagement. Yet while it is possible, through insufficient planning, to achieve a military loss as the U.S., it is probably not going to be possible to achieve a military victory. On the U.S. side, the key parameters are going to be some combination of resources and “war weariness.”

Our Vietnam game would rule out, either explicitly or implicitly, a maximal commitment to victory by the United States. American planners considered options such as unfettered access to Laos and Cambodia, an outright invasion of North Vietnam, or even tactical nuclear weapons. The combination of deteriorating domestic support and the specter of overt Chinese and Soviet intervention would seem to be a large enough deterrent to prevent exercise of these options. This is one of the reasons that rules (those that I’ve come across, anyway) simply forbid, for example, crossing units into North Vietnam.

The other reason is one of scope. If a ground invasion of North Vietnam is on the table, then the map needs to include all the potential battlefields in the North in addition to the actual battlefields of the South. Likewise, extended areas within Cambodia and Laos need to be available to the player. Continuing on, if U.S. ground forces are going to be straying that close to North Vietnam’s northern border, might it not be necessary to include China as well? Perhaps having learned a lesson from Korea, our player would react to Chinese direct intervention by taking the fight onto Chinese sovereign territory. It doesn’t take long before we have to consider adding Germany, Korea, Cuba, and any other hotspot of the time as a potential spillover for escalation in Vietnam. Besides the problem of the game expanding without limit, we have another design concern. A Vietnam game narrative adhering closely to the historical path has the advantage of actual battles, strategies, and events on which to model itself. If all the forces of NATO, the Warsaw Pact, and China are fair game, we are now in the realm of pure speculation.

If for no other reason than to maintain the sanity of the designer, it seems that rules which quickly push the U.S. into its historical de-escalation policy are the right way to tie off such a game on the other end. I will save consideration of how that might work for another time.

Artificial, Yes, but Intelligent?

Keep your eyes on the road, your hands upon the wheel.

When I was in college, only one of my roommates had a car. The first time it snowed, he expounded upon the virtues of finding an empty, slippery parking lot and purposely putting your car into spins. “The best thing about a snow storm,” he said. At the time I thought he was a little crazy. Later, when I had the chance to try it, I came to see it his way. Not only is it great fun to slip and slide (without the risk of actually hitting anything), but getting used to how the car feels when the back end slips away is the first step in learning how to fix it, should it happen when it actually matters.

Recently, I found myself in an empty, ice-covered parking lot and, remembering the primary virtue of a winter storm, I hit the gas and yanked on the wheel… but I didn’t slide. Instead, I encountered a bunch of beeping and flashing as the electronic stability control system on my newish vehicle kicked in. What a disappointment it was. It also got me thinkin’.

For a younger driver who will almost never encounter a loss-of-traction slip, how does he or she learn to recover from a slide or a spin once it starts? Back in the dark ages, when I was learning to drive, most cars were rear-wheel-drive with a big, heavy engine in the front. It was impossible not to slide around a little when driving in a snow storm, and knowing all the tricks of slippery driving was almost a prerequisite for going out into the weather. Downshifting (or using those number gears on your automatic transmission), engine braking, and counter-steering were all part of getting from A to B. As a result*, when an unexpectedly slippery road surprises me, I instinctively take my foot off the brakes/gas and counter-steer without having to consciously remember the actual lessons. So does a car that prevents sliding 95% of the time result in a net increase in safety, even though it probably makes the other 5% worse? It’s not immediately obvious that it does.

On the Road

I was reminded of the whole experience a month or so ago when I read about the second self-driving car fatality. Both crashes happened within a week or so of each other in Western states; the first in Arizona and the second in California. In the second crash, Tesla’s semi-autonomous driving function was in fact engaged at the time of the crash and the driver’s hands were not on the wheel six seconds prior. Additional details do not seem to be available from media reports, so the actual how and why must remain the subject of speculation. In the first, however, the media has engaged in the speculation for us. In Arizona, it was an Uber vehicle (a Volvo in this case) that was involved, and the fatality was not the driver. The media has also reported quite a lot that went wrong. The pedestrian who was struck and killed was jaywalking, which certainly is a major factor in her death. Walking out in front of a car at night is never a safe thing to do, whether or not that car is self-driving. Secondly, video was released showing that the driver was looking at something below dashboard level immediately before the crash, and thus was not aware of the danger until the accident occurred. The self-driving system itself did not seem to take any evasive action.

Predictably, the Arizona state government responded by halting the Uber self-driving car program. More on that further down, but first, a look at the driver’s distraction.

After the video was released, media attention focused on the distracted-driving angle of the crash. It also brought up the background of the driver, who had a number of violations behind him. Certainly the issue of electronics and technology detracting from safe driving is a hot topic and something, unlike self-driving Uber vehicles, that most of us encounter in our everyday lives. But I wonder whether this exposes a fundamental flaw in self-driving technology.

It’s not exactly analogous to my snow situation above, but I think the core question is the same. The current implementation of self-driving car technology augments the human driver rather than replacing him or her. In doing so, however, it also removes some of the responsibility from the driver and makes him more complacent about the dangers he may be about to encounter. The more the car does for the driver, the greater the risk that the driver will allow his attention to wander rather than stay focused, on the assumption that the autonomous system has him covered. In the longer term, are there aspects of driving that the driver will not only stop paying attention to, but lose the ability to manage in the way a driver of a non-automated car once did?

Naturally, all of this can be designed into the self-driving system itself. Even if a car is capable of, essentially, driving itself over a long stretch of highway, it could be designed to engage the driver every so many seconds. Requiring nominally unnecessary input from the operator can be used to make sure she is ready to actively control the car if needed. I note that we aren’t breaking new ground here. A modern aircraft can virtually fly itself, and yet some part of the design (plus operational procedures) is surely in place to make sure that the pilots are ready when needed.
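The periodic-engagement idea above amounts to a dead-man's switch: prompt the operator on a fixed interval and hand control back if too many prompts go unanswered. A minimal sketch, assuming nothing about any manufacturer's actual logic; the class name, intervals, and thresholds are all invented for illustration:

```python
class AttentionMonitor:
    """Toy dead-man's-switch for a semi-autonomous system (hypothetical).

    Issues a prompt every `prompt_interval_s` seconds; after `max_missed`
    consecutive unanswered prompts, it disengages autonomous mode.
    """

    def __init__(self, prompt_interval_s=30.0, max_missed=2):
        self.prompt_interval_s = prompt_interval_s
        self.max_missed = max_missed
        self.missed = 0           # consecutive unanswered prompts
        self.last_prompt_s = 0.0  # time the last prompt was issued

    def tick(self, now_s, driver_responded):
        """Call periodically; returns the mode the system should be in."""
        if now_s - self.last_prompt_s >= self.prompt_interval_s:
            self.last_prompt_s = now_s
            if driver_responded:
                self.missed = 0   # any response resets the counter
            else:
                self.missed += 1
        if self.missed >= self.max_missed:
            return "manual"       # disengage: driver must take the wheel
        return "autonomous"
```

An attentive driver who answers each prompt keeps the counter at zero; one who ignores two prompts in a row forces a handover, which is roughly the behavior the aviation analogy suggests.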

As I said, the governmental response has been to halt the program. In general, it will be the governmental response that will be the biggest hurdle for self-driving car technology.

In the specific case of Arizona, I’m not actually trying to second guess their decision. Presumably, they set up a legal framework for the testing of self-driving technology on the public roadways. If the accident in question exceeded any parameters of that legal framework, then the proper response would be to suspend the testing program. On the other hand, it may be that the testing framework had no contingencies built into it, in which case any injuries or fatalities would have to be evaluated as they happen. If so, a reactionary legal response may not be productive.

I think, going forward, there is going to be a political expectation that self-driving technology should be flawless. Or, at least, perfect enough that it will never cause a fatality. Never mind that there are 30,000 to 40,000 motor vehicle deaths per year in the United States and over a million per year worldwide. It won’t be enough that an autonomous vehicle is safer than a non-autonomous vehicle; it will have to be orders of magnitude safer. Take, as an example, passenger airline travel. Despite air travel being roughly ten times safer than driving, the regulatory environment for aircraft is much more stringent. Take away the “human” pilot (or driver) and I predict the requirements for safety will be much higher still than for aviation.

Where I’m headed in all this is, I suppose, to answer the question of when we will see self-driving cars. It is tempting to see that as a technological question – when will the technology be mature enough to be sold to consumers? But it is more than that.

I recall seeing somewhere an example of “artificial intelligence” for a vehicle system. The example was of a system that treated a ball rolling across the street as a trigger for logic that anticipates a child chasing that ball. A good example of an important problem to solve before putting an autonomous car onto a residential street. Otherwise, one child run down while chasing his ball might be enough for a regulatory shutdown. But how about the other side of that coin? What happens the first time a car swerves to avoid a non-existent child and hits an entirely-existent parked car? Might that cause a regulatory shutdown too?

Is regulatory shutdown inevitable?

Robo-Soldiers

At roughly the same time that the self-driving car fatalities were in the news, there was another announcement, even more closely related to my previous post. Video-game developer EA posted a video showing the results of a multi-disciplinary effort to train an AI player for their Battlefield 1 game (which, despite the name, is actually the fifth version of the Battlefield series). The narrative for this demo is similar to that of Google’s (DeepMind) chess program. The training was created, as the marketing pitch says, “from scratch using only trial and error.” Taken at face value, this would seem to run counter to my previous conclusions, in which I figured that the supposedly generic, self-taught AI was perhaps considerably less than it appeared.

Under closer examination, however, even the minute-and-a-half demo video does not quite measure up to the headline hype, the assertion that neural nets have learned to play Battlefield, essentially, on their own. The video explains that the training method involves manually placing rewards throughout the map to try to direct the behavior of the agent-controlled soldiers.
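Those manually placed rewards are what the reinforcement-learning literature calls reward shaping: small hand-authored incentives that nudge a trial-and-error learner toward the designer's intent. A toy sketch of the idea, using tabular Q-learning on a gridworld rather than anything resembling EA's actual system; the map, reward values, and function names are all invented:

```python
import random

GRID = 5                                  # 5x5 toy map
GOAL = (4, 4)                             # the real objective (e.g., a capture point)
SHAPING = {(2, 2): 0.05, (3, 3): 0.05}    # hand-placed "breadcrumb" rewards
ACTIONS = [(0, 1), (0, -1), (1, 0), (-1, 0)]

def step(state, action):
    """Move within the grid; return (next_state, reward, done)."""
    x = min(max(state[0] + action[0], 0), GRID - 1)
    y = min(max(state[1] + action[1], 0), GRID - 1)
    nxt = (x, y)
    if nxt == GOAL:
        return nxt, 1.0, True             # big reward for the real objective
    return nxt, SHAPING.get(nxt, -0.01), False  # breadcrumb, or small step cost

def train(episodes=3000, alpha=0.5, gamma=0.9, eps=0.2, seed=0):
    """Tabular Q-learning with epsilon-greedy exploration."""
    rng = random.Random(seed)
    q = {}
    for _ in range(episodes):
        s = (0, 0)
        for _ in range(50):
            if rng.random() < eps:
                a = rng.randrange(len(ACTIONS))
            else:
                a = max(range(len(ACTIONS)), key=lambda i: q.get((s, i), 0.0))
            s2, r, done = step(s, ACTIONS[a])
            best_next = 0.0 if done else max(
                q.get((s2, i), 0.0) for i in range(len(ACTIONS)))
            old = q.get((s, a), 0.0)
            q[(s, a)] = old + alpha * (r + gamma * best_next - old)
            s = s2
            if done:
                break
    return q

def greedy_path(q, max_steps=30):
    """Follow the learned policy greedily from the start square."""
    s = (0, 0)
    path = [s]
    for _ in range(max_steps):
        a = max(range(len(ACTIONS)), key=lambda i: q.get((s, i), 0.0))
        s, _, done = step(s, ACTIONS[a])
        path.append(s)
        if done:
            break
    return path
```

Note that the breadcrumbs must be kept small relative to the objective reward; make them too rich and the agent learns to farm the shaping rewards instead of winning, which is exactly why "learned on its own" overstates what hand-shaped training achieves.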

The time frame for a project like this one would seem to preclude its being directly inspired by DeepMind’s published results for chess. Indeed, the EA Technical Director explains that it was earlier DeepMind work with Atari games that first motivated them to apply the technology to Battlefield. Whereas the chess example demonstrated the ability to play chess at a world-class level, the EA project demonstration merely shows that the AI agents grasp the basics of game play and not much more. The team’s near-term aspirations are limited; use of AI for quality testing is named as an expected benefit of this project. He does go so far as to speculate that, a few years out, the technology might be able to compete with human players within certain parameters. Once again, a far cry from a self-learning intelligence poised to take over the world.

Even so, the video demonstration offers a disclaimer. “EA uses AI techniques for entertainment purposes only. The AI discussed in this presentation is designed for use within video games, and cannot operate in the real world.”

Sounds like they wanted to nip any AI overlord talk in the bud.

From what I’ve seen of the Battlefield information, it is results only. There is no discussion of the methods used to create training data sets and design the neural network. Also absent is any information on how much effort was put into constructing this system that can learn “on its own.” I have a strong sense that it was a massive undertaking, but no data to back that up. When that process becomes automated (or even part of the self-evolution of a deep neural network), so that one can quickly go from a data set to a trained network (quickly in developer time, as opposed to computing time), the promise of the “generic intelligence” could start to materialize.

So, no, I’m not made nervous that an artificial intelligence is learning how to fight small unit actions. On the other hand, I am surprised at how quickly techniques seem to be spreading. Pleasantly surprised, I should add.

While the DeepMind program isn’t open for inspection, some of the fundamental tools are publicly available. As of late 2015, the Google library TensorFlow is available in open source. As of February this year, Google is making available (still in beta, as far as I know) their Tensor Processing Unit (TPU) as a cloud service. Among the higher-profile uses of TensorFlow is the app DeepFake, which allows its users to swap faces in video. One demonstration, using a standard desktop PC and about half an hour’s training time, produced something comparable to Industrial Light and Magic’s spooky-looking Princess Leia reconstruction.

Meanwhile, Facebook also has a project inspired by DeepMind’s earlier Go neural network system. In a challenge to Google’s secrecy, the Facebook project has been made completely open source, allowing full inspection of and participation in its experiments. At the beginning of May, Facebook announced a 14-0 record for their AI bot against top-ranked Go players.

Competition and massive online participation are bound to move this technology forward very rapidly.

 

The future’s uncertain and the end is always near.

 

*To be sure, I learned a few of those lessons the hard way, but that’s a tale for another day.