AI as UX

What it means in practical terms to claim a system uses "artificial intelligence"

Mar 02, 2023

I was talking to my friend Kieran about the argument I made in my last newsletter, about AI being UX. That is to say, "artificial intelligence"[1] is really, fundamentally, a metaphor for user interaction. I think it's worth expanding on. It's worth noting, first of all, that machine learning—ML—isn't UX. There are machine learning systems—most of them, really—that are simply cranking quietly away answering questions. These systems provide answers to questions about the statistical organization of datasets without carrying any particular metaphorical baggage. You feed a question in, it runs through what is more-or-less a black box, you get an answer out. The central insights that allow these systems to operate—statistical learning theory, among others—date back to the late 1970s, although many of them were first explicated in the Soviet Union so they didn't really make it to the west in coherent form until the early 1990s. Even then, the computing power necessary to implement anything more than the simplest statistical learning engines did not yet exist. By the early 2000s practical applications of machine learning were possible, and a dozen years or so after that computing power was sufficient to power the deep learning revolution, where tens or even hundreds of thousands of training samples could be efficiently parsed, their statistical regularities captured in models with thousands and then tens and hundreds of thousands of parameters. None of this, as it happened, had much of anything to do with "artificial intelligence".

That term—"artificial intelligence"—was in real disrepute from the early 1970s all the way through the first decade of this century. It was conceived—in this country, in parallel with but entirely separate from the developments in machine learning happening in the Soviet Union—at a moment of tremendous optimism over the usefulness of information theory for understanding cognitive systems. Its original definition was specific: a symbolic system capable of abstract reasoning. Then-current developments in the nascent field of cognitive science provided a blueprint, and computers provided the necessary superstructure. The project of creating artificial intelligence was foreseen to take no more than five years or so.

This effort, as you might imagine, failed quite robustly, obliterating a whole industry as it went. The broken promise of early artificial intelligence efforts was so comprehensive that grants dried up, labs closed, companies folded, and the name and approach were shunned. The previously fertile common ground between cognitive scientists and computer scientists was abandoned, with the computer scientists dismissively skeptical of the cognitive models that had shown themselves to be practically useless. The emergence of machine learning—a parallel but differentiated approach that didn't try for anything as highfalutin' as abstract reasoning—was welcome in computer science, as it seeded AI's barren plain with a promising crop of practically useful algorithms and meaningful evaluative criteria.

The term "artificial intelligence" didn't go away. It was too cool a phrase, fundamentally. Pop culture was more interested in computer science as a source for dramatically interesting robot pals than as an oracle of provably optimal separating hyperplanes for practical classification problems. Most engineers working on machine learning felt, at some level, the same way. But if the approach of building symbolic systems capable of abstract reasoning had failed—and make no mistake, it comprehensively had—what did "artificial intelligence" mean?

The answer is... well, the answer has been a lot of things. One of the reasons the "Turing Test"—a relatively dashed-off thought experiment from 73 years ago—continues to have so much currency is that it turns out to be fiendishly difficult to establish what human intelligence actually comprises. Clearly there is something different about us humans, but pinning down what it is has been a fearsome and rancorous process even for behavioral scientists who spend all their time studying the topic. This has left room for even more abstract and arguably (that is, I argue this, but a lot of people seem to disagree with me) outlandish ideas, such as that intelligence is mathematically equivalent to data compression.

In the meantime, the definition of "artificial intelligence" as a practical term has devolved to marketing. It turned out that after a diet of decades of popular culture portraying AI as a sort of computational MacGuffin to allow robots and computers to be fully fleshed-out, human-indistinguishable characters, branding a solution "artificial intelligence" rather than "machine learning" or, god forbid, "statistical inference" is both compelling and lucrative.

Saying that "artificial intelligence" is now a marketing term does not mean it is meaningless. In fact, by foregoing the precision that the original AI researchers attempted to deploy in their definition, "artificial intelligence" as a term became much more evocative, powerful, and pernicious.

The loose definition of "artificial intelligence" means, more or less, "like a human". An interactive system that describes itself as "AI" is a system where the designers are implying that you can interact with it the same way that you would interact with a human. An information retrieval system that is "artificially intelligent" is one where you can expect that it will answer questions the same way that a research assistant would if you asked them. In other words, instead of being a set of formal criteria that define a system as having some key condition that defines "intelligence", "AI" has become a metaphor.

More specifically it has become a user metaphor. Ever since the early days of personal computing—going back to the decision to call discrete units of user-accessible data "files", but really getting rolling with the dawn of graphical user interfaces like those on the Macintosh, or Microsoft Windows—most interaction of humans with computers has been under the aegis of analogies between the virtual, digital system and the familiar, analog, physical world. Files, commands, windows, folders, menus, pointers, browsers, pages: real world objects have proven tremendously useful cognitive models for understanding the otherwise fairly abstract architecture (metaphor!) of computing systems. Even the name "computer" itself was a metaphor, aligning these new electronic number processing systems with the humans—computers—who could be tasked with complex mathematical reasoning questions as opposed to the simple adding machines and calculators they were supplanting.

There is a whole field of user experience research dedicated to understanding how user metaphors guide people's interaction with computers. The researcher Don Norman has written influentially about "affordances", adapting the work of the psychologist J.J. Gibson. In Gibson's theory, people perceive the world around them not in terms of what it is, but in terms of what can be done with it, how it can be interacted with. A window is perceived, immediately and instantly, as a thing to be seen through, or opened. A folder is perceived immediately and instantly as a container for papers. User metaphors allow the person at the computer to understand, directly and without explicit training, how the computer can be used.

One of the ways that people describe an affordance is that it is the user interface telling you how it wants to be used. It makes a certain intuitive sense. If the user interface is designed to tell you how to use it, that means it wants you to use it that way. Often, the frustration of using a bad user interface is expressed in terms like "argh, what does it want me to do?" Our tendency to anthropomorphize is potent enough to easily extend to inanimate things—I mentioned the IKEA lamp last time—and suffuses our interactions with the world even when we don't realize it. You could almost argue that the imputing of intentionality and affordances are the same idea, or at least very closely linked. If you see something or somebody in the world, you leap to an intuition about how that something or somebody wants you to interact with it, or them.

So where does this leave "AI" systems like ChatGPT? "AI" means a system that maximally welcomes the assumption of intentionality. An AI—an intelligent agent like a human—must want things from you, must have some goal in interacting with you, must have its own inner life that is guiding its choices in its interactions. That's how the system works. Except that, of course, that's not how the system works at all. The AI has no underlying intention besides giving you an answer that you find congenial. It's a sheer, blank wall of agreeableness.

I think this disconnect is at the heart of why the metaphor eventually breaks. We are finely tuned to see intention anywhere it might exist, so a user interface whose affordances are maximally welcoming of that impression has to walk an incredibly fine line. Either users buy the metaphor that they are dealing with, you know, an intelligence, and impute all kinds of capabilities the system likely doesn't have—accepting the flood of confident misinformation from ChatGPT, for instance—or the metaphor breaks, and they instead start trying to figure out how to subvert the system and make it do what they want. For the latter, see the current rage for manipulating LLMs not with intuitive language but by subverting that interface with prompt injection and prompt engineering. Both of these outcomes are bad in their own ways, and both—with differences specific to each application—are likely in any situation where interactive software systems are deployed with "AI" as the central metaphor, including in autonomous cars.

[1] This essay entirely leaves to the side the question of “AGI”, or “artificial general intelligence”, which is the current buzzword since OpenAI’s Sam Altman wrote a frankly fairly funny and no little bit tendentious letter about how OpenAI is preparing for it. That’s on purpose, both because the use of the term introduces whole new layers of likely misunderstanding of what machine learning systems do and do not do—it’ll be worth discussing at some point—and also because, even more than usual in the machine learning world, “AGI” is a fantasy.

Apperceptive (moved to buttondown)
