Why Does Intelligence Require Networks?

Reticula, or networks, are not necessarily intelligent, of course. In fact, most aren’t. But all forms of intelligence require networks, as far as I can tell.

Why?

There are lots of articles and books about how networks are the basis for natural and machine intelligence. But for the purpose of this post, I want to avoid those and see if I can get down to first principles based on what I already know.

Define Intelligence

To me, intelligence is the ability to understand one’s environment well enough to achieve one’s goals, either through influencing that environment or knowing when not to.

Networks Are Efficient Forms of Complexity

Let’s specify that intelligence requires complexity. I have some thoughts about why that is, but let’s just assume it for now.

If you need complexity for intelligence, then networks are a good way to get it. After all, networks are groups of nodes and links, and each node has two or more links…sometimes a lot more. In this post, I’m referring to larger networks.

In such networks, there are many possible routes to take to get from one part of the network to another. All these different routes will tend to have different lengths.

Why is that important? First, each route has its own distinct characteristics, in terms of both the time a signal takes to traverse it and the space it occupies.

Second, depending on the size and variability of the network, the pattern of any given route may be as unique as a fingerprint or a snowflake.

Third, this complexity is created efficiently, using a relatively small amount of matter. It requires only stringy links rather than large blocks of matter with intricate pathways carved into them. This efficiency is useful for animals that must carry their intelligence around in a relatively small package like the brain.
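To make the route-variety point a bit more concrete, here is a toy sketch in plain Python (my own illustration, not a model of any actual brain): a five-node network in which we enumerate every route between two nodes that never revisits a node.

```python
# Toy network: each node maps to the nodes it links to.
network = {
    "A": ["B", "C", "D"],
    "B": ["A", "C", "E"],
    "C": ["A", "B", "D", "E"],
    "D": ["A", "C", "E"],
    "E": ["B", "C", "D"],
}

def simple_routes(graph, start, end, visited=None):
    """Yield every route from start to end that never revisits a node."""
    visited = (visited or []) + [start]
    if start == end:
        yield visited
        return
    for neighbor in graph[start]:
        if neighbor not in visited:
            yield from simple_routes(graph, neighbor, end, visited)

routes = list(simple_routes(network, "A", "E"))
print(f"{len(routes)} distinct routes from A to E")
for route in routes:
    print(f"{len(route) - 1} links: " + " -> ".join(route))
```

Even five nodes yield a surprising number of routes of different lengths; scale that up to billions of nodes and the space of possible pathways becomes astronomically large, all of it built from nothing more than nodes and stringy links.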

Something Must Move Within the Network

Intelligence requires not only a complex physical or virtual network of some sort but also something that moves within that network. In the case of biological and machine intelligence, what moves is electricity.

I don’t know if electricity is the only possible means or agency of movement. For example, maybe one could build an intelligence made up of a complex array of tubes and nodes powered by moving water. Maybe any sufficiently complex adaptive system has the potential for intelligence if it has enough energy (of some sort) to carry out its functionality.

But electricity does seem like a very useful medium. It works at the atomic level, and it depends on the movement of electrons. Electrons are a component of all atoms, and therefore they are a natural means of transformation at the atomic scale.

In the end, though, this may be less a matter of energy than of information.

Information Is Key to Reticular Intelligence

One way or another, information is exchanged between neurons in the brain. Neurons seem to be much more complex than simple logic gates, but the idea is similar. Some unique pathway is followed as an electrical pulse flashes through a specific part of the network. Maybe it forms some kind of value or string of chemical interactions.

Assuming they exist, I don’t know how such values would be determined, though we can imagine a lot of possible variables such as length of the pathways, strength of the pulse, shape of the neurons, etc. Regardless, I can envision that the values or interactions would be based on the unique nature of each circuit in the network.

These values somehow allow us to experience “reality.” We don’t know whether this reality has any objective nature. But somehow the perception of this reality allows us to continue to operate, so these interpretations are useful to our continued existence.

Maybe what we experience is more like a computer’s GUI. We aren’t sure what is happening in the background of the computer (that is, in objective reality, assuming there is one), but we know what we see on the screen of our minds. And although that screen may bear little resemblance to true reality, it does help us interface with it in useful ways.

Our Experience of a Red Tomato

I don’t know if my experience of the color red is the same as your experience. But however my mind interprets it, the color serves as a useful signal in nature (and in human society).

So let’s imagine that I am looking at tomatoes on a vine. I pick the red ones because I deem them ripe. Ripe ones taste best and may have the most nutritional value. The signal of red comes from a particular pattern in the network of my brain. Other parts of the network give me things like shape and texture. All of these things are stored in different parts of the network.

When I see the requisite colors and shapes on a vine, all of these network patterns light up at once, giving me a specific value that is interpreted by my brain as a ripe tomato.
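As a loose analogy (and only an analogy; I’m not claiming real neural circuits work this way), you can picture that convergence as separate feature signals being weighted and summed into a single value that crosses a “ripe tomato” threshold:

```python
# Hypothetical feature signals, each between 0 (absent) and 1 (strong).
signals = {"red_color": 0.9, "round_shape": 0.8, "soft_texture": 0.7}

# Hypothetical weights: how much each signal matters to the "ripe tomato" pattern.
weights = {"red_color": 0.5, "round_shape": 0.3, "soft_texture": 0.2}

# Combine everything that "lights up at once" into one value.
ripeness = sum(weights[name] * value for name, value in signals.items())

if ripeness > 0.75:
    print(f"Pattern value {ripeness:.2f}: interpreted as a ripe tomato")
else:
    print(f"Pattern value {ripeness:.2f}: not ripe enough yet")
```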

Influencing My Environment

When my neural network discerns what it interprets as a ripe tomato, other parts of the network are brought into the picture. They tell me to reach out to the tomato and pick it. If it is small enough, maybe I pop it in my mouth. If it is larger, maybe I put it into a bag and bring it into the house.

These actions demonstrate some form of intelligence on my part. That is, I am able to influence my environment in order to meet my goals of pleasure and alleviating hunger (which helps survival).

The Variability of the Network

I think the complexity of the network is necessary because of the complexity of the environment around me. My particular path to survival relies on a higher (or at least different) intelligence than that of many other beings on the planet.

That doesn’t mean that another animal could not survive with a much more limited neural network. There are many animals that make do with much less complex ones and, I assume, less intelligence. But they have found a niche in which a more limited intelligence serves them very well in the survival game.

Plants do not seem to require a neural network at all, though it is clear they still have things like memory and perhaps agency. The networks of their bodies contain some type of intelligence, even if it operates at what we would consider a low level.

But if your main survival tactic is being more intelligent than the other beings on the planet, then a substantial neural net is required. The neural net somehow reflects a larger number of ways to influence and interpret the environment. The more complex the network, the better it is at establishing a wide range of values that can be interpreted to enhance survival.

Summing Up

There’s so much I don’t know. I need to read more of the literature on neural nets. But even there I know I’ll bump up against a great many unknowns, such as how our experience of reality–our qualia, if you like–emerges from the bumps, grinds and transformations of the “dumb” matter in our brains.

Still, this exercise has helped me refine my intuition on why intelligence is linked to networks, though there’s still a lot that I can’t explain short of referencing the magic and miraculous.

Why ChatGPT Is NOT Just a Fancy Autocomplete

I suspect one of the biggest myths of our time is that ChatGPT and its fellow large language models are just fancy autocomplete programs. This widespread impression could be blinding us to the true capabilities and power of these AIs, both now and in the future.

The Predict-Next-Word Method

As most people know by now, these generative pre-trained transformer (or GPT) large language models are built on the idea of predicting the next word in a sequence of words. That sounds simple, right?

So simple, in fact, that it’s led many people to conclude that these programs are not truly intelligent, much less sentient or conscious. All of that might, in fact, be true. Still, we should stop assuming they’re as simple as all that. We need to look beyond the “predict next word” methodology and consider the deep complexity of the resulting neural networks.
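For anyone unfamiliar with what “predict the next word” actually involves, here is the idea in miniature. This is a toy bigram counter, nothing like the transformer architecture the real models use, but it shows the basic loop of looking at the words so far and guessing what comes next:

```python
from collections import Counter, defaultdict

corpus = "the cat sat on the mat the cat ate the fish".split()

# Count how often each word follows each other word (a bigram table).
following = defaultdict(Counter)
for current, nxt in zip(corpus, corpus[1:]):
    following[current][nxt] += 1

def predict_next(word):
    """Return the word most often seen after `word` in the toy corpus."""
    return following[word].most_common(1)[0][0]

# Generate text by repeatedly predicting the next word and appending it.
word, sentence = "the", ["the"]
for _ in range(4):
    word = predict_next(word)
    sentence.append(word)

print(" ".join(sentence))  # e.g., "the cat sat on the"
```

A real GPT model replaces that count table with a neural network containing billions of adjusted weights and assigns a probability to every word in its vocabulary, but the outer loop is conceptually the same: look at the context, predict the next word, append it, repeat.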

Human Intelligence Was Built on a Simple Binary

Before getting into the details of the scaling hypothesis, which potentially sheds light on the “predict next word” issue, let’s discuss the origin of our own intelligence.

Human intelligence, such as it is, is based on one of the simplest binaries possible: reproduce or not. Our ancestors, the first living cells on the planet, did not need to be intelligent to survive. They just needed to figure out a way to reproduce before perishing. Even today, there are many organisms that are descendants of those first cells and probably no more intelligent than they were at the time.

Image: Binary operations as a black box (credit: Talonnn)

Then there’s us. Our intelligence was not inevitable. In fact, it is just one of an almost infinite number of paths to reproductive success.

So, when we say that the new AIs are only “fancy autocompletes,” consider that we are only fancy reproduction machines. You could even argue that predicting the next word in a sentence is a more complicated and difficult feat than the ones that sparked our own evolution.

So perhaps we should stop denigrating the “predict next word” challenge. That challenge is simply the evolutionary mechanism of these AIs. The ones that do the prediction best (that is, today’s GPT models) have survived into current versions, being trained, tweaked and calibrated by AI researchers to improve their success rates. The rest have been left behind. That may not, despite our helping hand, be all that different from our own path.

Prediction Machines

We don’t know how intelligent these new AIs are. They sometimes seem bafflingly bright, other times dumb and delusional. In that way, I suppose, they are a lot like people.

Of course, a lot of people will claim they know and promptly jump into rancorous debates on the subject (see Twitter or the comment sections of major newspapers). But even the builders of ChatGPT don’t seem sure. In fact, Ilya Sutskever, chief scientist of the OpenAI research group, tweeted at one point that “it may be that today’s large neural networks are slightly conscious.”

Slightly conscious? The fact we aren’t sure is the part that frightens some people (and by some people, I mean me). We are dealing with difficult cognitive and philosophical questions that, far from being relegated to the halls of academia, suddenly have very real implications and consequences.

What we do know is that the AIs are good at prediction. Indeed, this is at the heart of what they do. We also know that some thinkers believe that prediction is at the heart of our own cognition.

Remember Jeff Hawkins? He wrote, “The brain creates a predictive model. This just means that the brain continuously predicts what its inputs will be. Prediction isn’t something that the brain does every now and then; it is an intrinsic property that never stops, and it serves an essential role in learning. When the brain’s predictions are verified, that means the brain’s model of the world is accurate. A mis-prediction causes you to attend to the error and update the model.”
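A crude sketch of that predict-and-update loop (my own toy illustration, not Hawkins’ actual model) might look like this: the “model” is a single running estimate that gets nudged toward reality whenever a prediction misses.

```python
# A trivial "model of the world": one estimate of the next input,
# corrected a little each time the prediction is wrong.
model_estimate = 0.0
learning_rate = 0.3

observations = [10, 10, 10, 25, 25, 25]  # the environment changes midway

for observed in observations:
    prediction = model_estimate
    error = observed - prediction             # the mis-prediction
    model_estimate += learning_rate * error   # attend to the error, update the model
    print(f"predicted {prediction:5.1f}, saw {observed}, error {error:6.1f}")
```

The estimate drifts toward 10, is badly wrong the moment the environment shifts to 25, and then starts adapting again, which is roughly the rhythm Hawkins describes: constant prediction, occasional surprise, and a model update whenever the surprise arrives.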

Does that sound familiar? If prediction is what we do and what the GPTs do, perhaps a little humility is in order.

The Scaling Hypothesis

Now let’s go to a blog post by Gwern Branwen. Before I get into that, though, I’ll stipulate what’s no doubt obvious to any experts who might read this: this isn’t my world, not by a long shot. I stumbled onto Branwen’s blog only because Eliezer Yudkowsky mentioned him by name in his interview with AI researcher and podcaster Lex Fridman. I can’t vouch for the accuracy of the ideas in Branwen’s post, which is written in what strikes me as an idiosyncratic style.

If you want to truly judge and understand Branwen’s piece “The Scaling Hypothesis,” it’s best to read it yourself. But I’m going to highlight what seem (to me) to be its two key ideas.

In essence, his post is about the possibility that one can achieve an artificial general intelligence, or AGI, simply by scaling up a GPT model, even if the training objective of that model remains pretty simple, as in “predict the next word.”

Branwen writes, “The blessings of scale … support a radical theory: an old AI paradigm held by a few pioneers in connectionism (early artificial neural network research) and by more recent deep learning researchers, the scaling hypothesis. The scaling hypothesis regards the blessings of scale as the secret of AGI: intelligence is ‘just’ simple neural units & learning algorithms applied to diverse experiences at a (currently) unreachable scale. As increasing computational resources permit running such algorithms at the necessary scale, the neural networks will get ever more intelligent.”
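The empirical pattern behind those “blessings of scale” is often summarized as a smooth power law: the model’s error keeps falling in a predictable way as parameters (along with data and compute) grow. The numbers below are made up purely for illustration, but they show the shape of the claim:

```python
def toy_scaling_law(parameters, a=12.0, alpha=0.08, irreducible=1.5):
    """Illustrative power law (made-up constants): loss falls smoothly with scale."""
    return a * parameters ** -alpha + irreducible

for n in [1e6, 1e8, 1e10, 1e12]:
    print(f"{n:12.0e} parameters -> loss ~ {toy_scaling_law(n):.2f}")
```

The scaling hypothesis, as I read it, is the bet that this curve keeps paying off for a long time before it flattens out, so that bigger networks keep getting smarter without any clever new ideas being required.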

Bing’s Take

I had a discussion with the Bing chatbot (based on GPT-4) to help determine whether I was understanding this right. Here’s a piece of the conversation.

Me: So, is the idea that one can achieve an artificial general intelligence by scaling up the system, even if the training objective is fairly simple, such as “predict the next word”?

Bing: Yes, that is one interpretation of the scaling hypothesis. The idea is that by training ever larger networks on simple tasks such as language modeling (predicting the next word), the network will eventually learn to perform more sophisticated behaviors and generalize to a wide range of tasks. Would you like more information on this topic?

Me: Tell me more about the Scaling Hypothesis.

Bing: The Scaling Hypothesis is an idea put forward by Gwern Branwen that suggests that once we find a scalable architecture for artificial intelligence, we can simply train ever larger networks and ever more sophisticated behavior will emerge naturally as the easiest way to optimize for all the tasks and data.

According to this hypothesis, as we increase the size of the network and the amount of data and computation used to train it, the performance of the network will improve in a predictable and smooth manner. This means that even if we start with a simple task such as language modeling (predicting the next word), as we scale up the network it will eventually learn to perform more sophisticated behaviors and generalize to a wide range of tasks.

This hypothesis has been supported by recent developments in artificial intelligence, such as the success of large language models like GPT-3. These models have demonstrated an ability to perform a wide range of tasks despite being trained on a simple objective (predicting the next word).

How Far Can It Scale?

This suggests that the only limitation to an ever more intelligent super-mind is compute power and parameters. Now, that may be wrong. Perhaps we’ll someday hit an S-curve somewhere along the way. But, maybe not….

Branwen writes, “Depending on what investments are made into scaling [deep learning], and how fast compute grows, the 2020s should be quite interesting—sigmoid or singularity?”

Um, okay.

Do you think this is scary?

Wait, it gets worse — or better, depending on your point of view.

The Emergence of Agency

Let’s talk about “agency” in the context of AI. It refers to a program’s capacity to make choices and act of its own accord. In other words, it is the ability to exert control or influence over a situation.

Looking at the GPT models, Branwen writes, “GPT-3 clearly shows agency because it learns generative models of many agents, real or fictional, which can be used to ‘roleplay’—plan and take action which will steer environments into small goal regions of state-space; and this is not merely hypothetical, or confined to text transcripts of actions & results in its internal simulated environments but given effectors, like in the case of SayCan, a language model will in fact do such things in the real world.”

Okay, that’s a bit hard to parse, but let me give it a go. He’s saying that GPT-3, the model behind the ChatGPT we’ve come to know, demonstrates the ability to make “choices” (or something like them) and act on those choices. For example, when we ask it to take on the persona of a real or fictional character, it will make choices in the way it subsequently handles language.

Moreover, if you were to hook it up to a robot through a control method such as SayCan — which can generate natural language actions for a robot based on a user’s request — then it could take action in the real world. In other words, the robot could make something like choices and act accordingly.
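My loose understanding of that setup, sketched with entirely hypothetical stand-in functions (the real SayCan system uses a large language model and learned value functions where I’ve put toy scoring rules): the language model rates how relevant each available robot skill is to the request, a separate score rates how feasible that skill is right now, and the robot executes whichever skill has the best combined score.

```python
# Hypothetical skill library and scoring functions, for illustration only.
SKILLS = ["pick up the sponge", "go to the kitchen", "open the drawer"]

def language_model_score(request: str, skill: str) -> float:
    """Stand-in for an LLM's judgment of how relevant a skill is to the request."""
    words = skill.split()
    return sum(word in request for word in words) / len(words)

def affordance_score(skill: str) -> float:
    """Stand-in for a learned estimate of whether the skill can succeed right now."""
    return 0.9 if "sponge" in skill else 0.5

def choose_skill(request: str) -> str:
    # Combine "is this relevant?" with "can I actually do it?" and pick the best.
    return max(SKILLS, key=lambda s: language_model_score(request, s) * affordance_score(s))

print(choose_skill("please clean the spill with the sponge"))  # -> "pick up the sponge"
```

Whether or not the details match the real system, the point stands: once a language model’s outputs are wired to effectors, its “choices” stop being just text.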

The Robot Acts on Its Own

I’m not sure about the accuracy of this interpretation of GPT’s agency, but I think that’s approximately the idea. In a GPT model, agency is emergent. You don’t build it in. It’s an “ordinary continuum of capability.” Branwen concludes that “a very wide range of problems, at scale, may surprisingly induce emergent agency.”

In short, agency happens. It’s hard to remove from the AI. He claims, “The broader and more powerful a system is, the more the next feature or next piece of data may push it over the edge, and it becomes harder to engineer a system without that aspect.”

I don’t want to say that a GPT-enabled robot has “free will,” whatever that actually means. But it might naturally have its own sense of agency.

When AIs Break Bad, Who Is Responsible?

This is not, of course, the first time the topic of AI agency has arisen. Various papers have raised the question of whether AI systems (like robots) can make decisions on their own or whether they need humans to tell them what to do. One author argues that we need to think about what humans want an AI to do (that is, their human goals) when we try to figure out who is responsible for any mistakes the AI makes.

But others are starting to think about AIs as having moral agency aside from humans. In fact, a 2017 European Parliament report floated the idea of granting special legal status to robots that can learn, adapt, and act for themselves. “This legal personhood would be similar to that already assigned to corporations around the world,” reports Business Insider, “and would make robots, rather than people, liable for their self-determined actions, including for any harm they might cause.”

Thinking Uncomfortable Thoughts

How “smart” would a machine need to get before it has not just agency but moral responsibility for that agency?

I’ve no idea. We should note that Branwen’s blog post discusses GPT-3, and OpenAI has now moved past that. In fact, his post seems to have anticipated the latest scaling up. By some estimates, GPT-4 includes one trillion parameters, compared with 175 billion in GPT-3. Other estimates put it as high as 100 trillion parameters.

What are parameters? I don’t have a deep understanding myself, but they are essentially a measure of the complexity of these systems. Our World in Data defines parameters as “variables in an AI system whose values are adjusted during training to establish how input data gets transformed into the desired output; for example, the connection weights in an artificial neural network.”
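In concrete terms, parameters are just the adjustable numbers inside the network, mostly connection weights and biases. A quick sketch of how the count grows as a network widens (the layer sizes here are arbitrary, chosen only for illustration):

```python
def count_parameters(layer_sizes):
    """Weights between consecutive layers plus one bias per non-input neuron."""
    weights = sum(a * b for a, b in zip(layer_sizes, layer_sizes[1:]))
    biases = sum(layer_sizes[1:])
    return weights + biases

tiny_net = [784, 128, 10]               # e.g., a small digit classifier
print(count_parameters(tiny_net))       # 101,770 parameters in total

wider_net = [784, 4096, 4096, 10]
print(count_parameters(wider_net))      # roughly 20 million parameters
```

GPT-3’s 175 billion parameters are the same kind of adjustable numbers, just spread across a far larger and more intricately connected architecture.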

The more complex the network, the smarter the system. This sounds a lot like how the human brain works, though I’m sure many experts would call that a faulty and oversimplified analogy. Maybe so, but the size and sophistication of the AI reticulum does seem to matter an awful lot.

Therefore, for now, it makes a lot less sense to talk about these systems as fancy autocompletes and a lot more sense to talk about them as increasingly enormous networks (that happen to think at lightning speed). This may give us a much better idea of their intelligence or, if you prefer, their ability to mimic intelligence. Understanding the difference, if there is one, is among the most critical challenges of our day.

Postscript

If you’re seeking a more technical and detailed look into how ChatGPT works, I recommend Stephen Wolfram’s article “What Is ChatGPT Doing … and Why Does It Work?” It’s quite long but a compelling read if you want to grasp the mechanics of ChatGPT. He concludes, “What ChatGPT does in generating text is very impressive—and the results are usually very much like what we humans would produce. So does this mean ChatGPT is working like a brain? Its underlying artificial-neural-net structure was ultimately modeled on an idealization of the brain. And it seems quite likely that when we humans generate language many aspects of what’s going on are quite similar….[On the other hand], unlike even in typical algorithmic computation, ChatGPT doesn’t internally ‘have loops’ or ‘recompute on data.’ And that inevitably limits its computational capability—even with respect to current computers, but definitely with respect to the brain. It’s not clear how to ‘fix that’ and still maintain the ability to train the system with reasonable efficiency. But to do so will presumably allow a future ChatGPT to do even more ‘brain-like things.’”

Are Humans Still the Smartest Beings on the Planet?

These new AIs are smart. Or at least seem to be. After all, they are excelling at a wide range of tests typically used to gauge human knowledge and intelligence. Which leads me to ask, “Are humans still the smartest beings on the planet?”

Maybe Not

There are some reasonable and growing arguments that we’re no longer the most intelligent entities on the planet. Let’s go to the exams.

Even before OpenAI launched GPT-4, which is considered far more capable than GPT-3, a study looked at the AI’s ability to match humans in three key areas: general knowledge, SAT exam scores, and IQ.

The outcome? ChatGPT wound up in a higher percentile than humans in all three areas.

AI expert and author Dr. Alan D. Thompson suggests that GPT-3 displays an IQ above 120. If ChatGPT were a human being, it would fall into the “gifted” category, according to Thompson.

And then there’s GPT-4. OpenAI has published extensive data about how it performs on a wide range of exams. For example, the firm claims that the AI passes a simulated bar exam (that is, the one that tests the knowledge and skills attorneys should have before becoming licensed to practice law) “with a score around the top 10% of test takers,” which compares very favorably with GPT-3.5’s score of around the bottom 10%.

Maybe We Never Were

Of course, one might argue that we never really were the smartest beings on the planet. We don’t have a way to truly gauge the intelligence, for example, of the huge-brained whales, some of which live for up to 200 years.

I explored this in a previous post, so I won’t delve too deeply into the details. But the truth is that we can only guess at the intelligence of cetaceans such as the humpback, beluga and killer whales as well as various dolphins and porpoises.

Maybe the Question Makes No Sense

One of the interesting things about the large language model (LLM) AIs is that we’ve trained them on human language. Lots and lots of it. We are language-using animals par excellence. Now, we’ve harnessed machine learning to create tools that at least imitate what we do through the use of neural nets and statistics.

We don’t typically say that we are dumber than a calculator, even though calculators can handle mathematics much better than we typically can. Nor do we say we are “weaker” than a bulldozer. Perhaps we just shouldn’t apply the word intelligence to these reticular AI models. What they do and what we do may not be truly comparable.

Maybe So, For Now

I’m certainly no expert but have had considerable experience with ChatGPT and Bing chat. I was an early adopter in both cases and have seen how humblingly smart and yet puzzlingly dense they can be.

For example, I’ve had to convince ChatGPT that the year 1958 came well after World War II, and I’ve seen Bing be stubbornly wrong about prime numbers and basic multiplication. In other cases, I’ve asked Bing for information on a topic from the last week and it’s given me articles that are several years old.

As for the AI art generators, they are also amazing yet often can’t seem to count limbs or digits or even draw human hands in a non-creepy way.

In other words, there are times when these systems simply lack what we might consider common sense or fundamental skills. We can’t yet trust them to get the details right in every instance.

At the same time, of course, the LLMs are able to write rather good prose on virtually any topic of your choice in seconds. Imagine knowing just about everything on the Internet and being able to deftly and almost instantly weave that information together in an essay, story or even poem. We don’t even have a word for that capability. Savant would not cover it.

Once these systems truly develop “common sense,” however we define that, there will be precious few tasks on which we can best them. Perhaps they are still a long way from that goal, but perhaps not.

Maybe We’re Just Being Extended

In the past, I’ve written about the “extended human” and Kevin Kelly’s idea of the technium, which he discusses in his book What Technology Wants. Many people would not call any one of these LLM AIs a “being” at all. Rather, they’d say they are still just tools made up of silicon, fiber-optic cables and electronic blips of 0s and 1s, with no consciousness or even sentience at all. They are little more than mechanical parrots.

In this view, the LLMs are glorified search engines that put together word patterns with no more thought than a series of ocean waves creating elegant undulating patterns of sand on a beach. These machines depend on our words, ideas and works of art in order to “think” at all, so they are mere extensions of our own intellects: bulldozers of human symbols, so to speak.

Maybe It Doesn’t Matter

Maybe they will out-intellect us by wider and wider margins, but perhaps it doesn’t really matter if we are no longer the smartest entities on the planet.

For decades, some scholars have argued that we can’t compare our intellects to those of other beings: anthills and beehives, corvids and cephalopods, elephants and grizzly bears. Each animal’s intellect is uniquely good at the things that keep it alive.

Squirrels are geniuses at remembering where they’ve hidden their acorns and at negotiating the complexities of forest canopies. We can’t do what they do, but does that make us their inferiors?

No, comparing the two is nonsense, this argument goes.

The AIs will never be better humans than humans because we are uniquely ourselves. Perhaps the era of AIs will give us both the humility and the wisdom to finally understand this.

Which is all fine and well until, of course, the machines learn to dominate our world just as humanity has done in the recent past. If this happens, perhaps we will need to learn to live in their shadows, just as squirrels and crows and coyotes have lived in ours.

Feature image from CrisNYCa, April 17, 2018. Le Penseur (The Thinker) in the garden of Musée Rodin, Paris