AI Won’t Be Just for White-collar Workers

AR for Everybody!

Over the next six to twelve months, we’re going to see a surge in usage of and spending on augmented reality (AR) glasses enhanced by GPT-based artificial intelligence (AI). Why? Because blue-collar workers and prosumers will ultimately get a lot of utility out of them.

For example, the Bing chatbot will likely empower HoloLens 2 users. OpenAI’s GPT-4 will probably feed directly into Vuzix smart glasses. And Meta’s LLaMA AI will somehow be integrated into the not-yet-released Meta Quest 3 (formerly the Oculus Quest brand) and/or its Ray-Ban smart glasses.

Although Google recently discontinued its Glass Enterprise smart glasses, it will no doubt find AR glasses manufacturers to pair with Bard. Meanwhile, there’s no way Apple will pass up its own entry into the space.

Blue-Collar Work Gets Seriously Augmented

Why do I think this is going to happen? The main reason is that, unless there’s some huge crisis associated with one of these AIs in the meantime (it could happen), people at work finally have a legit reason to wear these gadgets, especially in blue-collar jobs.

I don’t mean that it’s going to happen overnight. It’ll start slowly and become more popular over time. But it’ll come because these AIs will all eventually be able to interpret images. In fact, GPT-4 already has this capability.

This potentially changes the working lives not only of various types of blue-collar employees but of avid prosumers as well.

Bing Creates a Scenario

I asked Bing to create a scenario in which an electronics mechanic uses a HoloLens that’s enabled with an AI-powered image recognition module. I specified the person is an experienced electronics mechanic who already knows how to address basic jobs such as fixing a resistor.

Here’s what Bing came up with:

Alice arrives at the customer’s house and inspects the drone that is lying on the table. She notices that it is a DJI Mavic Air 2, a model that she has not worked on before. She turns on her HoloLens headset and says:

“Hello Bing, I need your help with repairing this drone. It has a damaged circuit board. Can you show me a schematic diagram of this model?”

Bing responds: “Sure, Alice. I have found a schematic diagram of the DJI Mavic Air 2 from the official website. Here it is.”

Bing projects a holographic image of the drone’s circuit board on Alice’s headset, highlighting the different components and their functions.

Alice examines the image and compares it with the actual circuit board. She sees that one of the capacitors is burned out and needs to be replaced.

She says: “Bing, can you tell me the specifications of this capacitor?”

Bing responds: “This capacitor is a 100 microfarad electrolytic capacitor with a voltage rating of 16 volts. It is used to smooth out the voltage fluctuations from the battery.”

Alice says: “Thanks, Bing. Do I have one of these in my toolbox?”

Bing responds: “Yes, you do. It is in the third compartment from the left, labeled C3.”

Alice opens her toolbox and finds the capacitor. She uses her soldering iron to remove the old capacitor and replace it with the new one.

She says: “Bing, can you check if I have done it correctly?”

Bing responds: “Yes, you have. The capacitor is soldered correctly and there are no short circuits or loose connections.”

Alice says: “Great. Is there anything else I need to do?”

Bing responds: “No, that’s all. You have successfully repaired the drone’s circuit board. You can now test it by turning on the drone and checking its functions.”

Alice says: “Thank you, Bing. You have been very helpful.”

Bing responds: “You’re welcome, Alice. I’m glad I could assist you. Have a nice day.”

After Bing provided this scenario, I asked, “How would Bing know what is and isn’t in her toolbox?”

It replied, “Bing knows what is and isn’t in Alice’s toolbox because she has scanned and registered her tools with the HoloLens headset before using it. Bing can access the information from the cloud storage where Alice’s data is stored and synced. Bing can also use the graphics recognition module to identify the tools visually and match them with the specifications.”

Okay, I’m not even going to dwell on the uncanniness of Bing’s ability to write a piece of fiction and then seamlessly fill in any plot holes after the fact. The point is that we can see how an experienced professional might be able to use an AI-enhanced, image-recognition device.
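Mechanically, the toolbox trick Bing describes doesn’t require anything exotic. Here’s a minimal sketch, in Python, of how a registered tool inventory might be matched against a part spec coming out of an image-recognition module. The registry structure, field names and values are all my own invention, not anything from an actual HoloLens or Bing API.

```python
# Hypothetical sketch: matching a registered tool inventory against a part
# spec produced by an image-recognition module. All names and data here are
# illustrative, not from any real HoloLens or Bing API.

TOOLBOX_REGISTRY = {
    "C3": {"type": "electrolytic capacitor", "capacitance_uF": 100, "voltage_V": 16},
    "C1": {"type": "ceramic capacitor", "capacitance_uF": 10, "voltage_V": 50},
    "R7": {"type": "resistor", "resistance_ohm": 220, "tolerance_pct": 5},
}

def find_matching_part(needed: dict) -> str | None:
    """Return the compartment label whose registered part matches the spec."""
    for compartment, part in TOOLBOX_REGISTRY.items():
        if all(part.get(key) == value for key, value in needed.items()):
            return compartment
    return None

# The spec a vision module might extract from the burned-out capacitor.
needed_part = {"type": "electrolytic capacitor", "capacitance_uF": 100, "voltage_V": 16}
print(find_matching_part(needed_part))  # -> "C3"
```

The hard part, of course, is the image recognition and the hands-free interface, not the lookup itself.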

Start with a Smartphone

Most of this will be done by smartphones in the near future. But people often need to work with both hands, so wearing the device like a pair of eyeglasses makes more sense in the long run.

So, yes, smartphones will come first, smart glasses second. The first movers will adopt glasses quickly, the fast followers the next year, and within five years smart glasses will be commonplace.

The Weekend Mechanic

Then there’ll be the prosumers, the folks who want the latest gear to pursue their weekend hobbies, whether that’s working on their car, building their electronic doodads, doing home improvement jobs, or anything else you can think of.

In truth, the prosumer market will probably be more lucrative than the pro market because A) there are more prosumers, and B) professional mechanics and other skilled workers already know how to do their jobs. They’ve developed habits that will be hard to break. But that doesn’t mean they won’t come around if they really believe the tech will help them.

AR Finally Gets Useful

Up until now, most use cases for AR glasses have been pretty limited. With the rise of these new AIs, however, everyone gets access to an enormous electronic brain jam-packed with detailed information.

What’s more, the AIs themselves will get better over time at understanding the real-world problems inherent in these professions. And, unlike most of us, the AIs aren’t likely to forget the details of what they’ve learned over time.

Eventually there will be talk of robots doing this kind of work. That may happen, but long before then, there will be skilled workers as well as passionate weekend amateurs who are made better via AI-enabled smart glasses.

The Dangers of BS

But there’s also a danger here: all of these AIs still generate incorrect information at times. Some have even called them BS machines because they so confidently put forward made-up facts.

This is annoying for journalists, who must fact-check everything coming out of an AI. But it could potentially be deadly for people like mechanics, electricians and those who rely on their work.

So either these systems become more trustworthy over time (possibly by relying only on databases specific to certain professions), or workers must continue to rely on current quality assurance procedures, never putting full faith in AIs on the job. If the latter, some will argue against their usage in the first place, and they may carry the day in some professions or organizations. We’ll have to see how it plays out. I imagine there will be a tug of war in both directions.

A Summary of the “Godfather of artificial intelligence talks impact and potential of AI” Interview

This is an AI-enabled summary of an interview with cognitive psychologist and computer scientist Geoffrey Hinton. He’s played a big role in the development of computer neural networks and was the guest of Brook Silva-Braga on the CBS Saturday morning show. The YouTube video can be seen at the end of this summary. I added a couple of salient quotes that touch on the “alignment” problem. The art is by Bing’s Image Creator.

Hinton’s Role in AI History

Hinton discusses the current state of artificial intelligence and machine learning. He explains that his core interest is understanding how the brain works and that the current technique used in big models, backpropagation, is not what the human brain is doing. He also discusses the history of AI and neural nets, which he was a proponent of, and how neural nets have proven to be successful despite skepticism from mainstream AI researchers.

The video describes how ChatGPT has vastly more knowledge than any single person, thanks to its ability to absorb enormous amounts of data. That approach to language modeling was first proposed in 1986 and later surpassed traditional speech recognition methods thanks to advances in deep learning and pre-training techniques. Hinton’s background in psychology originally led him to neural networks, and his students’ research resulted in significant developments in speech recognition and object recognition systems.

Trepidations

The interview touches on various topics related to computer science and AI, such as the potential impact on people’s lives, the power consumption differences between biological and digital computers, and the use of AI technology in areas like Google search. Hinton also discusses the challenges of regulating the use of big language models and the need to ensure that AI is developed and used in a way that is beneficial to society (a need he doesn’t feel is being well met).

Silva-Braga: What do you think the chances are of AI just wiping out humanity? Can we put a number on that?

Hinton: It’s somewhere between 1 and 100 percent (laughs). Okay, I think it’s not inconceivable. That’s all I’ll say. I think if we’re sensible, we’ll try and develop it so that it doesn’t, but what worries me is the political situation we’re in, where it needs everybody to be sensible. There’s a massive political challenge, it seems to me, and there’s a massive economic challenge in that you can have a whole lot of individuals who pursue the right course, and yet the profit motive of corporations may not be as cautious as the individuals who work for them.

Hinton addresses the common criticism that large language models like GPT-3 are simply autocomplete models. He argues that these models need to understand what is being said to predict the next word accurately. In addition, they discuss the potential for computers to come up with their own ideas to improve themselves and the need for control. Hinton also addresses concerns about job displacement caused by these models, arguing that while jobs will change, people will still need to do the more creative tasks that these models cannot do.

Silva-Braga: Are we close to the computers coming up with their own ideas for improving themselves?

Hinton: Um, yes, we might be.

Silva-Braga: And then it could just go fast.

Hinton: That’s an issue we have to think hard about, how to control that.

Silva-Braga: Yeah, can we?

Hinton: We don’t know. We haven’t been there yet, but we can try.

Silva-Braga: Okay, that seems kind of concerning.

Hinton: Um, yes.

Overall, the interview provides insights into the current state and future of AI and machine learning, as well as the challenges and opportunities that come with their widespread use. It highlights the need for careful consideration and regulation to ensure that these technologies are developed and used in a way that benefits society.

To read a full transcript of the interview, go to the original YouTube page (click on the three horizontal dots and then select “Show transcript”).

Why ChatGPT Is NOT Just a Fancy Autocomplete

I suspect one of the biggest myths of our time is that ChatGPT and its fellow large language models are just fancy autocomplete programs. This widespread impression could be blinding us to the true capabilities and power of these AIs, both now and in the future.

The Predict-Next-Word Method

As most people know by now, these generative pre-trained transformer (or GPT) large language models are built on the idea of predicting the next word in a sequence of words. That sounds simple, right?

So simple, in fact, that it’s led many people to conclude that these programs are not truly intelligent, much less sentient or conscious. All that might, in fact, be true. Still, we should stop assuming they’re as simple as all that. We need to look beyond the “predict next word” methodology and consider the deep complexity of the resulting neural networks.
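To make that objective concrete, here’s a toy sketch in Python of what “predict the next word” amounts to: the model assigns a score to a handful of candidate tokens, a softmax turns the scores into probabilities, and one token is sampled. The context, candidates and logits here are all invented for illustration; a real GPT model computes those scores with billions of learned parameters.

```python
import math
import random

# Toy illustration of "predict the next word": the model assigns a score
# (logit) to each candidate token, softmax turns scores into probabilities,
# and one token is sampled. The numbers are invented for illustration.

context = "The capacitor is rated at 16"
candidates = {"volts": 4.1, "ohms": 1.2, "bananas": -3.0}  # made-up logits

def softmax(logits: dict[str, float]) -> dict[str, float]:
    exps = {tok: math.exp(score) for tok, score in logits.items()}
    total = sum(exps.values())
    return {tok: val / total for tok, val in exps.items()}

probs = softmax(candidates)
next_word = random.choices(list(probs), weights=probs.values(), k=1)[0]
print(probs)       # e.g. {'volts': ~0.95, 'ohms': ~0.05, 'bananas': ~0.001}
print(context, next_word)
```

Everything interesting, of course, lives inside how those scores get computed, which is where the enormous networks come in.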

Human Intelligence Was Built on a Simple Binary

Before getting into the details of the scaling hypothesis, which potentially sheds light on the “predict next word” issue, let’s discuss the origin of our own intelligence.

Human intelligence, such as it is, is based on one of the simplest binaries possible: reproduce or not. Our ancestors, the first living cells on the planet, did not need to be intelligent to survive. They just needed to figure out a way to reproduce before perishing. Even today, there are many organisms that are descendants of those first cells and probably no more intelligent than they were at the time.

Image: Binary operations as black box, by Talonnn

Then there’s us. Our intelligence was not inevitable. In fact, it is just one of an almost infinite number of paths to reproductive success.

So, when we say that the new AIs are only “fancy autocompletes,” consider that we are only fancy reproduction machines. You could even argue that the need to predict the next word in a sentence is a more complicated and difficult feat than the ones that sparked our own evolution.

So, perhaps we should stop denigrating the “predict next word” challenge. That challenge is just the evolutionary mechanism of these AIs. The ones that do that prediction best (that is, today’s GPT models) have survived into current versions, being trained, tweaked and calibrated by AI researchers to improve their success rates. The rest have been left behind. That may not, despite our helping hand, be all that different from our own path.

Prediction Machines

We don’t know how intelligent these new AIs are. They sometimes seem bafflingly bright, other times dumb and delusional. In that way, I suppose, they are a lot like people.

Of course, a lot of people will claim they know and promptly jump into rancorous debates on the subject (see Twitter or the comment sections in major newspapers). But even the builders of ChatGPT don’t seem sure. In fact, Ilya Sutskever, chief scientist of the OpenAI research group, tweeted at one point that “it may be that today’s large neural networks are slightly conscious.”

Slightly conscious? The fact we aren’t sure is the part that frightens some people (and by some people, I mean me). We are dealing with difficult cognitive and philosophical questions that, far from being relegated to the halls of academia, suddenly have very real implications and consequences.

What we do know is that the AIs are good at prediction. Indeed, this is at the heart of what they do. We also know that some thinkers believe that prediction is at the heart of our own cognition.

Remember Jeff Hawkins? He wrote, “The brain creates a predictive model. This just means that the brain continuously predicts what its inputs will be. Prediction isn’t something that the brain does every now and then; it is an intrinsic property that never stops, and it serves an essential role in learning. When the brain’s predictions are verified, that means the brain’s model of the world is accurate. A mis-prediction causes you to attend to the error and update the model.”

Does that sound familiar? If prediction is what we do and what the GPTs do, perhaps a little humility is in order.
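For what it’s worth, the predict-compare-update loop Hawkins describes can be sketched in a few lines of code. This is only a toy stand-in (a simple delta-rule update on an invented sensor stream), not his actual cortical theory, but it shows the shape of the idea: predict the next input, notice the error, adjust the model.

```python
# Toy sketch of a predict-compare-update loop in the spirit of the Hawkins
# quote above. A single estimate predicts the next sensory reading and is
# nudged whenever the prediction misses. This delta-rule update is a
# stand-in for illustration, not Hawkins's cortical model.

readings = [10.0, 10.5, 11.0, 11.4, 11.9, 12.3]  # invented sensor stream
estimate, learning_rate = 0.0, 0.5

for observed in readings:
    prediction = estimate               # the "model" predicts its next input
    error = observed - prediction       # mis-prediction
    estimate += learning_rate * error   # attend to the error, update the model
    print(f"predicted {prediction:5.2f}, saw {observed:5.2f}, error {error:+.2f}")
```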

The Scaling Hypothesis

Now let’s go to a blog post by Gwern Branwen. Before I get into that, though, I’ll stipulate what’s no doubt obvious to any experts who might read this: this isn’t my world, not by a long shot. I stumbled onto Branwen’s blog only because Yudkowsky mentioned him by name in his interview with AI researcher and podcaster Lex Fridman. I can’t vouch for the accuracy of the ideas in Branwen’s post, which is written in what strikes me as an idiosyncratic style.

If you want to truly judge and understand Branwen’s piece “The Scaling Hypothesis,” it’s best to read it yourself. But I’m going to provide a few highlights of what seem (to me) to be two key ideas.

In essence, his post is about the possibility that one can achieve an artificial general intelligence, or AGI, simply by scaling up a GPT model, even if the training objective of that model remains pretty simple, as in “predict the next word.”

Branwen writes, “The blessings of scale … support a radical theory: an old AI paradigm held by a few pioneers in connectionism (early artificial neural network research) and by more recent deep learning researchers, the scaling hypothesis. The scaling hypothesis regards the blessings of scale as the secret of AGI: intelligence is ‘just’ simple neural units & learning algorithms applied to diverse experiences at a (currently) unreachable scale. As increasing computational resources permit running such algorithms at the necessary scale, the neural networks will get ever more intelligent.”

Bing’s Take

I had a discussion with the Bing chatbot (based on GPT-4) to help determine whether I was understanding this right. Here’s a piece of the conversation.

Me: So, is the idea that one can achieve an artificial general intelligence by scaling up the system, even if the training objective is fairly simple, such as “predict the next word”?

Bing: Yes, that is one interpretation of the scaling hypothesis. The idea is that by training ever larger networks on simple tasks such as language modeling (predicting the next word), the network will eventually learn to perform more sophisticated behaviors and generalize to a wide range of tasks. Would you like more information on this topic?

Me: Tell me more about the Scaling Hypothesis.

Bing: The Scaling Hypothesis is an idea put forward by Gwern Branwen that suggests that once we find a scalable architecture for artificial intelligence, we can simply train ever larger networks and ever more sophisticated behavior will emerge naturally as the easiest way to optimize for all the tasks and data.

According to this hypothesis, as we increase the size of the network and the amount of data and computation used to train it, the performance of the network will improve in a predictable and smooth manner. This means that even if we start with a simple task such as language modeling (predicting the next word), as we scale up the network it will eventually learn to perform more sophisticated behaviors and generalize to a wide range of tasks.

This hypothesis has been supported by recent developments in artificial intelligence, such as the success of large language models like GPT-3. These models have demonstrated an ability to perform a wide range of tasks despite being trained on a simple objective (predicting the next word).
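To get a feel for what “predictable and smooth” improvement means, here’s a toy sketch of a power-law scaling curve, where loss falls steadily as parameter count grows. The constants are invented for illustration; real scaling-law papers fit them empirically, and the real picture is more involved. The point is the smooth, monotonic decline.

```python
# Toy illustration of "smooth, predictable" improvement with scale:
# loss falls as a power law in parameter count, L(N) = (Nc / N) ** alpha.
# The constants below are hypothetical, chosen only for illustration.

N_C = 8.8e13   # hypothetical scale constant
ALPHA = 0.076  # hypothetical exponent

def toy_loss(num_parameters: float) -> float:
    return (N_C / num_parameters) ** ALPHA

for n in [1e8, 1e9, 1e10, 1e11, 1e12]:
    print(f"{n:12.0e} parameters -> toy loss {toy_loss(n):.3f}")
```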

How Far Can It Scale?

This suggests that the only limitation to an ever more intelligent super-mind is compute power and parameters. Now, that may be wrong. Perhaps we’ll someday hit an S-curve somewhere along the way. But, maybe not….

Branwen writes, “Depending on what investments are made into scaling [deep learning], and how fast compute grows, the 2020s should be quite interesting—sigmoid or singularity?”

Um, okay.

Do you think this is scary?

Wait, it gets worse — or better, depending on your point of view.

The Emergence of Agency

Let’s talk about “agency” in the context of AI. It refers to a program’s capacity to make choices and act on its own initiative. In other words, it is the ability to exert control or influence over a situation.

Looking at the GPT models, Branwen writes, “GPT-3 clearly shows agency because it learns generative models of many agents, real or fictional, which can be used to ‘roleplay’—plan and take action which will steer environments into small goal regions of state-space; and this is not merely hypothetical, or confined to text transcripts of actions & results in its internal simulated environments but given effectors, like in the case of SayCan, a language model will in fact do such things in the real world.”

Okay, that’s a bit hard to parse, but let me give it a go. He’s saying that GPT-3, as we’ve come to know it, demonstrates the ability to make “choices” (or something like them) and act on those choices. For example, when we ask it to take on the persona of a real or fictional character, it will make choices in the way it subsequently handles language.

Moreover, if you were to hook it up to a robot through a control method such as SayCan — which can generate natural language actions for a robot based on a user’s request — then it could take action in the real world. In other words, the robot could make something like choices and act accordingly.
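As I understand the published description of SayCan, the selection works roughly like this: the language model scores how relevant each of the robot’s skills is to the user’s request, a separate affordance model scores how likely each skill is to succeed given what the robot currently sees, and the robot executes the skill with the highest combined score. Here’s a hedged sketch of that idea; the skills and all the numbers below are made up.

```python
# Hedged sketch of SayCan-style action selection: combine an LLM's relevance
# score for each skill with an affordance model's estimate of success in the
# current state, then pick the skill with the highest product.
# All skills and numbers are invented for illustration.

request = "I spilled my drink, can you help?"

llm_relevance = {            # hypothetical LLM scores for each skill
    "find a sponge": 0.60,
    "pick up the sponge": 0.25,
    "go to the kitchen": 0.10,
    "pick up the drill": 0.01,
}
affordance = {               # hypothetical success estimates in this state
    "find a sponge": 0.90,
    "pick up the sponge": 0.20,   # no sponge visible yet
    "go to the kitchen": 0.95,
    "pick up the drill": 0.80,
}

def pick_skill(relevance: dict[str, float], success: dict[str, float]) -> str:
    return max(relevance, key=lambda skill: relevance[skill] * success[skill])

print(pick_skill(llm_relevance, affordance))  # -> "find a sponge"
```

The combined score is what keeps the robot from confidently doing something plausible-sounding but useless in its current situation.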

The Robot Acts on Its Own

I’m not sure about the accuracy of this interpretation of GPT’s agency, but I think that’s approximately the idea. Via a GPT model, agency is emergent. You don’t build it in. It’s an “ordinary continuum of capability.” Branwen concludes that “a very wide range of problems, at scale, may surprisingly induce emergent agency.”

In short, agency happens. It’s hard to remove from the AI. He claims, “The broader and more powerful a system is, the more the next feature or next piece of data may push it over the edge, and it becomes harder to engineer a system without that aspect.”

I don’t want to say that a GPT-enabled robot has “free will,” whatever that actually means. But it might naturally have its own sense of agency.

When AIs Break Bad, Who Is Responsible?

This is not, of course, the first time the topic of AI agency has arisen. Various papers have asked whether AI systems (robots, for example) can make decisions on their own or whether they need humans to tell them what to do. One author argues that we need to think about what humans want an AI to do (that is, the humans’ goals) when we try to figure out who is responsible for any mistakes the AI makes.

But others are starting to think about AIs as having moral agency independent of humans. In fact, a 2017 European Parliament report floated the idea of granting special legal status to robots that can learn, adapt, and act for themselves. “This legal personhood would be similar to that already assigned to corporations around the world,” reports Business Insider, “and would make robots, rather than people, liable for their self-determined actions, including for any harm they might cause.”

Thinking Uncomfortable Thoughts

How “smart” would a machine need to get before it has not just agency but moral responsibility for that agency?

I’ve no idea. We should note that Branwen’s blog post discusses GPT-3, the model behind the original ChatGPT. OpenAI has now moved past that. In fact, his post seems to have anticipated the latest scaling up. By some estimates, GPT-4 includes one trillion parameters, compared with just 175 billion in GPT-3. Other estimates put it at up to 100 trillion parameters.

What are parameters? I don’t have a deep understanding myself, but they are essentially a measure of these systems’ complexity. Our World in Data defines parameters as “variables in an AI system whose values are adjusted during training to establish how input data gets transformed into the desired output; for example, the connection weights in an artificial neural network.”
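Here’s a minimal sketch of what gets counted: every connection weight and every bias in a network is one adjustable parameter. The layer sizes below are tiny and invented; GPT-class models stack thousands of far wider layers to reach the billions (or trillions) being discussed.

```python
# Minimal sketch of counting parameters: every connection weight and bias
# in the network is one adjustable number. Layer widths are invented.

layer_sizes = [512, 2048, 512]  # illustrative layer widths

def count_parameters(sizes: list[int]) -> int:
    total = 0
    for fan_in, fan_out in zip(sizes, sizes[1:]):
        total += fan_in * fan_out   # connection weights between the layers
        total += fan_out            # one bias per output unit
    return total

print(count_parameters(layer_sizes))  # 2,099,712 adjustable numbers
```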

The more complex the network, the smarter the system. This sounds a lot like how the human brain works, though I’m sure many experts would claim that’s both a faulty and oversimplistic analogy. Maybe so, but the size and sophistication of the AI reticulum does seem to matter an awful lot.

Therefore, for now, it makes a lot less sense to talk about these systems as fancy autocompletes and a lot more sense to talk about them as increasingly enormous networks (that happen to think at lightning speed). This may give us a much better idea of their intelligence or, if you prefer, their ability to mimic intelligence. Understanding the difference, if there is one, is among the most critical challenges of our day.

Postscript

If you’re seeking a more technical and detailed look into how ChatGPT works, I recommend Stephen Wolfram’s article “What Is ChatGPT Doing … and Why Does It Work?” It’s quite long but a compelling read if you want to grasp the mechanics of ChatGPT. He concludes, “What ChatGPT does in generating text is very impressive—and the results are usually very much like what we humans would produce. So does this mean ChatGPT is working like a brain? Its underlying artificial-neural-net structure was ultimately modeled on an idealization of the brain. And it seems quite likely that when we humans generate language many aspects of what’s going on are quite similar….[On the other hand], unlike even in typical algorithmic computation, ChatGPT doesn’t internally ‘have loops’ or ‘recompute on data.’ And that inevitably limits its computational capability—even with respect to current computers, but definitely with respect to the brain. It’s not clear how to ‘fix that’ and still maintain the ability to train the system with reasonable efficiency. But to do so will presumably allow a future ChatGPT to do even more ‘brain-like things.’”