Go to series

Accept our marketing cookies to access this content.

These cookies are currently disabled in your browser.

Davos 2024: How to Trust Technology

Scroll down for full podcast transcript - click the ‘Show more’ arrow

AI and immersive technologies will fundamentally change how humanity interacts with society, government and even the environment.

How can we meet the challenge presented by the complex risks we face while building trust in our technological future?

This is the full audio from a session at the World Economic Forum’s Annual Meeting 2024 on January 17, 2024.

Speakers:

Mustafa Suleyman, Chief Executive Officer, Microsoft AI
Ben Thompson, Founder, Stratechery
Ayanna Howard, Dean of Engineering, Ohio State University

Watch it here: https://www.weforum.org/events/world-economic-forum-annual-meeting-2024/sessions/town-hall-how-to-trust-technology/

Links:

World Economic Forum Centre for the Fourth Industrial Revolution:

https://centres.weforum.org/centre-for-the-fourth-industrial-revolution/home

AI Governance Alliance:

https://initiatives.weforum.org/ai-governance-alliance/home

Check out all our podcasts on wef.ch/podcasts:

YouTube: - https://www.youtube.com/@wef/podcasts

Radio Davos - subscribe: https://pod.link/1504682164

Meet the Leader - subscribe: https://pod.link/1534915560

Agenda Dialogues - subscribe: https://pod.link/1574956552

Join the World Economic Forum Podcast Club: https://www.facebook.com/groups/wefpodcastclub

Podcast transcript

This transcript has been generated using speech recognition software and may contain errors. Please check its accuracy against the audio.

Ben Thompson: Welcome to this town hall about how to trust technology, particularly in an age of AI. I added that last little bit but unsurprisingly, AI has been a major focus both here in Davos and in the world generally and I suspect that will be the case for basically forever.

So, this townhall is designed to be a forum for discussion and an opportunity for transparency. To that end, the primary driver of the discussion is the audience. So both you, here in the room and also online. After initial opening statements from our panelists who I will introduce in a moment, we will open the floor to questions.

Those of you in the room, you can simply raise your hand, we'll bring a mic to you, you can ask your question. Those of you online, you can use this app called Slido, which you can access at a link on the website for this session. As a quick introduction to Slido and as a way to sort of ground the discussion there are two poll questions that you can answer right now.

The two are first, in the past year, has your trust in technology increased or decreased? And second, in the past year, has your trust in organizations in general, not just in tech, increased or decreased. We will come back to the results in a bit.

Please note that you can also vote on the questions other people submit online to surface the best choices. For now though, let me introduce the panelists who will be answering your questions.

Over here on the right we have Dr. Ayanna Howard is the Dean of Engineering at the Ohio State University. She holds a faculty appointment in the college's department of electrical and computer engineering, as well as computer science and engineering. Before Ohio State, Dr. Howard was at Georgia Tech and before that NASA, where she worked on designing advanced technologies for future Mars rover missions. Dr Howard's research focuses on AI, technologies and robotics. And I think all of those are certainly very pertinent to the broader discussion and the discussion for this panel.

Here on my right is Mustafa Suleyman. Mustafa is co-founder and CEO of Inflection AI, an AI-first consumer software company. Before that he co founded DeepMind – a leading AI company that was acquired by Google and is integral to Google's AI efforts. As a part of Google, Mustafa was responsible for integrating the company's technology across across a wide range of Google's products. Mustafa is also a newly published author of a book called The Coming Wave published by crown in the US, September 2023. I think you can all guess what the book is about.

So with that, I would like to pass it on to our panelists to give sort of a brief opening statement about the topic in general about what they're working on. And so Dr Howard?

Ayanna Howard: All right, so he said brief but you know, I'm an academic so brief is all relative. So when I think about technology and trust, I think about the research that I do. So back in 2011, my research was focused on how do you evaluate trust, human trust, with respect to robotics in high critical time-sensitive scenarios. And so we picked emergency evacuation.

And so we had done scenarios where we would have this room and we have people interact and the fire alarms would go off and what would you do; you exit, as you exited, the building would be filled with smoke and there'll be a robot that would guide you to the exit. And we intentionally made the robot not go toward an exit – so you can see the exit signs – and we would have people go to other places.

And what we found out over and over and over again, even when we introduced mistakes, even when the robot had bad behaviour, people would follow the robot. And so when we looked at this to say what's going on, we actually found that people actually trust or overtrust technology, in that they believe it works most of the time. And when it doesn't, that's when they then swing to the "Oh, like aeroplanes should always fly, if they crash, OK, what's going on? It's the developers, it's the companies, we need a ban and make sure that no one ever flies again." And so we have this overreaction.

And so really thinking about technology and trust is how do we deal with the mistakes, not necessarily of the fact that we under trust, we tend to overtrust, in essence.

Mustafa Suleyman: Yeah, I think this is a critical topic for LLMs. Because we're still at the very earliest stages of developing these models. I mean, they are inherently probabilistic models. So as an intuitive grasp, compared to the past wave of technology, it's very important to separate what we have software do in the past, which was you input some data to a database and then you make a call of that database, and you collect pretty much the same information out of that database.

Whereas here, of course, with LLMs, you can ask the same question of the model three or four or five times and get a slightly different response. And many people have referred to this as hallucinations. I actually think about it more as creativity. This is one of the strengths of these models, that they produce a wide range of different responses that you never anticipated before. And that's exactly the magic that we've always wanted software to produce.

The downside is that our past mental model of default trusting technology, as you said, doesn't really apply in this situation. And so I think, at this moment, we have to be incredibly critical, sceptical, doubtful, ask tough questions of our technology. And you know, we can talk about the ways to do that today.

I think there's two elements that will drive trust in this moment, as we develop more consumer-facing applications. The first is, of course, the extent to which models are factual, it's IQ, right? And that's actually something that we can formally measure. So there are lots and lots of benchmarks today that try to evaluate the freshness and the factuality of models. And what we've seen in the last three or four years or so, is that there's been a steady and quite predictable increase in the accuracy of the models as they get larger.

The second component is the emotional connection that you have with a model. What is its EQ like? How fluent and conversational is it? How kind and respectful is it? Does it reflect your values? Or does it antagonize you? Is it formulaic? Or is it adaptive? And of course, many of us like to think that we're rational actors that make decisions based on knowledge and facts all the time. But in fact, we largely make decisions using our limbic system. And it's our gut that actually drives a lot of our decision-making. And that's going to be the same in technology, especially now that LLMs are these dynamic, interactive modes.

So we have to think very carefully about what values are in the emotional side of the model. Who gets to oversee them, how are they programmed to whom are they transparent and accountable? And those are the kinds of sceptical and critical questions that we should be asking of all new consumer technologies.

Ben Thompson: Dr Howard, I'm actually quite curious because you've worked in this field for so long, from robotics to AI and this shift that Mustafa's referring to, going from being deterministic to probabilistic, how has that shifted or changed the way you thought about these questions for both AI and robotics?

Ayanna Howard: So, I actually think, around the human aspect and so I just want to ask a question, how many of you have used, you know, ChatGPT, or Bard or some equivalent of that? Okay, that's, I would say, 100%. How many of you have used it in any form of function like to actually do a job do a work? OK, it hasn't changed, right. We know that there's mistakes, we know that it's not perfect. We know that, you know, lawyers put in briefs that are written by ChatGPT and they get slammed by judges, because it's incorrect. It really hasn't changed.

I think what's changed is that we now understand the risks. We haven't yet figured out how do we address that as a society because it is so useful, it is so valuable. And so when it's right, it really does make our work life better. But when it's wrong and we aren't using our human EQ to correct it, we have things that go bad. And my thing is, how do we think about the human EQ aspect and blend that into the tools? I actually think the robots in AI should say, "you know what, you've been using me too long, we're done today." I actually believe that.

Ben Thompson: From your perspective, how is this communicated? Like, how do you. You know, Dr Howard is concerned that people are too trusting; in our poll questions, we have the increase or decrease in trust in technology. And I actually asked to add in also, what's your increase and decrease relative to general institutions? Because it does feel like you're sort of getting at this point of revealed versus stated preferences.

People will talk about old tech as bad big tech XYZ and people use it all the time. And to your point, they really do implicitly trust it to a great extent. Is this something that needs to be communicated to people? Or is it just on technologists to get it to a state where that's okay and it will be we'll be right sufficiently enough at the time?

Mustafa Suleyman: I think both are true, right? So on the one side, so we created an AI called PI, which stands for Personal Intelligence. I'm a big believer that everybody in the world will have their own personal AI in the future and it will become your aid, your chief of staff, your teacher, your support as you navigate through life. And we made a couple of important decisions related to what Ayanna said.

Number one is on every screen we leave in place a system reminder that you shouldn't trust what the AI says, that you should be sceptical and ask critical questions. It's a small thing, but it's a constant ever-present reminder, it changes form and structure in UI, so hopefully, people don't get desensitized to it.

The second thing is, PI itself will remind you after a 30-minute interaction, that it's been pretty long now, like, how do you feel about getting back to the real world, right? And it would gently ask you in a very non-judgmental and polite, respectful kind way to start thinking about the world around you. And the cool thing about it is that instead of just having a system notification, it actually comes from the source of the interaction

So I think both of those things are really important. But at the same time, people get fixated on the technical progress that we have at this moment in time and don't rush into the curve that we're on. We're on an unbelievable trajectory, it is truly magical. And even though everyone's played with ChatGPT and spent a year now talking about AI, I still feel like we're not fully internalizing what's actually happening.

So maybe just one possible intuition. Everyone's used an LLM of some sort, maybe for a real-world practical use case. The latest models are approximately human-level performance across a wide range of knowledge-based tasks, right. They can produce very high-quality content. Two generations ago, so GBT3, and GPT2, that was 100x less compute. So each generation has 10 times more computation. So it's significantly larger. And 100x ago, these models were completely incoherent, they produced totally in factual, completely made up not even a proper sentence, let alone factually accurate.

So you have to sort of try to extrapolate when we train the next two generations, what capabilities are going to emerge? Right? My bet is a couple of things. One, we're going to largely eliminate hallucinations. These factual inaccuracies are going to go from today, when they're hovering around sort of 80, 85%, all the way up to 90, 99 to 99.9%, right, in order of magnitude improvement over the next three years.

The second thing that I think in terms of capabilities that are gonna emerge is that at the moment, these are one shot question answer engines, you ask something that gives you an output, it's kind of like a one shot prediction that's accurate.

In the next two orders of magnitude, the model will be able to produce an entire string of accurate predictions in sequence, some of which are code, some of which are images, some of which is text. And that is going to produce what is much more like a project plan, or an entire analyst's briefing on a complex topic or a series of API calls to a whole range of different third-party services. And that's going to be truly transformational.

That's when the AI will be able to take actions on our behalf. And I think there'll be a whole different trust question in that environment.

Ben Thompson: One more question, and then prepare your questions, I will hand it off. But one aspect of an exponential curve is yes, we are thinking about sort of the steep part. It's steep now. And it's going to get steeper. But before the steep part, it's very flat for a very, very long time.

And I'm just curious, we were sort of talking before the panel, Dr Howard about like, what has it been like to sort of over the last last year in particular, for the last few years, suddenly, everyone is an AI expert. And it's sort of been like you're hacking away at this question for such a very long time. Like, what's it like to, is everyone welcome to the party? Or what does that feel like?

Ayanna Howard: Well in some aspects, you become the cool kid on the block. But the other aspects, it does worry me. And I think about, I'm going to turn the dial back. Back in the days when electricity was born. We actually had a lot of inventions that were like, "Oh, we can do this. We can do incandescent lamps." And, you know, light bulbs actually exploded in people's houses, like this was a thing. And then it was like, "Oh, but we have light. But yes, there's danger."

And it took a while before the rules and the regulations and UL and CE came about, and said, OK, if you are an inventor in this space, there are some rules of how you do this, you have to have some certification, you have to have some validation. We don't have that in AI. And so basically, anyone can participate and create and hook it up to a machine. And it's like, oh, this is good. And we have people who don't know what they're doing that are selling to consumers that also are trusting like, "Oh, well, this company has got a VC investor. Yeah, let's bring it in."

That's what worries me.

Ben Thompson: We do have our initial poll results in the past year, has your trust in technology increased or decreased, stay the same high. That's what I voted for personally. Decreased, however, on its heels, a 30%, increase, 70% and hard to say, 18. Do we have the second question by chance? We might have to wait to get to that one.

But it is interesting to consider what might be relative speaking for you two, has the advent of these large language models. And I'm going to limit it to the ChatGBT era because I think that's what is sort of a woke, you know, everyone else woke up to it, you two have been working on it for a very long time. From your perception on the inside, where would you answer this question? This is the organization one about your trust in technology.

Mustafa Suleyman: I'm one of the weirdos who have been working on this for 13 years whilst it was flat. And we were making like very modest progress and you had to get really overexcited about a very tiny, relatively tiny breakthrough, right? And so, to me, I trust them more and more like way more than I thought I would two years ago. And you look at the quality, I mean, clearly, talking to one of these models, like is magic, I mean provides you with access to knowledge and a fluent conversational style, and iterative back and forth.

I'm using it all the time to ask questions that I didn't even know I wanted to know about. Now, it's sort of because I've reduced the barrier to entry to access information in my mind, I've started to kind of like subconsciously condition myself to ask more things that I would have asked a year ago or two years ago because I subconsciously would have thought I'm gonna have to Google it, look through 10 blue links, go to the webpage. And it's like the bar is quite high to ask the question there now. So that's made me more trusting of it, actually because I can now ask more wide-ranging interesting questions.

Ben Thompson: More or less trust?

Ayanna Howard: Of technology?

Ben Thompson: Yes.

Ayanna Howard: I'm actually more trustful. And I would say that one because I can query it. I actually know what it's doing. And I'm actually more positive of what's in the blackbox because I know exactly what's going on.

Ben Thompson: Right.

Ayanna Howard: So more of it's because I'm not worried about it explaining itself. And I understand that developers that are part of this because they we've all grown up together. Organizations is a whole other question.

Ben Thompson: Are there any questions in the audience? Yes. We can start here in the corner.

Audience member 1: You know, what I still gotta understand, by the way I loved your book, but I still cannot understand it. Why is Silicon Valley so obsessed with AGI, especially big companies? I mean, if that same level of obsession could be maybe put on solving climate change, or, you know, getting manufacturing fully automated, so we could see that finally, the promise of productivity deliver? Wouldn't that be more useful?

Mustafa Suleyman: Yeah, it's a great question. And I hear and share some of your scepticism. But let me make the bull case for you. So that, you know, to answer your question. So look, I think what is amazing about us as a species and unique about us, as humans, is that we can absorb vast amounts of very abstract and very strange and contradictory pieces of information; we can digest it, process it, we can reason over it, we can generate new examples of that and imagine and be creative and inventive. And we use that to produce tools and that is the essence of civilization.

So the AGI quest, is really a game to capture the essence of that learning process and apply it to all of the big challenges that we face this century, from food to climate, to transportation, to education and health. I mean, that's really what's at the heart of the motivation, I think of a lot of people, there is a group of people who are just, you know, slightly evangelical kind of trans-humanists, who think that there's a natural evolution; I'm absolutely against that I'm very much a humanist.

And I think this is going to be a moment when we have to make some decisions as a civilization about what kinds of agents with what sort of capabilities and powers we allow in society, right?

Because very soon, in the next three to five years, there will be AIs that are really equivalent to digital people. I mean, they will have the smarts that we have, they'll be able to speak to you and I just as we're having this conversation right now and many people will ascribe self-consciousness or awareness to those models. And those models may sometimes say that they suffer. So many humans are going to feel very empathetic to those claims that this AI is suffering and therefore we shouldn't switch it off. Or therefore maybe we should give it certain rights and protections that humans have.

And I think that's going to if we just treat that as a philosophical debate, that's going to be very, very hard for the humanists to win. And I think that we should really draw a line in the sand and say, Look, you know, society today is for humans only. And that's how it should remain, just as we say this here that AI is shouldn't participate in elections.

Why do we say that sort of an arbitrary choice, at one level and AI could be really useful in an election, it could actually provide very factual information to everybody. But I think it's a principled decision to say, democracy for all of its flaws, is for humans, it should remain for humans, AI shouldn't campaign or participate or election year or persuade, and we just should deal with the consequences of human weaknesses in the electoral process for exactly this reason about the future of AGI.

Ayanna Howard: Actually, I'll answer I think I'm a little bit of a pessimist. My feeling is, I'm outside of Silicon Valley, but my feeling are two, one is that the organization that creates true AGI will control the world, unless we have some regulations and rules. So that's one. And so if you think about just the philosophy of creating is like first to the ship, gets on the boat and wins.

The second is that as humans we fundamentally want to create. An AGI represents that ability to not just create our physical but also create our mind. And so I think if you think about the human essence, it's just the natural evolution of we as people of creating something that grows that learns from us. But that is the next rendition of what we can do. And so those are kind of the two things that I've seen as philosopher.

Mustafa Suleyman: That's a great point.

Ben Thompson: One question I have, though, just to jump in, particularly with your background in robotics, it seems to me, is there a line between the digital world and the physical world, or one of the questions we have here is you already want to lose our jobs.

And I think particularly for anyone that is in a digital space, that is a very sort of pressing concern but at the same time, it seems to be from the outside. I'm no robotics expert. That's what we have here. Is there a much further runway between, say an AGI that operates in digital space versus one that operates in the physical world?

Ayanna Howard: I think in terms of the function, yes, but I will say the worlds are starting to converge quite rapidly, like amazingly rapidly. It used to be I was a roboticist and AI folks, those were software people that actually didn't know hardware. And they would say all your hardware, your robots are really stupid. And now we have a nice blend, especially with being able to connect to the cloud and actually learn almost in real time.

So at some point, it's just not going to be a digital persona, it is going to be a physical. And we don't necessarily have the skills or tools to really think about what that world looks like.

Ben Thompson: But there is the constraint though that you have to actually manufacture robots that it is not like digital information which endlessly duplicate or something along those lines. I mean, what's your perspective?

Mustafa Suleyman: Yeah, no, I think you're right, that is a constraint, which means the robotics is sort of going to remain behind for quite a while. But I think, to Ayanna's point, these fields are converging in a way that, you know, I guess, during this flat period of the exponential, they've been very separate, like, there's sort of been enemies, robotics does symbolic reasoning. And if this, then that rules, and AI is trying to build learning systems and now you're actually seeing them converge.

So you know, I agree with you servo motors and physical infrastructure is going to always be a constraint. And, you know, that's going back to what we were saying about building on the shoulders of all of the cloud, you know, investments that have been made over the last decade and all the devices we have all of that distribution infrastructure enables this rapid deployment of AI. I mean, I think it's going to be the fastest proliferating technology and the history of all technologies.

Ben Thompson: Let's take another question from. Yes. Right up here.

Audience member 2: We're one of the biggest risk takers in cyber insurance. And I think one of the reasons you trust AI is because you're good people trying to do good things. There's a lot of bad people that are going to try to do bad things. And legislation and rules and regulation aren't going to prevent them from doing it.

So my question is, what is it that you would like to build in? Or what are the risks that you see that we don't see in how to deploy this not for the benefit of society?

Mustafa Suleyman: That's a great question that, you know, the reality is that the more centralized the model, the easier it is for some number of regulators to provide oversight. If it's completely open, in five to 10 years time when these models are four generations ahead. It's clearly empowering everybody good or bad.

So that the fundamental debate in the community at the moment is, you know, there's absolutely no harm I think today being caused by open source and we should be accelerating and encouraging it. It's been the backbone of software development, for as long as software has been around.

At the same time, you know, there's a very legitimate question around when that point arises. I think that's the dichotomy. Even in the centralized providers, APIs and so on, there are still going to be a lot of hard questions about how to interrogate these models to limit their capabilities to restrict them. I mean, it's almost like the kind of social media moderation question all over again, but in even in an even more abstract way because these threats are described in time series data, rather than in language and words. So.

Ben Thompson: Is there an aspect of the solution is actually going to be other AIs?

Mustafa Suleyman: I worry when people say that because I think it's the silver bullet that people always say. And, you know, I think we shouldn't be a bit too complacent on that. Of course, you know, there are going to be AI systems that help and we already have, you know, all kinds of pattern matching systems that are detecting insurance fraud, credit fraud.

Ben Thompson: To the cybersecurity point that's sort of the game as it is today.

Mustafa Suleyman: Definitely, it is part of that but I think it is realistically, it's also about experimental deployment, like you have to put things into production, expose them to the world to really identify their flaws and weaknesses.

And I think that's actually been one of the great achievements of the AI community in the last 12 months that everybody's got a chance to play with these LLMs, poke holes in them, demonstrate their weaknesses, publish papers on them, try and deploy them in certain applications that fail, like that's the correct model. And I think that we certainly can't slow down on the deployment or integration side.

Ayanna Howard: I think in cyber, there is a move, for example, because you can't eradicate, like 100%. And so there is a move towards zero trust, like assume that you're going to get hacked, assume that there's bad actors. And so how do you design your processes your frameworks to deal with that?

In AI there's really no standard of saying, OK, let's just assume the AI is bad. How do we design our interactions such that, if we assume it's bad, what do we need to do on the human side? Or what do we need to do with the hardware? What do we need to do with XYZ? There isn't an equivalent movement I haven't heard of.

Mustafa Suleyman: I think you're right, there isn't. Although if you look at it, the standard that we expect from an AI system is much higher than what we'd expect from human performance, right? So in clinical care and diagnostics, we already have models that can detect all kinds of radiology questions or predict all kinds of acute kidney injury, sepsis, from real world data at human level performance but they don't get deployed because they have a much higher standard.

Same with self driving cars, right. The AI car has to be much more safe and reliable. So in a way, we do have, you know, built in scepticism of these models but we certainly don't have zero trust, like that's for sure.

Ben Thompson: I'll take a question from online. How can we trust technology if there are no policies behind it? Or if we can't trust the policymakers? To what extent? You know, I think it is easy for Google to reach to government reach to regulation, is that the answer? Or is there a better model or a companion approach to that sort of that which is itself its own centralization of a sort?

Ayanna Howard: So I wear two hats, because I actually dabble in policy, at least with respect to AI and regulations. And I think we'll take out the trust in policy, or policymakers, like, let's take that part out because it varies by country. But I think one of the things around policies or regulations is that it allows you to be starting off with equal footing of what is the expectation. And if companies or other governments violate that there is a ramification.

Now, some companies can pay, so it was like, oh well, whatever; but there's still some concept of a ramification if you violate a policy or regulation. Right now, we have so many things going on, you have things in the EU, you have things in the US, you have things in Japan that are like, sort of combined but they're all slightly different.

In the United States, we even have states that are very irregular, California just released one versus the federal, that is a problem in terms of trusting the policy because we don't have a uniform thinking process of what does it mean, when we talk about regulations or policies and use of AI for the good of humanity.

Mustafa Suleyman: Yeah, I think that on the policy side, the curious thing about the models is that as they get larger, they get easier to control. And the policy that drives their behaviour is also expressed in words instead of in code. Right? So that's quite a significant shift compared to the history of previous software that you're trying to consider.

Ben Thompson: Should all these companies, should they be exposing their full prompts just for transparency reasons?

Mustafa Suleyman: Yeah, I mean the prompt isn't the only way to control it. Like it's actually an entire process of learning from feedback and so on. But, you know, I think there's a pretty good case for that. I think the other thing is that you could actually look at the outputs, test the outputs because you can ask a whole battery of test questions and evaluation.

So I think what I'm starting to see is rather than a formal legislative approach today, a lot of the governments are getting behind new evaluations for bias and fairness or for you know, increasing the risk of biohazards, for example, coaching somebody to create a bomb or something like that.

Ben Thompson: Which to be fair, you could look up on the internet right now.

Mustafa Suleyman: Which you can look up on the internet.

Ben Thompson: Or your public Library.

Mustafa Suleyman: Yeah, exactly. So the question is, whether it makes that easier reduces the barrier to entry in some dramatic way. But the good news is, you can actually just stress test these models. And so there's going to be a battery of like automated questions or attacks on the models, which I think will help give people a lot of reassurance.

Ben Thompson: I'll take another question from the audience. Yep, right here in the centre.

Audience member 3: So I'm interested, when you assess trust in a human individual, we genuinely use a framework that's built around capability and character. So how people do things and why they do things or doing things and doing the right things. And then we take that we put it in context. Because trust is really only useful when it's contextual. So no offense but asking people if they trust organizations, to do what, right?

I'm really interested if we have found a framework yet to assess what trustworthiness means in AI and whether there is a danger in applying a human framework of behaviour onto AI?

Mustafa Suleyman: That's a great question. I don't know that there's a definitive framework but my mental model is the following. So, I consistently trust people that are aware of their own errors and weaknesses and lack of capability. And so uncertainty estimation is actually a critical skill of these models. So let's just imagine hypothetically, that we're always going to have this hallucinations issue, they're never going to get more factually accurate.

Well, other than trying to get them to be factually accurate, the other trick is to get them to know when they don't know. And to be able to communicate a confidence interval with every generation. Not quite sure on this, I'm a bit sceptical, I really don't know. And or I can't answer, right. And you could think of an entire spectrum of things on that – you know, scale and if it was consistently accurate with respect to its own prediction of its own accuracy – that's a kind of different way of solving the hallucinations problem.

Likewise, if it says, when I can't, you know, write you that email or generate an image because I can only do X, that's increasing your trust that it knows what it doesn't know. So I think that's one framework for you know, that isn't exactly how we treat humans.

At least we don't do that explicitly, we have a kind of subconscious clock ticking, that we may not be always fully aware of, you know, did this person do what they say what they were gonna do? Were they late? Did they not pay me back? Do they always say the wrong thing? They've said five different things in the last 10 minutes? That kind of thing? Like that's our ticker, I think, for trust.

Audience member 3: Can you bake humility in?

Mustafa Suleyman: So yeah, you can bake, because obviously, the model is on a spectrum, right? It's constantly producing the most likely next token, so you can adjust its confidence interval to make it more self doubtful if you talk to PI today, for example, on sensitive topics, like a conspiracy theory, or a breaking news story.

So PI has real time information you can talk to about, what happened at Davos yesterday or what's going on in Gaza. And if it's a sensitive topic that it's not sure about or if there's a lot of conflicting information, it will often reserve judgment. And by default, we've actually made it quite cautious. If it's sensitive, or if it's conflicting and that's a safer place to be, it's not as useful to the user if we were a bit more bold. But I think in the long term is the way that we try to build trust.

Ayanna Howard: So I will just say just on last note, there were 122 definitions of trust in technology. And so it is a moving target. I think he's correct. One of the movements is it's not just about capability but it's about capability in the interaction with the human. I think that's the one definition that is coming to, I would say, have more prominence in the whole trust technology area.

Ben Thompson: There's an aspect I think of this framing just sort of this discussion in general. And it ties into a bit of the regulation question of why would you not want regulation? Or why would you be wary of it? And I think the worry is sort of a classical one, which is what sort of benefits or capabilities become foreclosed that sort of never get developed?

There's a question here, you know, it's anonymous, I can call them and says: Why should we trust technology besides profit? How does it benefit humanity?

And I think that one thing that occurs to me I'd love to hear both your sort of take on this is, is there an aspect have technologists done an insufficient job talking about why this is a really big deal and people ought to be excited. And certainly we should be aware of and address the issues. But why is this so important?

Mustafa Suleyman: I feel like all we do as technologists is talk about the benefits. And you know, the people accuse us of sort of hype and stuff.

Ben Thompson: Has there been an over-correction perhaps on social media?

Mustafa Suleyman: Maybe there is I mean, look, I'm not sure that like that's necessarily the right framing, I mean, that these models are clearly delivering provable benefit. And to the extent that it's a useful tool. The way to distribute it in the market is to generate profit. Like profit is the engine of progress that has driven so much of our civilization. It doesn't mean that it shouldn't be, you know, restrained and regulated in some way.

Ben Thompson: People are paying with their wallets.

Mustafa Suleyman: They're making that choice with their wallet and I think it is important that we start talking about which capabilities are potentially in the future going to create more risk, right? And that's where I think we need to start thinking about regulation. So for example, autonomy will clearly create more risk in the world. If an agent has the ability to take a whole series of actions, independent of human oversight, unquestionable that creates more risk.

Ben Thompson: Or the crossover to the physical world.

Mustafa Suleyman: Yeah, it interacts with the physical world, there's clearly more risk there. Right? No question. If it's a high stakes environment, like healthcare or self driving, clearly creates more risk. If the model has the ability to self improve, without human oversight or human in the loop is kind of a version of autonomy. But if it can adapt its, you know, weights, learn new information, change how it operates without human in the loop, clearly has more risk there. Like no question.

If it is inherently general, right? Clearly, there's more risk, if it's trying to be good at absolutely everything simultaneously, it's going to be more powerful. So, whereas the flipside of all those things, if it's more narrow if there's more of a human in the loop, if the stakes are lower, so there's clearly a framework where you know, regulation is going to have to intervene at some point in the next few years.

Ayanna Howard: So this is the the story of technology in general. I remember back when laptops and internet came in, it was like, "Oh, my gosh, the world is gonna get destroyed and there's going to be the haves and have nots. "And I would say the internet has really levelled out and made the world a little more equal. In some aspects.

If you look at Africa, when they went from landlines, they just over like leap over to now have cell phones and actually have connections in terms of communities. So technology, I believe, is always moving forward. I think the problem is, as technologists, we are trained to be social scientists or historians. And we traditionally, we're positive because it's our field. It's like, we're in this field because we love it.

And then someone's like, "yeah, but it's bad." No, no, no but it's perfectly fine, like it allows all of these opportunities. And I think that is one of the things that we do really bad as technologists, we don't necessarily build bridges with others that can translate what we see as the positives. And we know some of the negatives as well.

But why are we going to get rid of our own jobs, that's not going to ever happen? And so I think this is a room for improvement, is how do we, as technologists, build bridges with others understand that, understand the technology, they can also translate that in terms of the risks, as well as the opportunity to improve the space.

Ben Thompson: We have time for one final question. So we'll take it from the audience if there's any takers. The table is yours.

Audience member 4: Would love to hear your take on the future of LLMs. I mean, is the size game gonna continue? And how long? What does the future look like? If time permits, I would also love to get your take on this, the open versus closed debate that's going on right now.

Mustafa Suleyman: I think the short version is that the models are being evaluated against a threshold of human performance. And that's a fixed threshold. Like how knowledgeable I am, how creative I am, how empathetic I am. And yet, the models are pushing through this curve over time, right. So that's one trajectory bigger, happens to be better.

But as with all inventions, once we achieve a certain state of capability, there's huge pressure to fix the same performance threshold and make it much, much, much smaller. So you know, today, you can train a GPT3 level capability model, let me just get this right, at 60 times smaller, in terms of flops, than the original 175 billion parametre model, which crudely means 60 times cheaper to serve, anytime you ask a question of it, it's you're paying that much less in computation.

That's a phenomenal trajectory because you're getting performance increases from scale and efficiency gains, as they get smaller. So that's definitely going to continue, which is good for the small ecosystem and open source and startups. And obviously, it's good for the absolute peak premium deliverables as well.

Ayanna Howard: Question is, what is the next technology?

Ben Thompson: Well, I'm curious, anyone who's worked in AI or robotics for as many years as you have, by definition, the LLM is the new kid on the block.

Ayanna Howard: Clearly.

Ben Thompson: So is it sort of going to be one piece of many pieces? Or is this sort of the end state?

Ayanna Howard: No, it'll be one piece of many pieces. As you know, symbolic AI is coming as it is, believe it or not, so it'll be one piece of many I think right now. It's very efficient. It's very effective. We have to drive down the energy costs. I think it's ruining our planet a little bit right now. But at some point, we will get that but it won't achieve what we really want.

It won't achieve AGI necessarily it won't achieve XYZ. And there'll be some other new shiny thing is like oh, if we add an element n plus x or generative AI in general plus something else, it'll make us leap to the next one. It's accelebrating.

Mustafa Suleyman: I don't know, I think betting against an LLM is kind of like betting against Elon Musk like you don't really get how it's happening and you don't want to believe it but it's like, Jesus I wouldn't take the other side of that bet.

Ayanna Howard: We couldn't go to Mars at one point and we will one day go.

Ben Thompson: It's all full circle for you. So I don't know if we have increase or decrease in trust in technology. Hopefully we have increased your trust in the members of our panel here. Thank you Mustafa. Thank you Dr Howard. And thank you everyone for your questions both in the room and online.

Mustafa Suleyman: Thank you Ben. Thanks everyone.