Artificial Intelligence in 2025
Panelists discuss the key findings of the 2025 AI Index Report, including open vs. closed-source AI developments, policy investments, and the evolving race for AI dominance.
WALD: OK. Good evening, everyone. I’m Russell Wald. I am the executive director of the Stanford Institute for Human-Centered Artificial Intelligence. And it’s always wonderful to be back here at the Council. I am not on the East Coast enough, so it’s a delight to be here. And thank you to the Council on Foreign Relations for this partnership and hosting us, and to Ambassador Froman for helping facilitate this.
At the Stanford Institute for Human-Centered AI—HAI, our acronym—we publish our annual AI Index. It is a measurement of artificial intelligence and its progress, and of what is happening in the field.
So we're in our eighth year publishing this particular report, and it has grown quite significantly over the last few years. Right at the very bottom of the slide you can see the first edition, and then where we are now: the eighth edition, in 2025.
It's published by a passionate multistakeholder group that includes people from academia and industry, quite a few of whom also have prior government experience.
We have numerous collaborators on this particular report: GitHub, Accenture, McKinsey, LinkedIn. They all provide data that helps inform the report.
It's also widely covered in the press. Just this week alone, we saw press coverage from Wired to Politico to Fortune to Fox News, and those are just a fraction of the outlets.
The index has also been extensively cited worldwide in government reports and by global bodies such as the U.N., the IMF, NATO, the OECD, and the World Bank. And numerous countries regularly cite this report and see it as a tool they can use.
We also take time to brief industry, and so this is a tool utilized by industry. And they are active participants in consuming it.
So the index is nearly 500 pages, and we could not remotely cover it all here today. The plan is that I'm going to share with you some highlights of the index and some key parts that we take from it. We have chapters that range from technical performance to research and development, the economy, and education, and within each chapter there are highlights that you can access. So I encourage everyone to utilize the report. It is quite useful in many respects, and you can extrapolate a lot of data and insight from it. Its being 500 pages should not feel overwhelming, given all of these key parts to it. After we're done here, though, we will have a discussion with some of my colleagues and CFR leaders about what is in the report and about AI in general.
So let's start talking about some of those takeaways, and let's start with technical performance. We can use image generation as a way to see it. This is Midjourney, a text-to-image generation model, and going back to 2022 it was prompted to create a hyper-realistic image of Harry Potter. You see the image all the way on the left, compared to where we are today; it has improved significantly. That image on the left is one of the first ones from 2022, and it resembles a Picasso painting more than anything we would expect. But now, after just a few years of this dramatic swing upward, we have the picture on the right, where it looks like Daniel Radcliffe, the actor playing Harry Potter. Never mind the IP questions that already come from that alone.
And why do we see this? Well, we see AI performance on demanding benchmarks continuing to improve consistently. In 2023, researchers introduced new benchmarks: MMMU, GPQA, SWE-bench. I'm not going to get into every key part of this, but these all measure different capabilities of the models, whether that's reasoning, coding, or just language capability. And what we see is that just a year after these benchmarks were introduced, performance sharply increased: scores rose 18.8, 48.9, and 67.3 percentage points on MMMU, GPQA, and SWE-bench, respectively. Beyond the benchmarks, these AI systems made major strides in generating high-quality video, and in some settings language model agents even outperformed humans on programming tasks with limited time budgets. The bottom line: we continually push the benchmark boundaries, and we see saturation fairly quickly.
We also see that AI is being embedded into our daily lives. So, for example, from health care to transportation, AI is rapidly moving from the lab to daily life. In 2023, the FDA approved 223 AI-enabled medical devices, up from just six in 2015.
On roads, self-driving cars are no longer experimental. Waymo, one of the largest U.S. operators, provides over 150,000 autonomous rides each week, while Baidu’s affordable Apollo Go robotaxi fleet now serves numerous cities across China. And in fact, Waymo, a subsidiary of Alphabet, began deploying its robotaxis in Phoenix in early 2022 and expanded to San Francisco in 2024. The company has since emerged as one of the most successful players in the self-driving industry. And as of January 2025, Waymo operates in four major U.S. cities: Phoenix, San Francisco, L.A., and Austin. And data sourced from 2024 suggests that across the four cities the company provides 150,000 paid rides per week, covering over a million miles. Looking ahead, Waymo plans to test its vehicles in ten additional cities, including Las Vegas, San Diego, and Miami.
The main point of these two slides is just to reiterate that AI is entering our daily lives, whether in ways we physically see or in ways that may not become immediately apparent to us.
In 2024, U.S. private investment grew to 109.1 billion (dollars), nearly twelve times that of China at 9.3 billion (dollars) and twenty-four times that of the U.K. at 4.5 billion (dollars). Generative AI saw particularly strong momentum, attracting 33.9 billion (dollars) globally in private investment, an 18.7 percent increase from 2023.
AI business adoption is also accelerating. Seventy-eight percent of organizations reported using AI in 2024, up from 55 percent the year before. Meanwhile, a growing body of research confirms that AI boosts productivity in most cases and helps narrow skill gaps across the workforce.
And the U.S. still leads in producing top AI models, but China is closing the performance gap, and on the next slide that will be much more stark. In 2024, U.S.-based institutions produced forty notable AI models, significantly outpacing China's fifteen and Europe's three. While the U.S. maintains the lead in quantity, Chinese models have rapidly closed the quality gap, and performance differences on major benchmarks such as MMLU and HumanEval shrank from double digits in 2023 to near parity in 2024. I will note that, meanwhile, China continues to lead in AI publications and patents.
But what's most profound is this particular slide: how much China has shrunk the gap with these models compared to the U.S. This graph is telling; you can see the delta shrink substantially over the last year. And what I would say to everyone in this room is that if these conditions and trends continued at this pace and I were here next year presenting this, then most likely China would have surpassed U.S. performance by that point.
The responsible AI ecosystem also seems to be evolving, but unevenly. AI-related incidents are rising sharply, as one would expect with greater ubiquity. Yet standardized responsible AI evaluations remain rare among major industrial model developers. However, new benchmarks like HELM, the Holistic Evaluation of Language Models, along with AIR-Bench and FACTS, offer promising tools for assessing factuality and safety. Among companies, a gap persists between recognizing responsible AI risks and taking meaningful action.
In contrast, governments are showing increased urgency. In 2024, global cooperation on AI governance intensified, with organizations including the OECD, EU, U.N., and African Union releasing frameworks focused on transparency, trustworthiness, and other core AI principles.
This does not mean AI developers neglect safety testing. Many conduct evaluations. But much like the models themselves, which are mostly kept proprietary, these evaluations are often internal and not standardized, making assessment and comparison of models difficult. External evaluation also presents challenges. For example, third-party evaluators like Gryphon, Apollo Research, and METR assess only select models, and their findings cannot be widely validated by the broader AI community.
OK. Now we're going to change a little bit and talk about public opinion and how the public views this. And it's a mixed perspective. In countries like China (83 percent), Indonesia (80 percent), and Thailand (77 percent), strong majorities see AI products and services as more beneficial than harmful. In contrast, optimism remains far lower in places like Canada (40 percent), the United States (39 percent), and the Netherlands (36 percent). One thing to note, though, is that sentiment is shifting. Since 2022, optimism has grown in several previously skeptical countries, including Germany (up 10 percent), France (up 10 percent), Canada (up 8 percent), Great Britain (up 8 percent), and the United States (up 4 percent).
Another key point to take away is that, driven by increasingly capable models, inference costs have dropped substantially. Just to explain the difference between training and inference: inference is, for example, querying the system as you would with ChatGPT. For systems performing at the level of GPT-3.5, that cost dropped over 280-fold between November 2022 and October 2024. Hardware costs have declined by 30 percent annually, while energy efficiency has improved by 40 percent each year. Open-weight models are also closing the gap with closed models, reducing the performance difference from 8 percent to 1.7 percent. And that's fairly significant when we talk about some of these models, for example DeepSeek, one of the open-weight models produced in China, and how fast this open-weight area has closed the gap with the proprietary models. Together, these trends are rapidly lowering the barrier to advanced AI.
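To make the scale of that decline concrete, here is a minimal back-of-the-envelope sketch in Python. It uses only the figures quoted above, the roughly 280-fold drop over the twenty-three months from November 2022 to October 2024 and the 30 percent annual hardware-cost decline; everything beyond those numbers is simple arithmetic, not additional data from the report.

```python
# Back-of-the-envelope sketch of the inference-cost decline quoted above.
# Inputs are the figures from the talk; nothing else is assumed.

fold_drop = 280          # GPT-3.5-level inference cost fell ~280-fold
months = 23              # November 2022 to October 2024

monthly_factor = (1 / fold_drop) ** (1 / months)   # multiplicative cost change per month
annual_factor = monthly_factor ** 12               # multiplicative cost change per year

print(f"Cost retained per month: {monthly_factor:.2%}")   # ~78%, i.e. ~22% cheaper each month
print(f"Cost retained per year:  {annual_factor:.2%}")    # ~5%, i.e. roughly a 19x drop per year

# For comparison, the 30% annual hardware-cost decline quoted above compounds to
# only about a 2x drop over the same two years:
hardware_factor_2yr = (1 - 0.30) ** 2
print(f"Hardware-only cost retained over 2 years: {hardware_factor_2yr:.2%}")  # ~49%
```

The comparison suggests that hardware alone accounts for only a small share of that price decline; most of it comes from model and serving efficiency.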
OK. And a quick point now on the policy and governance side of the technology. In 2024, U.S. federal agencies introduced fifty-nine AI-related regulations, more than double the number in 2023, issued by twice as many agencies. Globally, legislative mentions of AI rose 21.3 percent across seventy-five countries, continuing a ninefold increase since 2016.
Alongside rising attention, governments are also investing at scale. So it's not just a regulatory story; there is also an investment story. On the investment side, Canada has pledged 2.4 billion (dollars), China launched a $47.5 billion semiconductor fund, France committed 109 billion euros, India pledged 1.25 billion (dollars), and Saudi Arabia's Project Transcendence represents a hundred-billion-dollar initiative.
And one key part that hits very close to home for us in academia: in 2024, 90 percent of notable AI models came from industry, up from 60 percent in 2023. And so while academia remains the top source of highly cited research, model scale continues to grow rapidly. Training compute doubles every five months, datasets every eight months, and power use increases annually.
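As a quick worked example of what those doubling times imply over a year, here is a minimal sketch using only the rates just quoted; nothing beyond that arithmetic is assumed.

```python
# Quick arithmetic on the scaling rates quoted above:
# training compute doubles every 5 months, dataset size every 8 months.

compute_doubling_months = 5
data_doubling_months = 8

compute_growth_per_year = 2 ** (12 / compute_doubling_months)  # ~5.3x per year
data_growth_per_year = 2 ** (12 / data_doubling_months)        # ~2.8x per year

print(f"Training compute grows ~{compute_growth_per_year:.1f}x per year")
print(f"Training data grows    ~{data_growth_per_year:.1f}x per year")
```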
Performance gaps are also shrinking. The score difference between the top and tenth-ranked models fell from 11.9 percent to 5.4 percent in a year, and the top two are now separated by just 0.7 percent. The frontier is increasingly competitive, and it is crowded.
We'll also note why industry has such a lead here. The AI Index estimates validate suspicions that model training costs have increased significantly in recent years. For example, in 2017 the transformer model, which is the T in GPT, the architecture underlying these models, cost under $700 to train. Fast-forward just a few years, and Google's Gemini Ultra cost nearly $200 million. That should really give us pause: we went from under $700 to almost 200 million (dollars) in training costs. So what does that mean for the future of industry and academia in this space?
We also see a tight race on performance at the frontier. Notably, you'll see how DeepSeek has risen rapidly this past year; I think many on the geopolitical side of the Council will find that interesting. But there is no true leader. What you see is a constant back-and-forth race among numerous model providers. And I think this will be a conversation we'll be having for many years to come.
So, with that, you can access the AI Index itself from the QR code here, or you can go to HAI.Stanford.edu and get a copy of the index, for those on Zoom who may not have immediate access to the QR code. It is, again, a 500-page report, but there are numerous pieces within it that will be beneficial, and it's easy for everyone to access and utilize.
And with that, we’ll turn it over to our conversation with our panelists. Thank you. (Applause.)
VAITHEESWARAN: Thank you so much, Russell. I’m Vijay Vaitheeswaran. I’m a senior editor at the Economist. And I welcome my panelists to join us to discuss this fascinating topic.
I’m delighted to have with me onstage, to my very far right, Sebastian Elbaum, technologist in residence here at the Council on Foreign Relations; Yolanda Gil, research professor of computer science and spatial sciences and principal scientist at the University of Southern California; and James Landay, co-director of the Stanford Institute for Human-Centered Artificial Intelligence as well as a professor of computer science at Stanford University. So thank you for joining us today.
Let me start, perhaps, with you, Sebastian. There are so many things that we could start with, but certainly the intensifying competition at the frontier, both among the companies and the models and between the U.S. and China. I know that will be of great interest to our members at the Council on Foreign Relations. Can you kick us off with a little discussion about the implications of China catching up as quickly as it seems to be doing? We all know the DeepSeek moment; it took trillions off of market caps in tech stocks. But going beyond that moment in time, why is this happening? What can we observe about the nature of global innovation, about competing models, closed versus open, corporate versus government-sponsored? Give us a few reflections. And I may come to my other panelists to join in on that as well.
ELBAUM: Yeah. Well, first, let me say I enjoyed reading the report. It's just refreshing to see a factual, transparent, replicated study in this area.
Regarding your question, Vijay, I think Russell illustrated really well something that we have been seeing for several months: the shrinking window of advantage that the U.S. has had over its competitors, particularly China. The trend we are observing is that the window is shrinking, particularly on the performance of these frontier models. We in academia can generate whatever benchmarks we want, and it seems like every time we come up with a benchmark, the models will rise to overcome it and saturate it fairly quickly. So we're seeing these advances from all the model makers. And I think one thing we are observing more and more is that the race is tighter, and it's definitely global now.
To me, one of the things that has concerned me is that the U.S. has had the dominance; clearly it has good velocity, but China has had more acceleration. And that's clearly shown by the graphs we have seen.
The other thing that raised questions in my mind when reading the report was: look, this is about dominance, but what is it that we mean by dominance? On one hand we're thinking of the frontier models and their performance on emerging benchmarks, and how sophisticated these models are becoming at solving really, really hard problems. And in that sense, with differences of one percentage point, in my field of programming challenges these models are pretty much neck and neck on the latest benchmarks.
But the other aspect, when you think about dominance, is the technology and its diffusion, its adoption by others. That is another type of dominance, and it's something where China has also made really good progress, particularly by making some of its models open-weight. Open-weight basically enables other people not just to be external users of the model; it enables other companies to build on top of those models. Now, open doesn't mean they're transparent and you know exactly what is in the model, but it enables people to build on it. And that means that when these models are distributed, many people can adopt them and then build on them more quickly.
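As a concrete, hypothetical illustration of what building on top of open weights looks like in practice, here is a minimal sketch using the Hugging Face transformers library; the specific model identifier is an assumption chosen for illustration, and any published open-weight model would work the same way.

```python
# Minimal sketch: using published open weights as a building block.
# Assumes the Hugging Face `transformers` library and an illustrative
# open-weight model identifier; swap in whichever open model you prefer.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "deepseek-ai/deepseek-llm-7b-chat"  # hypothetical choice for illustration

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)

# Because the weights are downloaded locally, a downstream developer can
# fine-tune them, distill them, or embed them in their own product,
# rather than only calling a closed model through an API.
inputs = tokenizer("Summarize the 2025 AI Index in one sentence.", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=50)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```

The design point the sketch illustrates is the one made above: holding the weights locally is what lets others build on the model, which is the diffusion being described.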
VAITHEESWARAN: Just to press on that a moment, Sebastian—I know you have a little more to say, but since you made such a provocative point—what is the geopolitical advantage China might gain from having its models adopted in this way? What is the reason to do that, in your judgment?
ELBAUM: Well, I think by making a model open you can accelerate diffusion. That means that other people are using it. That means that you have markets that you can access. That means that you can influence how people are going to be employing those models because they’re building on top of the frontier model that you provide. So you provide these foundational models and people build on it.
VAITHEESWARAN: Presumably, the data involved as well is a strategic advantage.
ELBAUM: Well, I'm not sure about the data. It depends on how they control it. But if you're giving them the building blocks on which they're going to build their own systems, you own that building block, and everything built on it is going to depend on it. So it is about the usage of a technology: how quickly can you distribute this technology around the world. I think that's extremely advantageous from an influence perspective, geopolitical influence.
But I think, going back to the dominance question—and I think this is key—is dominance has various dimensions. At least two of them are: Can we dominate on the frontier models? Well, right now that’s doubtful. The window is tightening. But the other thing, though, is that we may not see that window getting much smaller, because actually being the leader and pushing the technology is much harder than being a follower. So I think those lines are going to be very close.
But the other dimension is, really, dominance in the diffusion of the technology. Who is going to have their models distributed across the world, providing services, providing the inference engines that other people are going to use? I think that's a critical part of dominance that we're going to see. And I think China has made a great move by going open with these models; and, given the structure of the government that they have, they can actually determine, or have a much more direct influence on, how those models are adopted within their government.
VAITHEESWARAN: But on that latter point: in the history of innovation, it's been shown that it's not necessarily the company or even the country that invents a new technology that gains the most value, but the ones that adopt and adapt it and find value in it. The distinction between invention and innovation is the value creation. So it may well be good for the world to have that wider adoption, but we'll have to see how things roll out.
Let me ask, Yolanda, if you could follow up on that and give us your perspective on this question of geopolitical rivalry, a tightening competition, and maybe the open versus closed question.
GIL: So I’ve been doing research in AI for four decades, and for a long, long time the field looked to me like a pond. Every now and then there was rain coming up and a lot of upheaval. Sometimes the rivers came in and just brought more water into it, and then the water would go, and it was pretty much stable. But I think the technology is getting so much investment, so much attention, so much enthusiasm, so many benefits that AI is really like a surfing beach. There’s the next big wave, and we just don’t know how many feet tall it’s going to be. And so we need to—
VAITHEESWARAN: Oh, you really do live in Southern California, don’t you, Yolanda? (Laughter.) We’re all enviously looking at your life for forty years, AI.
GIL: We need to—we need to be looking for the next wave, and we need to be inventing the next wave and the 200 waves that come after that. And I think that’s what the U.S. really excels at. It’s not about containing what we have today. I think that’s one conversation. But I would like to bring to our conversation, you know, what’s happening next and what’s coming next. We don’t know what it is, so we have to invest in a lot of different ideas. And as we all know, deep learning was not a big concentration area or a big investment area for many years. So I think we need to keep looking forward to the next big thing and how do we make it happen first.
VAITHEESWARAN: Well, just tell us a little bit more about that. You've primed us to think about the future waves of technological innovation, not to get caught up in the charts we just saw a second ago. And with the experience that you have in this, what should the world be doing, what should the U.S. be doing, to keep itself primed and ready for the next waves of innovation? Does it lie with universities, for example? Or is it about the talent pool? Can you give us a couple of ideas on what you have learned over the years? Because AI has had its winters, of course, right? We've seen this story over and over again. We happen to be at a moment of tremendous enthusiasm and flow of resources. What do we need to do to sustain that innovation momentum?
GIL: If you're talking in terms of national security, we had a National Security Commission on AI that a few years ago issued a report recommending billion-dollar investments in this area.
VAITHEESWARAN: Really, that’s small change for these companies.
GIL: That is small change for these companies. But I think also we look at industry and we see these models coming out of industry, and what I see is the faces of my students, and what James sees is all of his colleagues and students who have these revolving doors with universities, and easy access to our labs and new ideas. So the role that universities play in all of that innovation sometimes is not very explicit or quantifiable. We strive in the AI Index report to point to data, and sometimes it's really hard to quantify how much of the invention in industry is really coming from education: from the preparation these students have, from the capabilities they thought about or envisioned while they were in our labs in academia. So I think we have a very healthy ecosystem between industry and academia, and we really want to keep it that way, as a healthy, collaborative ecosystem.
VAITHEESWARAN: So she’s teed you up perfectly to tell us about this—the ivory tower and how important it is. Is all well in the academic world of innovation, or are there some dark clouds?
LANDAY: Well, you know, what week is it?
So Yolanda highlighted the importance of academia. And in fact, she just foreshadowed one of my ideas for next year's report, which is to try to tie these innovations in academia directly to the commercial applications, because that's where these ideas have come from. Every single thing that you've seen actually has its roots in academic research only a few years before. And if we want to come up with the next set of big ideas, which we actually need, you're not going to just keep seeing lines that go up and to the right on the capabilities of these models, because there are fundamental limitations in these architectures. They are not going to be able to do certain things that you've already experienced yourselves when you play with them, whether it's their memory, forgetting what you told them before, or hallucinations, making mistakes. These are fundamental to the algorithms.
And if we want to continue to improve intelligent systems, we're going to need new ideas. And they're going to come from academia, but only if academia is resourced appropriately to do this. One is the computation needed. Two is the talent. If we look at who is in a lot of our academic institutions, a lot of these students are foreign-born. Many of them may be thinking it's not such a great idea to come to the United States if they're going to be snatched off the street or vilified. If you look at a country like China, you'll see that many of the graduate students at many of these top programs, as well as the next hundred programs down the line, come from there. Even back when Yolanda and I were graduate students at Carnegie Mellon, a lot of the other students came from India and China, and that's continued.
So we need to, one, make sure that we have the students who are going to be producing this work. And if we think of it in the geopolitical realm with China, that's the cheapest weapon you could ever buy: bringing the brightest minds to the United States, having them do great work here, and having them want to stay here working for us instead of taking that talent elsewhere. It's super inexpensive, and we should not throw it away so quickly. So we need to make sure we have the students.
And then we need academia to have the resources to actually work on these large models that, as you saw in these graphs, are getting very expensive. But if we really want the next ideas, you actually need to have that scale for the top researchers in computing to work on these things.
VAITHEESWARAN: So that’s good. That gives a couple of policy prescriptions that we should think about.
Another area where we have heard policy prognostications and pronouncements is the infrastructure needed for AI, that is, the datacenters, and the energy question that is often wrapped around the future of AI. How much energy will this use? How much of that will be fossil fuel versus clean? I wondered if any of you had a thought on the right way to think about this, how to dimension this challenge. How much of it is overhyped or overblown versus a legitimate challenge in scaling up? Especially taking on board the counterpoint about frugal engineering and cheaper, more cheerful ways of doing AI. Does it have to be this massive set of datacenters half the size of Manhattan? I think Mark Zuckerberg put out a tweet with a map showing half of Manhattan covered by one of his new datacenters. Is that the only path forward, or are there alternative ways that may be less resource-intensive?
ELBAUM: Let me use a fact other than Zuckerberg's tweets for this answer. When you look at the way the models are built right now, they're getting more powerful through increases in the number of parameters, in the size of the model, and in the data that they consume. I think the report shows roughly a hundredfold growth in the compute cost of training from '21 to '24. A hundred times more. And you can see the exponential growth in the size of these models.
Is that sustainable? I don't think so. We cannot keep up that exponential growth, because supporting it energy-wise doesn't work when it takes much longer to build a plant to provide the energy than that curve allows. So I don't think it's practical. What we're likely to see is not that we keep building big models and then distilling them into little ones that people can use. What we're likely to see is research labs coming up with solutions like new architectures, and new ways for these models to learn continuously instead of being trained from scratch or fine-tuned every time. So I think we're going to see research coming out that allows us to de-scale this growth of models, because it doesn't seem sustainable. And it also follows patterns of innovation, right, where you have breakthroughs that realign the growth you see right now.
VAITHEESWARAN: There is a storyline one hears about why so much money is being thrown at a handful of companies: that there's a winner-takes-all race going on, so goes the argument, to get toward a generalized sort of AGI, and one, maybe two companies will win; the rest will lose the game. And so, the story goes, you have to double down on the leaders. And this is sort of a money-spinning machine at the moment in Silicon Valley, and maybe in parts of the Middle East with sovereign funds funding these companies. What's correct in that argument, and what's wrong with it?
LANDAY: Well, first of all, you can say it’s a race, but there’s no race with a finish line where, oh, you crossed first, now you control it. So there’s going to just be continual improvement in the technology until people are similarly stuck waiting for the next big thing that is going to push us as a field forward.
So, as you saw in these graphs, there's a lot of closeness among a lot of these models, some from the biggest companies, some from a small Chinese company, right, and some from open source. So I don't think the race metaphor is actually the right metaphor, nor is winner-takes-all.
And on the energy question, even that: yes, training the models takes a lot. There are going to be improvements on this because you need them. But the real energy use is at inference time, because the more AI becomes useful in our everyday lives and all of our work, the more we're going to be using the models. And that's going to be way larger than the training costs. So that's actually where you've seen a lot of the optimization, and it will continue to be optimized because it has to be done there.
So energy costs growing because of model size is probably not where the real energy costs will be. Energy will be needed for these datacenters, but mainly because we're all going to be using AI in our work lives, our home lives, and everywhere else. But the cost per use is going to go down a lot.
GIL: Yeah—
VAITHEESWARAN: On that point, could you pick up on inference and how we use AI? In particular, I'd like to hear from all of you, starting with Yolanda. We've talked a lot about the making of AI, which is where a lot of attention is, but what about the actual use of AI and the potential it has, including to do good? There are concerns as well about how AI can be used, but fundamentally AI can be a tool for efficiency and improvement, whether in environmental modeling or science. Can you talk a little bit about some of those benefits, and whether they might be worth the price paid to make the AI, even if it's a little energy-intensive, given the potential gains we can already see happening or the direction of change?
GIL: You’re certainly making the argument. (Laughs.) But—
VAITHEESWARAN: Well, I mean, I see—look at the two Nobel Prizes that were awarded, and so I have some evidence on my side.
GIL: Yes. I think we have a history in computer science of doing a first amazing new thing and then, over time, that thing improves and we understand better how it works. We develop new algorithms and alternatives, and different ways to look at the problem and understand it theoretically, experimentally, and practically, and we reinvent whatever it is we were working on. So I expect a lot of that to happen.
And I feel a little bit that we're in the place of the very first cars that were made, right? Everybody had all of these tools in the back, and you wouldn't drive a car if you didn't have some sense of how to fix the engine. And so I feel that we're all trying to pack all the tools and imagining, oh, you're going to need all of these wrenches to continue to drive the car. Today we wouldn't even think about carrying tools in the back of our cars, because the technology is so different. And we'll get there eventually; I think there's a path to that. And I think it will affect the capabilities of the systems, but also how these AI models behave fundamentally and internally. It will affect the energy consumption. It will affect every aspect of how we build models today. So I expect a lot of big waves.
VAITHEESWARAN: Anyone else?
LANDAY: Yeah, I would just say this is not a one- or two-year transformation. This is going to go on over five to fifteen years. Things are going to change more quickly in certain industries, tech, for example, and programming, and slower in other industries that maybe have more regulation or other barriers, like education and health. But when you look back in ten or fifteen years, you're going to see how different things are. We often overestimate how fast things will happen in the short term and their impact, and we similarly underestimate the impact in the long term. So it's going to be a big transformation. And it will be in things that are good for us: our health, our education, the environment. But it won't happen by itself. We actually have to be focused on the changes we want.
So we can't say things like, oh, it's going to democratize education. It won't happen by itself. The rich and the educated will get these technologies first, unless we are focused on problems that we, as a society, feel we want solved. And we have to solve them in a human-centered way. That's why we founded an institute at Stanford six years ago. It wasn't an AI institute; it was a Human-Centered AI Institute, which means we need to actually look at the people who are affected by the technology: the users, but also the broader communities that are affected, about whom decisions are made, and then, if the technology is successful, broader society, because it can have a negative impact there. So we have to consider those three levels together when we're designing the technology, if we want it to be good.
VAITHEESWARAN: I'm glad you brought it there, because I wanted to get to this point. Of course, your center is called the Human-Centered AI Institute. But we heard the public opinion data that was presented. Attitudes are shifting around the world. They're in flux, but broadly speaking there seems to be a lot of positivity about AI these days that there might not have been some years ago. Any of you who wants to weigh in on this, help us understand: what is the human dimension? How are people receiving AI? It's no longer just watching Terminator movies; it's not something conceptual or theoretical. Many people encounter AI both in their own workplaces and on their own. We ran some data in the Economist fairly recently suggesting that even if corporate bosses are a little bit reluctant to authorize the use of AI, many employees are using AI on the side, sometimes to do work and not tell the boss that the AI did it. But whatever the reason, we're seeing an interesting experiment in the interaction of humans and AI. What are we learning about this interaction, if anybody wants to jump in?
GIL: One thing I will say is that I noticed that people at large are much more informed about AI, much more familiar and directly exposed to it than I could see three years ago, two years ago, even six months ago. So that’s a really good trend to see, that people are no longer talking about what they read in the news but about their own experiences and their own ways to connect with AI. I think that’s really remarkable.
There are some results in the AI Index, some data, showing that the majority of people believe they will keep their jobs over the next five years, even as AI arrives and introduces changes to those jobs. And to me, that is a very informed way to look at AI, right? I am no longer fearing and thinking about job displacement; I really believe I will still have a lot to contribute to the kind of function I perform. So I'm very hopeful that, as we move forward, a better understanding of AI will lead to more informed attitudes towards it.
VAITHEESWARAN: I think—go ahead, James.
LANDAY: Well, I was just going to say, the attitudes that we saw in that graph really are different depending on where in the world you are, and also culturally. So, in countries where, let's say, health care is a very scarce resource, people don't need your AI doctor to be 98 percent accurate; if it's 80 percent accurate, that's better than nothing. That's a very different attitude from bringing AI into the United States, where people would say, 80 percent? I don't want an 80 percent doctor. And attitudes towards privacy of data are very different in, let's say, East Asia than in North America and Western Europe. And so that's why you'll see much lower levels of optimism about AI in some regions of the world than in others. So these are issues that we need to think about.
VAITHEESWARAN: The cultural values that exist.
LANDAY: Yeah, cultural values. And we didn't really even talk about that with DeepSeek. These models actually embed cultural values, almost ontologies, ways of what it means to be. And so in some ways other parts of the world see Western models as almost an imperialistic act of imposing a culture. And similarly, China may see that, hey, one advantage of people using DeepSeek is that its cultural values are embedded, let alone facts. You know, the account of what happened in Tiananmen Square in early June of one particular year may be different. So these are questions that really need to be asked. And you can't even ask these questions without openness, not just open weights, but open data, to understand what went into these models.
So these attitudes are going to change as people use these tools in their work over time. But people may not be scared about losing their jobs, and yet, just like with globalization, which happened over thirty-five years, most of us didn't lose our jobs. Most of us actually gained wealth because our GDP grew. But say that to a steelworker in Pittsburgh or an autoworker in Cleveland who lost their job and their family's wealth. So there may be losers in this, even if most of us are better off. And so we, as people who are shaping public opinion and shaping government opinion, need to pay attention to how we take care of that.
VAITHEESWARAN: A quick thought before we go to the questions.
ELBAUM: Yeah. You're talking about five to fifteen years for the technology to be adopted. I think a bellwether of how things are going to turn out is how the technology is adopted by technology companies themselves, to either replace or augment their developers. That's going to be something to watch, because it's going to be the first test we have of the technology's impact on the job market and on the careers that people have or are selecting.
VAITHEESWARAN: How it hits our pocketbooks. That's a salient point on which to stop. This is the time we turn to our Q&A from members, both here in person in New York and online. Please join our conversation. A reminder to all of you, this is on the record. And I invite questions from our members here. Let's go to the very back row; I saw a quick hand there. As always, sir, please identify yourself, and make it a good, sharp question rather than a long-winded speech.
Q: Oh, well then. (Laughter.)
VAITHEESWARAN: Right. Very well. (Laughter.) I appreciate your honesty. You must be a—you must be a professor. Yes, go on. Sir.
Q: David Wagner—
VAITHEESWARAN: Sir, the microphone is behind you. Go ahead. The gentleman with the microphone.
Q: Hi. My name is Hall. I’m an AI Health Tech founder.
I have a question about talent in terms of AI. A lot of times I hear schools and academics say that a steady supply of international students is the best approach for continuing AI talent in the U.S. Granted. But I want to think about that in reverse, or flip it on its head. Is there a reasonable way to get more domestic students interested in AI? Or is that irrelevant in the future, as they're just going to be vibe coding? So I just want to get your thoughts about how to promote domestic talent in AI vis-à-vis the international—
VAITHEESWARAN: Thank you.
LANDAY: I mean, the supposed shortage of STEM majors has been a major problem in the United States for thirty years, maybe forty years. So this is not new with AI; this was already going on. Obviously we need to improve education at the earlier levels. A lot of studies show that if kids turn off to math at something like fourth or fifth grade, they're less likely to go into engineering and science. So this is a big problem that existed before AI. Will AI get more people excited to go that way? Maybe. But the jobs were already there for computing, and computing majors are already a big percentage in many places; at Stanford the largest major is computer science, with 24 percent of undergraduates. But we're a small school that doesn't produce a lot of graduates. Even with all these schools, we were still estimating gaps compared to what's needed.
China, on the other hand, is producing huge numbers relative to this. Now, it is possible that this canary in the coal mine of augmenting programmers, where we already see 20 to 50 percent improvements, will lower the need for them. But I actually think we're going to see more demand, because all of these new ideas and applications for AI are going to increase the demand for that talent. So we're still going to need it. And no one has yet solved the problem of how to get more domestic students going into science, technology, engineering, and math fields.
ELBAUM: And let me just briefly add, I think there is a differentiation. At the undergraduate level, computer science programs have grown dramatically in the last ten years. Where we lack students is really at the graduate level. The number of graduate applicants from the U.S. is very, very small, in part because most undergraduates have a job, a great job, as soon as they graduate, and they go out there. Most of the graduate students at my institution, the University of Virginia, are from other countries. And those are the students we take four or five years to prepare to earn their Ph.D. and to either start a company or lead one of these companies on the technical side. Right now we don't have a pipeline into the graduate programs; it's pretty much empty after students finish their undergraduate degrees.
VAITHEESWARAN: So it’s a tough challenge.
ELBAUM: Very tough.
GIL: Yeah. I will just add very quickly, if I may, that a type of program that has been incredibly effective at attracting students to STEM fields, and to AI in particular, is research experiences for undergraduates. When undergraduate students see the value of working on advanced topics, big questions that nobody understands yet, and see that it's possible to be part of that, they really change their careers. And we see this not just for computer science undergraduates, but for pre-med undergraduates, undergraduates in communications and public policy, undergraduates in economics, who start to imagine how they can combine their interests with this kind of technology. Unfortunately, these programs have been hurt a lot this year with, you know, the current situation. But they are very effective, and I think the U.S. does this like no one else.
LANDAY: Another angle on this, at least in the U.S., unlike Asia or Europe, is the dearth of women and racial minorities who go into computer science. That's a whole untapped set of people, and a lot of us have worked for years on trying to increase representation, both because we think it's right and because that's just a huge percentage of the population being left out as potential innovators in the field. And we know from a lot of research that diversity of teams leads to more innovation. So it's really important to our future there as well.
VAITHEESWARAN: So it’s a great potential advantage, but there’s a lot more work to do, is what I’m hearing. Let’s go to another question. Let’s go right to the front row. And we’ll move around the room and we’ll look for questions online as well. We invite you, members, online to contribute questions.
Q: Thank you very much. Joseph Gasparro, Royal Bank of Canada.
We talked earlier about cultural norms. And if you look at the internet, we have the Chinese internet and we have the Western internet. Do you think in five years it’s going to be the same thing for AI, where there’s a Western AI and then a Chinese AI? Thank you.
GIL: So I will give you data, as the AI Index does. There is a large language model trained on Korean data, so that the model natively learns Korean culture, Korean standards, and the Korean way of looking at life. And they've also developed a benchmark to test models on Korean attitudes and Korean opinion. This shows you a little bit of the desire of different regions, different cultures, different countries to develop AIs that really speak to them. And those don't have to be the same models that come from particular origins. So I think we'll see a lot more of this.
LANDAY: But even that's a research problem, because performance at the top requires a scale of data that a lot of smaller countries simply do not have. So a lot of these models are more likely to be hybrids, built on top of a model based on Western data, with additional data from that particular country or culture tuning it in a different direction.
VAITHEESWARAN: So in this one area, China, of course, has a lot of data. Has a lot of people and a lot of digitized data. So that could be a source of advantage for China.
I see a question here in the second row.
Q: Hi. I’m Kate Aitken. I lead AI strategy in the corporate engineering division at Google.
And I think this is the first conversation about AI I've been a participant in today that didn't mention agents or agentic capabilities. (Laughter.) So I was wondering, particularly for James but for any of you, what are your predictions for agents, not only in 2025 but beyond? Specifically, what kind of work do you think human-centered AI agents will do? And what will it feel like for users to interact with them?
VAITHEESWARAN: It’s the year of the AI agent. It’s being proclaimed everywhere, including here on the floor of the CFR. Help us make sense of it.
GIL: I will say that agents are not a new thing in AI. We’ve been working on those for two decades.
VAITHEESWARAN: For those who are not experts on the call, please, in a sentence or two, tell us what an AI agent is.
GIL: Yeah. So there are many kinds of AI agents, as you might imagine. But the idea is that it's an AI system that has a certain level of agency, or independence, or power to do some computing, or reasoning, or flagging, or any kind of activity that you ascribe to that agent. There's a certain level of independence in it. And basically—
VAITHEESWARAN: So you give it a task, it will go away and do it rather than having you be involved heavily. Is that the idea?
GIL: Yes, for some definition of what that task is. It could be just flagging things. It could be alerting you to something. In that case they don't do anything on their own; they're simply alerting you. But they have this agency of deciding what to alert you about or not. So they don't necessarily take external actions without your consent. There's a long history of agent technologies in AI, and I think they will be brought to bear on this conversation about agent-based language models, or agent-enhanced language models. You see a lot of the initial agent technology in the conversation now, and it brings me back twenty years to what we were thinking about at that time. But there are a lot of different aspects.
There may be many alternative agents that can do a task for you, and there can be a cost, so now you have to decide which one to use. Or, if you're using three agents, you can invest more here and less there, and the combination you choose is different. We also worry a lot about multiagent systems. If you have a lot of agents, there are a lot of failures. How do you recover from a failure? How much do you redo? Who's responsible for alerting whom? Is control centralized or decentralized, each of which has some advantages? All of those conversations, I think, will be accelerated, because there's so much AI technology in this area.
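To make that concrete, here is a minimal, hypothetical sketch of the kind of alerting agent described above: it observes events, decides on its own what is worth flagging, and surfaces alerts without taking any external action. The scoring function and threshold are illustrative assumptions, not anything from the panel.

```python
# Minimal, hypothetical sketch of an "alerting" agent: it has the agency to
# decide what to flag, but takes no external action without the user.
from dataclasses import dataclass
from typing import Callable, List

@dataclass
class Alert:
    message: str
    score: float  # how important the agent judges this observation to be

class AlertingAgent:
    def __init__(self, assess: Callable[[str], float], threshold: float = 0.8):
        self.assess = assess        # illustrative scoring function supplied by the builder
        self.threshold = threshold  # the agent's own bar for bothering the user

    def observe(self, events: List[str]) -> List[Alert]:
        alerts = []
        for event in events:
            score = self.assess(event)
            if score >= self.threshold:   # the "agency": deciding what to surface
                alerts.append(Alert(event, score))
        return alerts                     # surfaced to the user; no action taken

# Usage with a toy scoring function (an assumption for illustration):
agent = AlertingAgent(assess=lambda e: 0.9 if "error" in e else 0.1)
for alert in agent.observe(["nightly build passed", "error: disk nearly full"]):
    print(f"ALERT ({alert.score:.1f}): {alert.message}")
```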
What's really interesting to me is that language models are one technology. Computer vision and image models are another. Self-driving cars are another. Agents are another technology in AI. The world of AI is really broad, and there's a huge portfolio of AI technologies across the board that many people don't know about, and they're relevant to a lot of our future use of AI. So we'll keep seeing these new ideas, or supposedly new ideas, come out.
VAITHEESWARAN: James, you wanted to jump in?
LANDAY: Yeah. I will just say agents are kind of a broad area, going all the way from this historical use in AI as more of an architectural way of building AI systems, where you have separate parts that can do things on their own, and multiagent systems that communicate and do a better job by being good at their specialized tasks, all the way to more of the user interface of how we interact with these systems, where you go to an agent that somehow represents you and does things for you. The famous example is a vision video created by Apple called the Knowledge Navigator, thirty-five or forty years ago, where it was almost your personal assistant. It would help you do everything. It was like the greatest personal assistant, one that anticipated what you needed and what you were trying to solve.
And so for some people, agents means that, and for others it's more the architecture. These two things are actually both happening at the same time. I believe the first, the architectural kind, is what a lot of companies are doing now. But I think the second is more how we are going to interact with a lot of these intelligent systems in the future. We're going to interact with them in the multimodal ways that we do with humans, using gesture, voice, pointing, typing, and direct manipulation, all together in one system. And our agents are going to be who we're interacting with.
VAITHEESWARAN: Great. I saw a question in the back, the gentleman in the blue shirt, I think. Penultimate row.
Q: Thank you very much. Adam Wolfensohn, Encourage Capital.
Dario Amodei from Anthropic was here a few weeks ago, basically like Paul Revere, saying AGI, artificial general intelligence, is coming in the next one to ten years, society is totally unprepared, and his mission is to tell everybody about this. So my question is, A, do you buy that argument, that it's coming and we're unprepared? And what would it actually look like to be prepared if we knew it was coming? What should society be doing to be prepared for AGI? Thank you.
VAITHEESWARAN: Anybody want to reveal the secrets of the future? (Laughter.)
GIL: I would like to say that I think society was completely unprepared for the web. (Laughter.) The web happened to us. There was no planning. There was no warning. And we all learned to live with it, the good parts and the bad parts. There are a lot of bad parts to it, but there are amazing parts of it. It has enabled so many other technologies, so much information, and so many advances in every aspect of our lives. So AI is going to do the same. I'm not sure when, or how, or how we prepare for it. How did we prepare for the web? We didn't, right?
So in this case, I think what I said before stands, which is to be as informed and cognizant as we can, because whenever AGI happens you're going to see one of these benchmarks or graphs saying, OK, we pulled the switch, that's it. And that's just a number. What does that mean to you if you don't understand the basics of what's happening today with AI and you haven't experienced it? Then it will be a completely foreign number to you. So the more you can experience, connect, learn, and be in forums like this, the better. There's a lot of misinformation about AI out there, so you want to learn from sources where you really see the data and grounded opinion. That's very important.
VAITHEESWARAN: Anyone else on AGI specifically?
ELBAUM: Yeah. I think everyone's perspective is affected by their own vision, right? Someone like him is heavily influenced by their system, Sonnet, which is really good at supporting programmers. It is pretty close to being able to generate a program, test the program, and patch the program. And so it feels a little bit recursive, which has this flavor of AGI, right? I write the program, I write the test, run the program, improve the program, patch the program, and so forth. So it has that feeling, in a sense, and I can see where that's coming from.
But, you know, the question is, how do you actually project from that to AGI in every field? I cannot right now imagine that happening within the timeframe that he’s talking about. But I can see that happening within the scope of, you know, some software development within particular constraints. But even there, he’s talking about programmers. He’s not talking about how you put these components together. What are the performance effects, the conflicts? So I guess I would take it with a grain of salt right now.
LANDAY: I may be the real skeptic. I think AGI is a red herring. (Laughter.) It’s used by people to pump up their valuation and their stock price. It’s not a scientific term. It doesn’t have a precise definition that we could even tell you when we reached it. You know, common—
VAITHEESWARAN: Even AI didn’t have a precise definition. And there are people moving the frontier of it.
LANDAY: Yeah, well, that's a problem too. But a common definition is AI systems that are better than humans on most cognitive tasks. They don't even define what "most" means, you know? I would claim that, depending on how you define this, we've had AGI since the 1950s. Computers are much better at figuring out your utility bill and sending it to you than I am, as a human, right? So what it means depends on how you define these things.
And I will claim it’s just like that race. There’s no line. There’s not going to be some line crossed. Oh, that’s AGI. Everyone go home today. What we really need to pay attention to is these things will continue to increase in capability. They’ll continue to be involved in our lives, and we use them. But even today, the systems we have, with no AGI, can do harm—whether it’s deepfake porn that’s often targeted at women, whether it’s disinformation, whether it’s bias. So we already have a lot of problems we need to focus on now without getting scared about some science fiction future that may never actually occur.
GIL: If I may, I’ll tell you today’s problem. Today’s problem is that we have superintelligent AI models for medicine, for some areas of medicine. I was just learning about a model that just came out for pathology images. It can tell you which stage the cells of a glioblastoma—a cancer of the brain—are in, which I am told is virtually impossible to tell otherwise. What do you do with this superintelligence when it comes to cancer diagnosis?
How do we include that in our medical practice and in our medical workflows? Is it an accurate enough language—sorry, not language model—an accurate enough foundation model for this? And we see a lot of these very powerful models popping up in medicine, but also in science. And I think in this year’s report we already mention a number of science foundation models that are being built by governments and academia in partnership with industry, sometimes through international partnerships. Those are exciting. And those are here today. And they’re doing amazing things. And what are we doing about it? So, you know—
VAITHEESWARAN: You mean we’re not adopting them quickly enough?
GIL: We’re not adopting them quickly enough. That’s today’s problem.
VAITHEESWARAN: There’s some resistance from physicians, and such, right? There is a human dimension where there’s a reluctance to embrace some of—
GIL: But this is today’s problem, right? We have these superintelligent models—
VAITHEESWARAN: It’s not artificial intelligence, it’s human stupidity. (Laughter.)
GIL: We have these superintelligent models. And so how come you keep going to the doctor and they’ve never heard of these things, and they’re not part of their practice, right?
VAITHEESWARAN: We’ve had some patiently waiting online. Let’s go to our member who’s online with a question.
OPERATOR: We will take our next question from Dan Crippen. Mr. Crippen, are you there? Seems like we’re having technical difficulty. We’ll go to our next question, from Alan Charles Raul.
Q: Thank you. Hi. Alan Raul, practicing lawyer and also a lecturer at Harvard Law School.
My question is about regulatory policy. AI governance debates turn, as they should, on cost-benefit analysis and risk assessments. And to the extent that that involves identifying problems, as you’ve just been discussing, they should be evidence-based problems. Have we seen—have you seen, to this point, any serious safety issues involving, for example, CBRNE—chemical, biological, radiological, nuclear, explosive—or non-alignment of AI applications with humans? Have we seen those types of problems yet? And is there anybody who is watching in order to develop perspectives and data on that, such that risk assessments and cost-benefit analyses can be better informed? Thank you.
VAITHEESWARAN: Sebastian.
ELBAUM: So, I mean, I can give you one field where that’s happening, and happening a lot, which is self-driving cars, right? I mean, there is a lot of evidence accumulated by some of the companies that are building these self-driving cars in terms of the effectiveness of their vehicles and how they violate the driving rules that we follow, or don’t follow, every day. There is definitely evidence of violations by these cars that are detected by police, or incidents that they have on the roads. And a lot of those are caused by the learned components that are part of the self-driving vehicles. And these components may be, you know, a neural network that the vehicle is using to understand or interpret its sensor data, to build a representation of the world around it, just like we do with our eyes.
So I think that’s a good source of evidence. And companies, and to some extent regulatory agencies, are looking into that data to determine what the risks of these deployments are, what the benefits of deploying robotaxis are, and what the risks are compared with an average human driver in terms of their—you know, their potential to cause harm or to have an incident. So this is an area where I see a lot of data. And I believe the report definitely has data on Waymo, one of the leading companies in this space.
GIL: It’s very hard to get good data for self-driving cars. What we would like to see—
ELBAUM: Yeah. Not for the companies, right? You mean for us. The companies do have that.
GIL: The companies have it, their own data, yes.
ELBAUM: Yeah.
GIL: But we’re very interested, particularly, in data on near-miss kinds of incidents, where a human takes over the vehicle and solves the problem. We would love to know what those incidents look like, and how badly off the AI system was. But those data are not available to us. We would love to have more of this data.
I think the other point to make is that we were talking before about AI in medicine, or AI applications in medicine. The medical field is highly regulated. There’s a process for assessing the ethical deployment of new technologies. There’s a long tradition when it comes to deploying new technologies. So we may learn something from how AI transitions into the medical field.
VAITHEESWARAN: Let me go—there’s a gentleman who wanted to speak before. Sir, you do still have a question? Let’s get a microphone to you this time.
Q: Sure. David Wagner with Houlihan Lokey.
In one of the first graphs there was a limit at human level, and we’ve sort of hit that limit. How do things change? Do we go over that limit? Is it agentic? Or does it migrate somewhere else? What is it about that limit that changes things?
LANDAY: You have to understand what the task is. Each of those graphs is on some subset of things that maybe humans do. They’re not in any way representative of all the things that we do. And there are even ones in this report that show great performance by computers compared to humans given a limited amount of time, but when you give humans more time their performance is much better than these systems on harder problems. So you have to take many of these with a grain of salt: what does the benchmark really show? And there’s a lot of hype around these benchmarks, so be careful.
VAITHEESWARAN: Another question? Let’s go all the way in the back.
Q: Hi. Tyler Herman (sp), Anthropic.
I have a question that relates to foreign policy. It’s a bit more tactical than the larger questions we’ve discussed, but hopefully there’s a direct response from the panel. Cloud computing and big tech, in their first wave, pushed a lot of developed economies to set up data residency requirements, so companies, especially in regulated industries, had to keep data within the region. It’s pretty cheap to set up cloud infrastructure in many of those regions, so these requirements proliferated. However, it’s expensive to run GPU clouds, so today a lot of the models are capacity-constrained in most foreign markets. Most of the traffic runs to the United States, where energy is cheap. This means that most regulated industries—finance, education, healthcare, life sciences—actually can’t innovate in those foreign markets, because they can’t send their data to U.S. datacenters to be processed. So do you think that data residency requirements in foreign countries should be changed to encourage innovation, or not?
VAITHEESWARAN: Who wants to take it on?
GIL: I will say that I’ve done a lot of work with scientists in different disciplines. And giving up your data is very hard when you have invested tremendously in collecting it, curating it, understanding it, and improving it. And so I understand the desire to have stewardship over data that has been collected. And I’ve seen that at the individual scientist level, at the science group levels. I’ve seen that in entire nations in terms of science data that they’ve collected. So I understand the need for stewardship and keeping the data where it was collected.
VAITHEESWARAN: Could it be that—I’ve seen a proposal put forward by some that we may see regional clusters. In the Gulf, it’s emerging already. We’re seeing maybe Singapore for Southeast Asia, and so on, as a way, you know, of finding a compromise given the tension around the investment. We’re not going to have 200 countries each with their own mega centers for producing AI, as it were; there aren’t enough GPUs in the world to do that. Is there a plausible middle ground? Or do you think that won’t happen either?
GIL: Possibly. But you’re right, there are very large investments coming from industry to set up large datacenters, incredibly large datacenter capabilities, in countries all over Southeast Asia and the Middle East. Yeah, it’s possible, what you say.
VAITHEESWARAN: Let’s see if we can get one last question. There’s a lady in the very back row. If you make it a quick question, I’ll ask for a quick answer, and then we’ll have to wrap up.
Q: Sure. Hi. My name is Lauren Wagner. I work on AI policy at RAND Corporation and for ARC Prize Foundation, also with early-stage startups.
I was wondering: if you could wave a magic wand and build a new institution that could help ensure that AI goes well, what do you think is needed? Whether it’s an auditing regime, or something else?
VAITHEESWARAN: OK. This is an excellent question. Thank you. This is going to be a lightning round. Give you each one wave of your wand. Let’s start with you here, James, and we’ll head down to Sebastian to finish us off.
LANDAY: So I already think we have the pieces of that—academia and industry. Neither of them, I think, can do this right, right now. And I think we actually need to build hybrids that have some of the best of the openness of academia, but with also the talent and computation of industry. So create kind of a third way of doing this, where they’re open—open IP, open safety, all of this. Because I, again, believe open models are what’s going to—
VAITHEESWARAN: You really waved that wand, didn’t you?
LANDAY: She gave me the wand. (Laughter.)
VAITHEESWARAN: Disney-style. OK. Great. Yolanda.
GIL: I couldn’t have said it better. I’m going to add the government sector as well, because I think that government systems and AI capabilities in government are incredibly crucial. And the nuance that I would add is that when we think about industry, there’s many kinds of industry. There’s the big tech, the small tech, the medium tech. And then there’s the health industry, and other kinds of industry sectors—transportation, et cetera. And I think that the needs are very different. And they’re a lot less connected to universities. And we need to solve that.
VAITHEESWARAN: Sebastian, you get the last—
ELBAUM: More concretely, for me: I would give the AI Safety Institute, which we already have in government, the budget of what it takes to train a large language model today. So move their budget from 20 million (dollars) to 250 million (dollars) a year. And let’s coordinate this team of academics and industry with that government organization, which is already in place.
VAITHEESWARAN: OK. Well, you see a very different vision for the future than what we’ve seen in the real world thus far, at least in the United States, but you had asked. So that’s the magic wand. Thank you for provoking us.
I think you’ll all join me in thanking Sebastian, Yolanda, and James for speaking with us today, and for hearing about this fascinating report. Please give them a nice round of applause. (Applause.)
ELBAUM: Thank you.
VAITHEESWARAN: Please note that the video and the transcript of this session will be posted on CFR’s website. Thank you all and good evening.
(END)
This is an uncorrected transcript.