Behavioral Insights Into Policymaking

Tuesday, April 18, 2017
Jeffrey Liebman

Malcolm Wiener Professor of Public Policy, Harvard Kennedy School

Maya Shankar

Former Senior Policy Advisor, White House Office of Science and Technology Policy

Zachary Karabell

Head of Global Strategy, Envestnet, Inc

Experts discuss behavioral insights into policymaking.

KARABELL: So welcome to the second part of our discussion for the Robert Menschel Economics Symposium. This is more of the let’s flesh out some of what we heard from Daniel Kahneman.

You’ve heard the sort of Council on Foreign Relations dos and don’ts. This is on the record. I know there were many things that might have been said about behavioral economics off the record had it been so, but unfortunately everything we say will in fact be on the record. So that deeply constrains our conversation, but so be it.

We’re also being live streamed to whomever out there in the greater—as said, to whomever out there—(laughter)—in the greater ether, so welcome.

And we will obviously have questions at the end, which will also be on the record.

My name is Zachary Karabell, I’m head of global strategies at Envestnet which is a publicly traded financial services firm. I’ve written a bit about economics and statistics and history. And I’m also a journalist and a commentator. So I am here moderating this, although I don’t have a background in behavioral economics.

And the three panelists’ bios are in your programs as per CFR. And you can read those more about the impressive credentials, but just briefly we have Jeff Liebman who is the Malcolm Wiener professor of public policy at the Kennedy School at Harvard and served two stints in the U.S. government in the Office of Management and Budget most recently and in the White House in the winding years of the Clinton administration. To my left is Elspeth Kirkman who is senior vice president for The Behavioural Insights Team which is based in New York and does a variety of consultancy work on behavioral economics for both policymakers, cities, private foundations and was prior in London doing similar work. And joining us with our now effective technology—we had a little bit of a glitch, but it’s now working—is Maya Shankar who I think has a lot of experience on effective and defective technology having worked at the White House Office of Science and Technology and was the first head of the behavioral insights group in the White House that was mandated by President Obama. And I think we’ll hear a little bit about what the future or lack thereof is of that approach to policymaking in subsequent governments.

So we’ve talked a bit about the general arc of behavioral economics and behavioral insights into policymaking. This is a recent phenomenon within the policy community, which has gathered steam fairly quickly. And to the degree to which it’s I wouldn’t say, and obviously you all have much more experience in this, that it is deeply embedded in any developed world bureaucracy, but it is becoming part of the understanding of how to make policy both at a national level and perhaps, and Elspeth will talk about this, a little more deeply embedded at the local level, about how policy can tweak and nudge à la Richard Thaler, how it can deal with all these issues of modeling and information bias and confirmation bias and all the things that Kahneman and Tversky and their academic children and successors have also developed more fully.

The OECD recently put out, I think this year, a study of about a hundred cases of policy that has been informed by behavioral economics and behavioral insights and what the legacy of that is. It’s an interesting resource if you want to look at an assessment of this. But really, much of this has been in the past years, not even the past decade.

So, as Zhou Enlai apocryphally or actually did say to Kissinger when asked about the French Revolution, we can say about the efficacy of behavioral economics and behavioral insights in policymaking, probably too soon to tell, although I hope we don’t have to wait, you know, 200 years before we figure it out.

So first to Professor Liebman. I can call you Professor Liebman now that you’re at the Kennedy School, professor of the practice, professor of economics. You were in government in the late ’90s. You were in government more recently in the teens. And how did that landscape change in how behavioral insights and behavioral economics was actually used as an applied tool for policymaking?

LIEBMAN: So I think it was really completely different between the Clinton and the Obama administrations, not because Obama was somehow cooler or because he happened to know Richard Thaler from the University of Chicago, which he did, but simply that the advance in behavioral economics was so great in the interim.

You know, if you went back to the late ’90s, behavioral economics was mostly a collection of a set of anomalies. We had a bunch of examples of places where people behaved in ways that they weren’t supposed to behave according to our models, and it was sort of cute and interesting and curious. And what happened really between I would say the mid ’90s, late ’90s, and, say, by, you know, 2007 or ’(0)8 when I was going back into government, was behavioral economics evolved into a science where we can make predictions. In certain circumstances, we can predict how people will behave pretty accurately because we’ve seen the same things happen over and over again.

And we’d also seen a bunch of situations where people or firms had taken the insights of behavioral economics and actually produced much better results, and as it came up in the earlier session, the example of firms defaulting people into defined-contribution pension plans was the strongest one, yet by the time that we were starting work in the beginning of the Obama administration, all of the economists on that team were well-aware of lessons from behavioral economics. And I can’t think of anything important that we did, certainly in the two years I was there, where we weren’t incorporating those insights in some way, whether it was designing the Recovery Act or designing the Affordable Care Act or thinking about how to further encourage broad retirement savings for Americans. And so it was a completely different environment, but, again, not because it was something different about us, it was that the science was not ready to be used by policymakers.

KARABELL: So, if you had used some of these tools in the late ’90s, do you think there would have been different policy outcomes, better policy outcomes?

LIEBMAN: Let me give you one concrete example. Late in the Clinton years, the budget surpluses emerged for the first time, and that was a remarkable event. And we were trying to think about how one could use them, and one of the problems we were trying to work on was retirement income security and the problem that something like a third of Americans when they get to their late 50s have essentially no retirement savings.

And the policy remedy we came up with in I guess it was either ’98 or ’99 was that we should set up a way to match the savings of low-income taxpayers so that they would have the same kind of incentives to save that higher-income taxpayers who work in firms that have very generous 401(k) plans have. And later research, including a randomized experiment that I was part of the team that conducted a few years after that, but that policy, by the way, never got passed, but that was the solution we were working on. Later research suggested that maybe if you worked really hard and really matched savings for low-income people, maybe you could raise their saving rate from 4 or 5 percent to 15 percent, the number of people making savings in a given year. And so, in that era, that was the policy response.

By the time we got to the Obama administration, we knew that defaults could get people, up to 70 or 80 percent of the people, saving. And this idea of matching people’s savings, which is, by the way, expensive from a budgetary standpoint, you know, you have to raise more revenue to go out and do the subsidies, so, you know, was, you know, was just it was clear there was a much better option and, by the way, a cheaper option that would probably have four times the impact. And so you can just see that completely different policy options were before us because of the learning that had gone on during the decade in between.

KARABELL: So I want to turn to Maya now. By the way, with all the discussion of artificial intelligence and ghost in the machine, I know you’re there and I know you’re a real person, but I feel like this is what it’s going to be like if we’re actually talking to an artificial intelligence interview on a panel years from now.

So you joined the White House, you’re in the Office of Science and Technology Policy, and then I guess 2015, the use of behavioral insights and behavioral economics becomes sort of codified as this will now be part of the policymaking process. Maybe talk about how that came about. Or, you know, I know that there certainly was the Cass Sunstein initially in the White House and was trying to apply some of the nudge ideas in practice, but maybe talk about how that became ever more a part of the policymaking process.

SHANKAR: Yeah, absolutely. So I joined the Obama White House at the beginning of the second term. And by that point, Cass Sunstein had served as administrator of the Office of Information and Regulatory Affairs for several years. And he really brought a unique lens to that particular role by looking for applications of behavioral science to policy.

So I had seen already a strong precedent for these insights being successfully applied to public policy, and there were also a number of government agencies, like the Department of Labor, the Department of Health and Human Services, the Department of Education, who had all also successfully applied behavioral science—(audio break)—lunches or trying to think about how to devise student loan—(audio break).

And so I think, Michael, why I joined was to try to institutionalize this work so that we were applying behavioral science to policy in a systematic way that involved rigorous experimentation in ways where we could quantify the impact of our applications, figure out what was working, what wasn’t, what was working best. And so I made it my goal to create a team really modeled off of the U.K.’s Behavioural Insights Team because I had seen it obviously be quite successful in that institutional form.

And so we pulled together a team. And in our first year, I wouldn’t say we necessarily had a sunset clause like, you know, the U.K. Behavioural Insights Team, but I would say we were certainly hanging by a thread when it came to having an identity within the federal government. So we really had to prove that this stuff is valuable.

So, in the first year, we developed about 12 pilots with government agencies. Each of them involved a randomized control trial so that we could quantify impact. And I think that those wins helped solidify the importance of this work as something that, you know, leadership in the White House should take very seriously and should try to institutionalize.

So, by the time 2015 came around, armed with these successes, President Obama signed an executive order that not only institutionalized our team, but also issued a broader directive to federal agencies to use behavioral science and rigorous evaluation methods as a matter of course, as a matter of routine practice within all of their operations.

KARABELL: So you talk about a bunch of test cases and successes. Maybe what are a couple of those specifically?

SHANKAR: Yeah. So one example was we were trying to get veterans to take advantage of employment and educational benefits that they had earned through their years of service abroad. And typically, the VA relied on word-of-mouth practice or other methods of communication, but there was very low take-up.

And so what we did is they had an outbound email message that was about to go out to veterans saying that these veterans were eligible for the benefit. And we just changed one word in the email. We said you’ve actually earned the benefit, you’ve earned the benefit because you’ve been in service for many, many years and we’d like to basically give you these benefits as a way of acknowledging your service. And we found that that one word change led to a 9 percent increase in access to the benefit. So that was one of the quicker wins.

We also in the area of government efficiency tried to get more vendors to honestly report their sales to the federal government because they had to pay a fraction of that amount in fees, in administrative fees. And so, because typically this relied on self-report, we were finding that vendors were often underreporting their sales. So we changed the form to include an honesty prompt at the top of the form where vendors had to acknowledge up front that what they were about to fill out was truthful and comprehensive. And we know from behavioral economics research that if you require people to sign it at the bottom of the form it’s too late because they’ve basically already lied and they’re not going to go back and change the values. (Laughter.) But if you had them sign up front before they filled out all the information, they’re sort of primed for this honest mind-set.

And just introducing the signature box at the top of the form led to a $1.6 million increase in collected fees in just a three-month period. So, if we scaled it up, which we did, and the effect persisted, that would be about $10 million of revenue for the federal government just because of a signature box at the top of an electronic form.

And then finally I’ll just give one other example because Jeff mentioned retirement security. Prior to 2018 when a new legislation will kick in that automatically enrolls new military recruits, military service members have not been automatically enrolled into retirement savings plans in the way that civilian members have been. And so we introduced an active choice at military bases. So when service members were coming to orientation, they were already filling out paperwork, already doing drug and alcohol abuse counseling, they had a bunch of things to sign, we basically slipped in a form around the Thrift Savings Plan, which is the federal government’s workplace savings plan, so we had a quick lecture that happened. And as part of orientation requirements, the service members were required to select yes or no, saving for retirement. And we found that this active choice prompt led to a roughly 9 percentage point increase in sign-up rates.

And so that was another example of a more—you know, sometimes with—the ideal proposal, in this case it’s automatic enrollment, isn’t going to be possible for some time, so behavioral economics provides them interim tools that we can use that preserve freedom of choice, but are also aggressive enough to have an impact. And that’s what we leveraged.

KARABELL: Fascinating. And in order to do this session, I had to sign a form for CFR, a release form saying that it was OK to be on the record. The signature was underneath at the bottom of the page, but it was before the session, so I don’t know which of those takes precedence. (Laughter.)

SHANKAR: Well, I signed a form saying I wasn’t an AI right before then. I guess I’ve revealed that now.

KARABELL: That’s good to know. Although if you were, you’d probably, presumably, be able to fake it. (Laughter.)

So, Elspeth, you were nicely teed up before, the work of The Behavioural Insights Team, as being a sort of a leader in this space. How did you get into that? Why did the British government initially underwrite this? And is there a cultural difference in how behavioral economics is applied in policymaking? Or are these tools really the same and it doesn’t matter whether it’s Westminster or the White House or France, it’s the same set of tools, just different particular problems?

KIRKMAN: Sure. So, on the first part about how we kind of came to be, it was very much a kind of flagship idea and a flagship team within the coalition government that was brought in in 2010, and I think similar to Maya’s experience. I should tell everyone as well that Maya is on a screen, weirdly kind of directly in front of me, so I’m kind of talking to her. (Laughter.)

So, similar to Maya’s experience, I think everybody thought that we were a little bit quirky, and that’s maybe a generous way of putting it. And we had the sunset clause, which was that we had to basically recoup a return on investment greater than the cost of the team in order to continue to survive. And maybe a sunset clause is just a smart way of hiding the fact that you’re hanging by a thread of credibility before you get good results.

So it was a time of austerity and there were all of these kind of measures around, you know, no increases to headcount, freeze on government spending, all of this kind of stuff. And so the very opportunistic, simple thing that we were able to do was to apply this stuff almost immediately to raising revenues for government. So a lot of our kind of flagship initial work was with the Revenue and Customs Agency. And we’ve got some really simple now, pretty kind of tried-and-tested, well-rehearsed examples of things that we’ve changed, simple lines, for example, on tax letters to collect delinquent payments, telling people nine out of 10 people in the U.K. pay their tax on time. Just that simple line really kind of makes a very big difference in terms of how much people pay.

And the small tweaks like that that we’ve made to tax letters, again, evaluated through experimentation, randomized evaluations, over the course of one fiscal year they added up to a couple of hundred million pounds in additional revenue brought forward, which kind of grabs people’s attention in a time of austerity and makes them think, OK, maybe I thought you were, you know, slightly quirky, bizarre, a little joke outfit, but maybe I’d like a piece of that for my policy area as well.

And then actually, that point on how this translates and whether there are kind of differences, I think for me the main differences are about how you kind of frame this and position it within the agenda of a particular administration or a particular kind of set of government services and departments. But in terms of the insights themselves and this idea of, you know, humans being wired to make these sort of very predictable kind of shortcuts in the way that we make decisions, a lot of that translates very well. So the idea of social norms, telling people nine out of 10 people pay their tax on time, the reason that works is that we’re all wired to think, OK, I can’t make every single little micro-decision I’m faced with every day, so a really good kind of substitute for me making the decision is to just look at what other people are doing and just follow that. And we like to think that only kind of 13-year-olds use that logic, but actually all of us adults do that.

And there are situations in which that works and which it doesn’t, but we see pretty consistently that in the situations where the gap between people’s expectation of what other people are doing and whether or not they’re conforming, for example in paying tax, and the actual behavior of other people, when there’s a gap and you tell people about that gap, they’re very likely to start complying, whether they’re, you know, Singapore or Guatemala or the U.K. or the U.S. or all these other places that this has now been tried and tested.

KARABELL: Sort of a broad question I think for all of you to address, which stems a little out of a point that Jeff had made. There’s the quantifiable, right, that we need to prove that these tools as applied to policy either save outlays of government money or generate more revenues in terms of what is taken in, or, as Maya talked about, being able to actually distribute money that’s been allocated or programs that have been put in place, but aren’t being well-utilized. What about the nonquantifiable? I mean, is there an issue? And if you go too far in the direction of having to prove the efficacy of this by numbers that maybe you don’t get to, whether it’s policy as shaping animal spirits, right, people being confident, which you kind of know by behavior, but you don’t necessarily know by numbers, or just positive social outcomes, or is that asking too much right now?

I mean, so maybe each of you could talk a bit about that. Is it asking too much to say, well, this is going to improve policy and social outcomes as opposed to we can prove by the numbers?

LIEBMAN: I think we’re seeing the insights from behavioral science and behavioral economics being used in both contexts. We’re seeing very specific A/B testing of different wording and seeing big impacts from that. But I really think we’re seeing big policy decisions being informed by these insights as well.

And just to give you another example, when we were designing the Affordable Care Act—sorry, I was going to do a different—when we were doing the Recovery Act, we were trying to figure out how to get as much money into people’s hands and get them to actually spend it rather than save it, as possible. And so one of the components was called the Make Work Pay Tax Credit. And the question was, how could we design a tax credit that would maximize the impact on aggregate demand, the amount of spending that people would do out of it? And so we did two things that were informed by behavioral science.

The first is we decided to make it a rebate against people’s payroll tax, so it was people getting their own money back again as opposed to some big bonus. Because we thought if people got told they got some big bonus, they might think this is a weird one-off thing, and in their mental accounting they might say, well, I’m going to save that for a rainy day or something. But we wanted it to feel like they were getting their own money back. So that was the first thing we did.

And then the second thing is we decided to have them get the money by adjusting the withholding tables in the tax schedule so that it would just show up in their weekly paychecks or their monthly paychecks and they wouldn’t even notice that they got it. And because so many people simply spend everything that comes in in their paychecks every month, that would maximize the extent that it was spent rather than saved. And so we did that, and we think that was, you know, was the right way to get it, to maximize the fraction of that tax cut that was actually spent.

Now, most people think that we committed political malpractice. Because it was hidden in the adjustment of the withholding table, the president didn’t get credit for sending checks to everyone. And so while I think we did the right thing on the economic side, many people think that as a, you know, a matter of giving the president credit for rescuing the economy, this was exactly the backwards way to do it. But I’m still, you know, proud that we did what was good for the economy on that one.

KARABELL: And, Elspeth, then we’ll go to Maya, what are your—

KIRKMAN: So I’m just enjoying political malpractice. (Laughter.)

So on the kind of—to the point about the sort of unmeasurables or the difficult things to kind of quantify, part of our kind of reason for being is to sort of fit into the existing policy environment and accomplish outcomes and, you know, measure the things that were already being measured and that already count in certain ways, so I think sometimes we have to be really smart about how we do that and find good proxies, for example, particularly if we’re looking at policies that may have sort of long-term effects.

So, for example, we might do some evaluation of something like body-worn cameras in policing, try and look at what the kind of long-term outcomes are in terms of, you know, whether you get kind of better, more fair policing, whether people’s relationship with the police becomes better, whether you get higher social trust. All those things take a long time to ripple through. But it would be very easy to neglect looking at other things that might ostensibly be influenced by wearing a camera, so we might look at things like police absenteeism or, you know, well-being scores on police staff engagement scores, which are clearly very predictive of whether they’re going to burn out, whether they’re going to kind of, you know, have all sorts of other issues that end up costing the public purse quite a lot of money.

So I think there’s the fact that we need to be kind of, you know, clearly we do need to measure this stuff, we can measure most of this stuff if we’re smart about it and also the way that we approach things. And I think what’s kind of, in some ways, quite prosaic and, in some ways, quite kind of the opposite in terms of this work is that we’re not grappling with a huge kind of top-level issue, we’re not trying to sort of wrap our arms around things like how do we reduce unemployment all in one go. We’re trying to really chip away and break down that problem and say, OK, we can’t kind of change the sort of, you know, the direction of the entire economy, but maybe we could look at whether a small kind of inefficiencies in terms of whether people looking for work are choosing to go to the recruitment events that are most likely to land them a job. So we’ve done a lot of work on this.

For example, in the U.K. we’re simply changing the language around a government-sent text message to job seekers, telling them that they could go to a recruitment event and where they were actually very likely to get a job. Just changing the language, making it more personal and adding the line “I’ve booked you a place, good luck” from their job adviser got people from a 10 percent show-up rate to a 26 percent show-up rate. And that’s this tiny little tweak on something that you would think, yeah, of course, you can’t kind of muck around at the bottom level and change employment outcomes, but when people are actually landing in jobs as a result of it, it turns out that you very much can.

KARABELL: And, Maya, any thoughts on this sort of mix between what can be measured?

SHANKAR: Yeah. To add to those very good perspectives, I think there are also instances where it’s simply not appropriate to be measuring or testing outcomes.

So one example of this, which is a problem my teammates and I worked to tackle in our final year, was the water crisis in Flint, Michigan. And in a case of total emergency like that, you’re not going to be A/B testing messages on the ground, right? You’re going to be leveraging your best understanding of human behavior to design pamphlets that articulate information about water quality and water safety in the most effective ways possible, et cetera, et cetera. So I think, one, there are instances where it’s inappropriate to be measuring because you’re just trying to roll out the most effective messages to everyone.

And then on top of that, there are also behavioral elements that are just challenging to measure generally speaking. So, in the case of Flint, we’re trying to repair broken trust between citizens and their government because the government had lied to them about the quality of their water, and there had been elements of deception. And so in that sort of instance, you know, we’re working on both sides, both with residents of Flint and then also with government officials to try to figure out how we might repair some of these fissures.

And I think in those cases, it’s really a long-term process where you’re not going to repair a trust overnight. It’s very hard to measure increases in trust. Any self-report is notoriously a poor indicator of how people are feeling. And so, in a case like that, we’re trying to leverage trust-building tools with the understanding that this is going to take, in some instances, you know, decades to try to fix, in the best-case scenario.

KARABELL: You know, it’s interesting about where it’s appropriate or not. I mean, this is also an issue of economic policymaking separate from the behavioral. So with the American Recovery Act when President Obama stood up in February of 2009 and said this $787 billion bill will create or save 2.5 million jobs, the problem with that ultimately was by trying to create an absolutely formulaic connection between necessary emergency spending with quantifiable future outcomes, the spending was clearly necessary, the fact that you had to then give a number created a liability, right? Because maybe it saved, maybe it didn’t, but in the time frame that was promised those jobs didn’t materialize in quite the way that was promised, as opposed to being able to say we need to spend a lot of money, things are really bad, and it’s going to take aggressive effort. So there’s always that issue of not only the appropriateness of the quantification, but the liability of quantifying something that maybe shouldn’t be, given that the action was necessary.

One last question, then we’ll turn it over to all of you and your questions—I think we’re going to go to about 3:20 because we started a few minutes late—and that’s the question of how embedded are these rather new approaches to policy, which, in many ways, are a progressive—and I don’t mean that in a political sense; small-C (sic) progressive—very similar to, you know, the creation of statistics as part of government policymaking during the 1930s? How embedded are they within various bureaucracies, such that different political parties and different political fortunes can or cannot stop the clock or interrupt this?

And, you know, there’s been certainly a rise in more nationalist governments, governments that are more—I mean, it may be unfair to say ideologically driven given that most governments are ideologically driven, but may not want the difficulty of constraining action by quantifiable measures. I mean, are these things that are going to survive current administrations in the U.K., in the United States, potentially in France, depending on what happens? Are they deeply embedded enough? Or is it really too new that we could be talking in a few years about, or, wouldn’t it be nice if this had been part of our policymaking? So, in no particular order, your thoughts about that. Are you—

LIEBMAN: I’ll go first. I’m—

KIRKMAN: I want to—(inaudible). (Laughter.)

LIEBMAN: All right. (Laughs.) You’re not. No, I’ll go—I’ll go first. And I’ll say a few things.

And so, in some separate work, Elspeth and I are partners in crime in trying to help state and local governments use data and evidence more effectively. And I would say I mostly can’t tell the difference between working with red governments or blue governments. Everyone’s trying to make government more effective and use data and evidence. And, you know, people want to know whether we’re spending money well. And that’s just as—I don’t find there to be a strong partisan difference in that.

And so certainly at the state and local level, I don’t think that there’s an issue. I think the movement toward using data, toward doing experimental evaluation and other evaluations, I think—I think that’s just—that’s just building. And we’re seeing so many—a new generation of public servants who, you know, are young, have grown up in an era where they have been—where they are more technologically savvy.

And actually, let me just give you a story. I was moderating a panel in the city of Boston about a new effort that they did so that when a firefighter is going to a fire, often the city knows something about that building because the thing—the hazard that actually caused the fire to happen, they had—they got a building permit for. But the data system that had the building permit information, and it wasn’t available to the firefighters on the way to go into the fire, and so they would show up, and they’d be blind, and they would find some hazard that they could’ve known about. So they fixed that problem. They got the data systems talking to each other so that now when the firefighters are going to the fire, they see in an iPad everything the city knows about that property before they get there.

So I was moderating this panel, and there was a city official, and I said, you know, how did you procure that technology, you know, the thing that made the iPad show up with the map and the—hit the—hit the—hit the property, and suddenly you know all that stuff? And I see someone waving from the audience like I’m—I couldn’t figure out what was going to—like, making—gesturing. And what turned out to be the answer is just some—you know, some young city employee’s Google Maps. They weren’t even a computer science major, just some—you know, some B.A. in some normal subject. And they were able to quickly do it and set up the technology. And so, I mean, this is—this is sweeping through, you know, government in the same way, probably not quite as fast, that it’s been sweeping through business. And so I think the big picture is these kinds of practices are just—we’re just going to see them expand.

And I’ll also say just in the curriculum at the Kennedy School, you know, the core curriculum has behavioral economics in it. That wouldn’t have been true—you know, wouldn’t have been true 10 years ago.

KARABELL: You masterfully avoided talking about the federal government, so I’m going to ask Maya to opine on that. (Laughter.) That was extraordinarily diplomatic. Well done.

SHANKAR: Well, now that I’ve left, maybe I can give a more candid—(chuckles)—answer.

You know, I think we were mindful very early on that no matter how much appeal behavioral science should have for both Republican, Democrat or independent audiences, it would necessarily get attached to the fact that this was a Democratic president. And we wanted to try and make this as bipartisan an effort as possible because I think we’ve seen tremendous enthusiasm from both sides of the aisle.

But I think with that perspective in mind, and especially my keen interest on seeing this initiative persist into future administrations, we made the conscious choice to not create the team of behavioral scientists within the White House. Instead, we had a more diffuse model where we baked behavioral scientists into government agencies across the federal government with a dedicated team within one government agency.

And I think that that has helped sort of root this in a more stable place. So some of these agencies are less susceptible to leadership changes in terms of the type of mission they have or the types of people they’re able to hire. And hopefully that helps. I mean, in some sense, we wanted to insulate it from the particular party that was in power because we felt that these techniques were just generally good for government, right? It led to more effectiveness and more efficiency irrespective of what the policy goals are.

I think at the time I could never have predicted just how significant a change would occur between President Obama and President Trump in terms of ideals and, again, the—I mean, there doesn’t even seem to be a science and technology office right now that exists within the White House, period. And so we definitely have some gratitude that we worked to change the minds of career civil servants who had been in the government for in some cases 30-plus years, who are continuing to work, and who have been trained in these tools and techniques.

KARABELL: Yeah. I mean, certainly, I guess part of the goal would be, like, various official statistical agencies, which at least until present are accepted as nonpartisan necessary features of most OECD governments, you’d kind of want that for behavioral economics. Do you see that happening let’s say in the U.K. or throughout other OECD countries?

KIRKMAN: Yeah, I think so. And another point that I would kind of—always kind of like to return to on this is that I think it’s very easy to lift the behavioral economics conversation into this sort of level of whether it fits with various different political ideologies and these kinds of things. But frankly, when we kind of embed it in departments, within organizations, a lot of the time, really, you’re not talking about things that need to be kind of dissected and discussed on a kind of ideological level. You’re talking about, are we all right with changing this form and then testing whether it works better or—

You know, and if you sometimes take a little bit of the kind of, you know, exciting, shiny kind of headline-grabbing stuff out of it, really—there’s a really great story that a professor from the Rotman School in Canada talks about which I just—I just love where he was trying to work with a government and get them to change this form, and there were all of these kind of these blockers and barriers—gave up on the project, rang someone up six months later and said, oh, did you ever change that form? They said, yeah, we just changed the—changed the form. He’s, like, wow, did you get all to sign off? And they said, no, no, the printer broke, and it turned out that it printed these specific dimensions, so we had to change the paper. And when we changed the paper, we just changed the form.

And it’s, like, you know, talking about it in terms of changing the paper and the form, totally noncontroversial; talking about it when you’re kind of discussing, oh, you know, ideologically, are we OK with behavioral science? So I think there’s also something pragmatic about, you know, if we’re not sort of egotistical about this and we just say, actually, we’re just trying to find out what works and, you know, we’re just trying to kind of make these small tweaks to the way that we’re delivering services, and over time they add up, then I think we can kind of, you know, walk back from that ledge a little bit.

KARABELL: Well, from the sublime to the mundane.

We have some time for questions, of course. Please identify yourself and—sir.

And I guess wait for the microphones which are converging on you.

Q: Jeff Shafer, JRShafer Insight.

I have a question that goes off in a little different area about how governments behave rather than the people governments deal with, but I’m hoping you can shed some light on it anyway.

The most astounding thing that I have read out of the behavioral decision-making literature is that committees make lousier decisions than the people who make them up if they make their decisions in isolation. And I’m wondering, is that still kind of an accepted view within the—in the field? And if so, what kind of thinking is going on about how people can make collective decisions more effectively than they can in this groupthink environment of a normal committee?

KIRKMAN: Yeah, we’re weaker than the sum of our parts I think is the kind of summation of that.

Like all behavioral stuff, this really depends on contexts and on the way that you’re kind of managing things. I’ll tell you one very straightforward thing that we did to try and mitigate against this internally is we use a method called thinkgroup, which is, you know, an inversion of groupthink. And the idea is that when we’re trying to make a decision by committee, whether that decision is on, you know, maybe what direction we should go in in terms of designing a new policy recommendation, whether it surrounds performance appraisals and promotions and those kinds of things, you have a kind of shared document that everybody logs into. Everybody is incognito, so you don’t know, you know, if it’s the boss or if it’s the intern that’s kind of contributing an idea. And you just engage and you spend a silent half-hour kind of laying down some base kind of ideas that way.

And the benefits of this are that you get many, many more ideas than if the highest-paid person in the room just opens their mouth and everyone kind of anchors to that. And you also get ideas that are then assessed based on merit and not influenced by other people, and people maybe say things that they either wouldn’t be comfortable saying or that they just get shouted down on because they’re more introverted or, you know, they’re just not feeling like kind of taking someone on that day or whatever it might be. So there’s a bit of a kind of a day-to-day example.

But I think we should absolutely guard against this. And we should be very thoughtful about the way that we make up groups making decisions because there’s also research that shows that who is in that committee clearly matters, and particularly when we’re thinking about homogenous groups and more diverse groups but also when we kind of diffuse those group dynamics.

KARABELL: Sir. And then we’ll go back there.

Q: Thank you. I’m Alex Jones. I’m associated with the DailyChatter, which is an international news daily digital email newsletter.

The former prime minister of Britain called for a referendum on Brexit with the expectation that it would fail. The new prime minister has now called for early elections with the stated purpose of bolstering the support for Brexit. Where do you infer this behavioral psychological economic policy was baked into this decision to call early elections?

KIRKMAN: I would love to answer this question, but I couldn’t possibly comment. It’s interesting and we’ll see how it plays out. (Laughter.)

LIEBMAN: But let me—let me—I’ll try to save you. I think—I have seen many examples where elected officials have exhibited the exact same behavior we run into when people, you know, are so present-biased that they don’t save, but, you know, where they worry about how to get through the next three weeks’ problem and by doing so create a much bigger problem 18 months later.

And, you know, I think, for example, you know, there was one point where in order to get votes on a particular budget resolution, we committed to creating what became the Simpson-Bowles Commission, which then when it came out with its recommendations created a bigger question about how to respond to that. And, you know, it successfully got us the votes needed early, but then it created an even bigger problem 18 months later when the president chose not to—well, had to engage with those recommendations. So I see that kind of behavior just over and over again in government where, you know, you do something to get through this week, and you—boy, do you have a bigger problem a year later because of it.

KARABELL: It’s an interesting—I mean, we will see, right, whether or not you get the inverse reaction to the same expectation. But this one looks a little less uncertain, although I’m sure we can be sitting here in three months and have a completely different reaction.


Q: Angel Liswo (ph) from JPMorgan.

Guilt can be a very powerful behavioral and psychological agent. And I read some studies pointing out that a simple text message to public school parents can lead to much higher levels of engagement, parent engagement and high academic achievement and scores by the children. How is it that initiatives like that, which look like a no-brainer—it’s not red or blue and doesn’t cost much as opposed to several other initiatives that cost several millions of dollars—are not being—(off mic)?

KARABELL: Maya, you want to take a crack at why that doesn’t easily happen within government?

SHANKAR: Well, I think there’s just inertia effects. So it’s hard to deviate from the status quo when my government agency colleagues are not rewarded for taking risks and are basically just trying to do the job that they’re being asked to do. So sometimes introducing these new initiatives, as obvious as they may seem or as effortless as they may seem, ends up being more complex from a bureaucratic perspective and also because we might not have the apparatus for actually doing that thing.

So for—in the example of text messages, I was working with my colleagues at the Department of Education for years trying to figure out if we could text FAFSA filers and try to get them engaging in certain behaviors, and it turns out that the government texting people is a very complex thing to do, right? Who knew, right? But we had to figure that out along the way.

So, I mean, I think it’s a good thing that a lot of these interventions do seem like common sense retroactively. I think it helps to build public buy-in for some of these things. So, for example, as Elspeth was saying, you know, when I first joined, a lot of pushback from the conservative media—I was called, like, the nation’s control freak, the know-it-all 27-year-old at the time, thinks she knows how to run people’s lives—which I do actually think. No, I’m just kidding. (Laughter.)

But I think that one thing that kind of assuaged people’s fears is that when they actually see it done, both the things that worked and the things that didn’t work, they all were so obviously benign, right? So these interventions involved sending text messages to low-income students trying to get them to, you know, matriculate in college, trying to get, like I said earlier, service members to save for retirement, trying to get people who had started the health care enrollment process to actually finish the health care enrollment process. And so I think the fact that, like the gentleman noted, when you actually hear about how effective some of these very light-touch, low-cost interventions are, it’s very easy to get buy-in from the public.

So, to summarize, I would say it’s actually just a lot harder to move a bureaucracy towards obvious things. But when you do in fact do those things, it does breed a lot of trust in the public. So you kind of have to always have the long game in mind in terms of your ambitions.

KARABELL: So it’s funny, I mean, sort of back to that initial question about the government part of this equation or the bureaucracy part of the equation, a lot of the work has been focused on how can we deliver policy outcomes more effectively from government to publics. Has there been any work done on how we can more effectively allow or help nudge bureaucracies to adopt change for these outcomes?

KIRKMAN: Yes. (Laughter.) I think on the kind of hard-to-measure things, I think this is—this is one of them just because once you kind of start to infuse the spirit of this stuff and start to get people to do these things within departments, within administrations, it kind of catches on like wildfire, and you very quickly get lots of—some spillover. So it’s hard to measure how much you’ve kind of impacted it.

But we have—you know, we see often as a legacy of the work that we do, whether it’s in central government or whether it’s in, you know, municipal local-level government, that when you leave, the work doesn’t stop, and it kind of continues, and it does become commonplace. And some of it is around getting people to try safe experiments, not experiments in the—you know, in the actual sort of scientific sense of the word.

But a good example is, on that text message point, from the outside, that seems super straightforward and if you assumed that a school did have all the texting mechanisms and those kinds of things. Some of the qualitative research that we do behind studies like that shows you that, you know, if you’re actually a teacher that’s kind of the front line of this stuff and you’re having to live and breathe, you know, angry parents kind of coming in and saying, why are you texting me, you know, telling me how much my kid’s been absent relative to other people, or why are you telling me I need to talk to my kid about king penguins, or whatever we’re kind of texting them to try and get them to engage in their child’s education a little bit more, it is kind of—it does feel like a big risk to them. And all they can think of is the scenario where they annoy one parent or they cause a parent of a really disadvantaged child to say, you know what, I’ll just go to a different school and increase the amount of turnover that that kid experiences. And that’s what success is measured on. And once you get them to kind of do a small thing like that, then maybe next time it becomes a little bit easier, and they can overcome the inertia because their counterfactual is no longer, oh, God, everything is going to go swimmingly if I don’t do this and horribly wrong if I do. It’s, you know, last time I did something that changed, it was really a factor.

KARABELL: Other questions?

SHANKAR: I think there’s also been a surge—oh, sorry.

KARABELL: No, go on.

SHANKAR: OK. I was going to say there’s also been a surge of really interesting research coming out of human resources, so a lot of work in HR trying to look at what motivates employees, how do we get them to feel safe taking risks. So, for example, the former head of HR at Google, Laszlo Bock, wrote a book called “Work Rules!” And it was basically chronicling years of research figuring out what motivates employees to feel good about their work, to feel invested in it, to be interested in innovation. And so I think that the public sector can probably borrow some of the insights from the private sector in this particular domain to figure out how—you know, I think what Elspeth was saying, you sort of, like, chip away at the problem. So you might not be able to move the whole bureaucracy all at once, but if you can change individual minds within the system and try to encourage small behavior changes, in aggregate it might have some pretty pronounced effects.

Q: Hi. Jonathan Klein, Getty Images and a few other hats.

Maya sort of helped me with the last comment she made. The private sector hasn’t been mentioned at all during this session until about 30 seconds ago. When I look at the examples that you have all come up with, with changing forms and putting signatures in different places, the private sector, especially in the e-commerce and online space, have been doing tons of this forever. In fact, 10 years ago various companies could do A/B testing and multivariate testing with 12 million consecutive and simultaneous tests. To what extent has the private sector been helpful to the government in providing technology and know-how and the ability to essentially test a great deal more? And has the public sector had the technology in terms of the machines as well as the people to get as far as quickly as the private sector?

SHANKAR: Well, I can start off with one answer, which is I think we’re certainly technologically behind, so our ability to do rapid A/B testing with multiple treatment arms is actually quite challenging. Oftentimes we’ll get a few arms at best. And we have to make sure that our treatment arms are really driven by hypotheses that are compelling and meaningful.

I will say that—I mean, I often get asked this question of, like, well, the private sector’s been doing this for years, and now the public sector is catching up. I do think that they are categorically different environments and that there has to be a strictness that comes along with experimentation and the application of behavioral science in the public sector that maybe the private sector can be less concerned about. And that’s because, I mean, we’re dealing with 300 million Americans. They’re not a testing bed for experimentalists. And it seems like we need to be more judicious about what it is that we test. We have to be more judicious about transparency and making sure that the public is informed about the things that we are doing so that we can have a public conversation about what people in this country are comfortable with or not comfortable with. And so disclosure is really important in this—in this instance. And I think we have to be very, very thoughtful about the ethics behind what we’re applying and what we’re testing.

And so I think it’s not simply a matter of us sort of catching up to the private sector as much as creating a new set of ground rules that we operate within to ensure that we are protecting people and are never taking advantage of the platform.

KIRKMAN: I think one thing I would add to that as well is that there are a number of examples of things we’ve done, whether it surrounds collecting tax revenues or whether it surrounds trying to incentivize people from underrepresented groups to do things like apply to join the police force and these sorts of things, where you find that the message that works best on average is not the message that works best for, you know, the most vulnerable or the most kind of underrepresented groups. And if you’re in a private sector business and all you care about is kind of, you know, the conversion into dollars at the end of it because that’s your kind of mandate, then it’s much more straightforward because you just need to care about what the kind of best result is on average whereas where you’re kind of thinking about this from the government perspective, you also have to kind of play the equity and access kind of aspect of it and put those into play and make some tough decisions that can end up being sort of fairly political in their flavor about what you do and don’t want to kind of pursue as a result of these tests and whether you want to kind of subsequently segment and treat people differently depending on what appears to work best for different groups. So I think all of those things plus the inevitable logistical hurdles that we definitely face, the private sector don’t really kind of come into—come into the fore.

KARABELL: Yeah, I mean, that’s a—I think it’s a good juxtaposition. So private sector, many more tools, a lot more funding and probably better technology but not always trying to answer the same questions. And so there are probably applicable methods, but they’re not always—they don’t always stand one to one.

Final question.

Q: Robert Klitzman from Columbia University. Thank you.

I’m just wondering, as a follow-up, the government obviously can’t randomize citizens into different groups experimentally but can fund studies using social science. It can look at some of these issues. And I’m wondering if you think there is sufficient funding for such studies, which agencies would do it, what might be done, any hope for that happening.

KARABELL: Did you ever allocate in OMB for—

LIEBMAN: Yeah, I rarely see funding as the constraint for finding out the answer to something. It’s much more, is there someone within government who will authorize the activity, give you access to the data or to the platform. You know, more funding for researchers is a good thing, at least for people like me who are researchers. But I think—I don’t think that’s the problem in most cases. I think most of the cases is getting the ideas out there and finding a champion within government who’s willing to try to get better results with some of these techniques.

KARABELL: Actually, so let’s squeeze in one final question. The lady here had raised her hand, so—

Q: This is Sinem Sonmez. I’m an academic. I’m a professor of economics at St. John’s, and I’ve taught for almost a decade now.

And my question is regarding accountability with government agencies, in particular with the Department of Education. For instance, you know, since I’m a professor, I get observed, both by the chair and I also get evaluated by the students, so there is—I’m being—you know, I’m being monitored at both ends. However, when it comes to the Department of Education, for instance, shouldn’t there be more accountability so that we avoid the sort of problems that we have had with the student loans given the fact that Department of Education—Navient—Department of Education was pushing loans to Navient, and then Navient was listed on the stock market, which in turn then—the incentives were not lined up. So Navient’s incentive was to increase the shareholder price. And by doing so, they were not really trying to give various alternatives for student loan payments to students. And then, you know, there was a student loan default.

So could we—could you please give us some reasons as to how we can increase accountability within the government to avoid sort of the problems we’ve had with Navient and the Department of Education and the student loan defaults and all that? I don’t know if you’re familiar with the situation, with Navient and the fact that the incentives were not lined up correctly.

KARABELL: Yeah, no, I think we’re—I mean, is this amenable to the kind of techniques of behavioral economics as they’re currently applied in government, or is this more of a regulatory policy issue from—

LIEBMAN: So, you know, tying this question in with a little bit about the one about how the private sector and the public sector are different is, government often has to be pretty democratic in who it allows to compete for business through a government program. I mean, think of the problem of letting consumers choose among health insurance plans. If you’re a firm, you might choose only one health insurer, or you might choose two or three, but you’ll have a very curated, narrow set of choices you’re going to give to your employees such that no matter what choice they make, they’re going to get a pretty good outcome. When we have the government set up exchanges for health insurance, we tend to give people 20 or 30 options because the government sort of has to let anybody who sort of meets eligibility criteria compete. When we option for Medicare Part D insurance for prescription drugs, you know, there are dozens and dozens of plans out there, and studies have shown that consumers, seniors who are buying Medicare Part D, often make the wrong choice and don’t choose the plan that gives the best, cheapest coverage for the drugs that they’re actually consuming. And so there is a challenge here because, you know, government can be really prescriptive and say, here is the best plan for you, but we don’t really actually feel comfortable with that. We actually let everyone compete for these types of systems that the government mediates.

And, you know, I think student loans are sort of an example of that. It’s clearly an area where we could be doing a lot better in giving the consumers information about which of the educational institutions are actually—what the graduate—you know, what the subsequent employment rates and earnings rates are—earnings levels are for people going through different options. And there certainly have been proposals to condition the eligibility to be part of the student loan system to actually having good performance. And so that’s—I think it’s an area where we’re going to see better use of data over time. But I think it’s also an example of this broader problem that government sort of has to be pretty fair and let anybody who wants to compete for the government—for the business that is financed by government programs do that.

KARABELL: Well, I think we’re at the end of our time. I want to thank Maya and Jeffrey and Elspeth for a fascinating discussion. Obviously, let us hope this is embedded enough in policymaking throughout the world such that we can revisit this year by year with the Menschel Symposiums and others. If not, it’s been a fun ride for the past few years—(laughter)—and vaya con dios. Thank you very much. (Applause.)

