Moderating Online Content With the Help of Artificial Intelligence

Wednesday, November 14, 2018

Max Taylor

Speakers

Robyn Caplan

Researcher, Data & Society

Tiffany Li

Resident Fellow, Information Society Project, Yale Law School

Sarah Roberts

Assistant Professor of Information Studies, UCLA

Presider

Joel T. Meyer

Vice President of Federal and Corporate Markets, Predata

This panel examines AI’s role in moderating online content, and its effectiveness, particularly with respect to disinformation campaigns.

SEGAL: (In progress)—real, I am not a deep-fake hologram—(laughter)—unless I say something wrong, and then I was totally manipulated and it was all fake news.

I just want to welcome you all to today’s program. This is the fifth symposium we’ve done, building on last year’s, which was on Russian election interference and securing the election infrastructure.

Please take a look at other Council products. Net Politics, we’ve covered a number of these issues. And last month, we published a great cyber brief by Bobby Chesney and Danielle Citron on deep fakes—Bobby’s going to be on the following panel—so hopefully you guys will take a look at that.

And I want to thank the D.C. staff, Katie Mudrick, Marisa Shannon, and Stacey LaFollette, and Meghan Studer, for helping us out. And on the New York digital side, Alex Grigsby, who helped put everyone together for these conferences.

I think great timing, you know, just a week after the election. The evidence, I think, on what we saw is pretty mixed. Some success from the companies on takedowns, but clearly, lots of disinformation happening, most of it seemed to have been domestic, not from Russians, but reports out that the Iranians are learning and adopting new methods. So I think we have a lot to discuss today and really looking forward to it.

Please try to stay all day. I think we’re going to have some great panels. And if you have any ideas, input to the program, please find me during the day.

So thanks, again, to everyone for coming and thanks to the panel.

And I’ll turn it over to Joel now. (Applause.)

MEYER: OK. Thanks very much, Adam.

And thank you all for joining us this morning.

Welcome to the first session of today’s symposium, titled “Moderating Online Content with the Help of Artificial Intelligence.”

We’re joined by three terrific panelists today. First, all the way on the far side there, is Robyn Caplan, researcher at Data & Society. Then we have Tiffany Li, who leads the Wikimedia/Yale Law School Initiative on Intermediaries and Information. And next to me, we have Sarah Roberts, assistant professor in the Department of Information Studies at UCLA. I should point out, her book on commercial content moderation is coming out in June. It’s called Behind the Screen: Content Moderation in the Shadows of Social Media.

I’m Joel Meyer, I’m vice president at Predata, a predictive analytics company. And I’ll be moderating this discussion today.

So I’d like to start off with Sarah, if we could. What is commercial content moderation? Who does it? What challenges is it trying to solve?

ROBERTS: Thank you. Good morning, everyone.

So with the rise and the advent of these massive social media platforms, the practices of adjudicating and gatekeeping the material that is on them has really become something that has grown to industrial scale. And these practices have their roots in, you know, early forums of self-organization and governance online when users like me, celebrating my twenty-fifth year this year on the internet, used to argue vociferously about what should stay up and what should be taken down. But the difference was that those decisions were often very tangible and present.

When social media became a powerful economic concern, and certainly even more so a political one at this point, it coincided with these practices stepping to the background. And at the same time, with the amping-up of scale, the massive scale of user-generated content that flows onto these platforms, the companies that owned them, as you can imagine, certainly needed to maintain some sort of control. The primary impetus for the control, I would—I would say, is actually one of brand management.

So in other words, most people wouldn’t open the doors of their business and ask anyone in the world to come in and do whatever they want without a mechanism to sort of deal with that if it runs afoul of their standards. It’s the same thing here, the only difference, but the key difference I would say, is that the business model of these platforms is predicated on people doing just that, coming in, expressing themselves as they will, which is the material that drives participation.

So this practice of commercial content moderation has grown up to contend with this massive, global influx of user-generated content. And what it means is that these companies who need it, from the very biggest to even small concerns that have social media presences, typically need a cadre of human beings—and also, of course, we’re going to talk about automation—but human beings who review, adjudicate, make decisions about content. That seems like a benign sort of thing when we put it that way, when we use words like “content” and “adjudication,” but as you can imagine, that quickly becomes a huge complexity.

MEYER: Right. And I’d like to ask Robyn to jump in here because not all these platforms are the same, right? We tend to use this blanket term “social media” to cover platforms on the internet that really serve very different purposes and operate in very different ways. So how do different platforms handle this challenge of commercial content moderation?

CAPLAN: OK. And this is—I guess I’ll do a plug for our report. We have a report coming out from Data & Society today called Content or Context Moderation? Artisanal, Community-Reliant, and Industrial Approaches. And what we did was we did an interview study with policy representatives from ten major platforms to see how they do policy and enforcement differently across.

What we found was that platforms, first, they are trying to differentiate themselves from each other in particular ways. So they are working to understand how their business model is different, their mission, and the scale of the company.

And so we found that companies are really different from each other in terms of their content moderation teams. There were three major types that we found. The first we referred to as artisanal because that’s a term that’s actually coming from the industry. And this is the vast majority of companies. These companies do not have tens of thousands of workers like Facebook or Google do. They have maybe four to twenty workers that are all—that are doing all of the content moderation policy and enforcement and often within the same room, located within the United States. So these are incredibly small teams. At the same time, these companies are serving millions of users, but they may be able to get away with having some small teams because they have fewer flags or needs, in terms of content that needs to be moderated.

That is vastly different from the industrial organizations that we normally speak about, the Facebooks and Googles. Those companies have tens of thousands of people, often that they’re contracting offshore. There is a huge separation in terms of policy and enforcement, so policy is typically located within the U.S. or Ireland, enforcement can be really anywhere in the world. And that means that the rules that they’re creating are different, so they are often relying on kind of universal rules, whereas for the artisanal companies they’re looking at things from a case-by-case basis and building rules up from there.

At the same time, Facebook, according to our interview subjects, actually started off as an artisanal company. They started off with twelve people in a room and that’s how the initial rules were set.

And then the third type is the community-reliant, so this is Reddit and Wikimedia, where there is basically kind of a federal model. There is a base set of rules that are set by the company, and then there are—can be hundreds of thousands of volunteer moderators that are able to establish rules in their own subcommunities.

So this is kind of the main way we’re trying to differentiate between these companies. These companies are also working to differentiate themselves from each other, mostly because they’re trying to establish themselves as not Facebook, as there is, like, another threat of regulation coming down.

MEYER: So, Tiffany, I’d like to ask you to dig a little deeper on one of the points that Robyn just mentioned, which is that these are global companies, they operate around the world, most of them, many of them do, certainly the big ones, but different countries have different views that are reflected in their laws and regulations on what free speech protections are available, what type of speech is regulated, and how. So what are some of those key differences? And how are social media platforms handling that in their content moderation practices?

LI: It’s a really great question. So I think when I think of content moderation issues globally, there are a lot of challenges, but there is also a lot of room for opportunity. So on the challenge front, the main issue, I think, for tech companies or the tech industry is simply the fact that the internet is global.

So if you are Facebook or if you are a small startup that allows people to host content, you have to comply with basically every single country’s regulations and these can differ on a wide-ranging scale. You can have regulations from the U.S., for example, which very much protect free speech and free expression. You have regulations in the EU that have very strong laws on things like hate speech or extremist speech that require very quick or fast content takedowns. And then you have governments in illiberal regimes which often request takedowns of politically sensitive content or other content that we may think should be allowed under general principles of free speech.

So it’s very difficult for companies to navigate this, but there is a lot of opportunity as well. I think there’s a lot of opportunity for collaboration. We see this with companies working together either within industry or with industry and with government for projects like taking down and tracking extremist content and extremist movements online. This is a great opportunity for people to really collaborate and, I think, really make some effective change.

Broadly speaking, I think the most important thing—and I’m sure you’ll all agree—is that the promise of the internet was to allow for online speech. And on a worldwide level, there are still many places where access to information is restricted or online speech and online free expression is restricted. So all of these companies right now have the opportunity to create global speech norms through their content moderation practices. And I think that’s an opportunity that we can’t forget still exists, even if there are a lot of problems and even if there are issues with conflicting laws around the world.

MEYER: And how do different platforms actually operationalize their expansion into these new markets? There have been some notable stumbles, but also, as you point out, a lot of progress being made. Can you give maybe some examples?

LI: So I think one of the first stumbles I think of is the issue of localization.

And I think, Sarah, you’ve written about this, the problem of finding moderators for every single language and for every single region or nation.

So there is—there are types of content or types of speech that may be problematic in one region that someone from the U.S. would never understand. For example, likely the moderators who are based within the United States, the few of them who exist, may not understand slang in the Philippines, for example. So they might not understand what would be considered harassing speech in the Philippines versus in the U.S. So one issue is really just scale, right? Having moderators of every single language in every single region who are able to understand the local norms and the local languages.

MEYER: Robyn, is this a challenge that some of the smaller artisanal companies are able to manage? And if not, what is their solution?

CAPLAN: So I think this is a challenge that all of the companies are struggling to manage right now, even the bigger ones. So Facebook and Google by no means have moderators in every single language with the cultural understanding to be able to understand what’s going on in that region. In many cases, they don’t have offices in every area that they’re operating in as well.

For the smaller companies, this is a problem that’s insurmountable. They are located primarily within the United States, if not—actually, I think all of the companies that we spoke with, all of their workers are located within the U.S. But what they do to be able to expand their capacity is they establish relationships with academic institutions and NGOs in different areas that they’re working. These are often informal relationships, so they are going to these academics and presenting them with a problem. Mind you, they have a much smaller set of complaints and concerns to deal with so they can do this, but they are reluctant to formalize these relationships. And that might be something that they need to do if they continue operating in these regions.

MEYER: So, Sarah, Robyn noted that for the artisanal approaches, the content moderators are mostly or maybe wholly located here in the United States. Obviously, that’s not the case for some of the bigger ones. Can you describe some of the ways that, at the more industrial scale, these companies approach content moderation, where it’s located, and what are some of the issues that that gives rise to?

ROBERTS: So even at the—you know, again, I guess we’ll just call them the Facebooks and Googles because that’s what we’re all thinking about when we think of the big guys. Even in cases for these very large firms with massive global footprint or perhaps even more so the issue of labor, adequate labor, is a problem, and so they utilize a strategy of putting together sort of hybrid solutions to attend to that, which means that they may have people in the U.S., and they typically do, and the work sort of gets broken out or triaged in some way based on its complexity perhaps or the perceived nature of the problem with it.

And then this strategy of employing people around the globe in a variety of ways is brought to bear. So it may be that these are full-time, full employees, direct employees of the firm sometimes who are doing this work, but more typically they’re contractors, whether they’re contractors residing in the United States or contractors somewhere else in third-party call center-like operations in places like the Philippines, India, Malaysia, and other places. Ireland as well was mentioned.

But also, you know, even today, you will find instances of the very biggest companies relying on other forms of labor, such as the micro-labor platforms like Amazon Mechanical Turk or Upwork or some of these other—these other platforms you may have heard of. And what’s very interesting about that is that, you know, there are sort of these secondary and tertiary content-moderation activities going on now, too, in addition to frontline decision-making.

One thing that some folks reported to me, who primarily do their work on Mechanical Turk, is that they started getting these seemingly preselected datasets of images to adjudicate and what they realized they were doing was actually training machine—doing machine-learning training on datasets, which would then be fed back into systems that would ideally automate some of that work. So not only is there a hybridity around how the labor is stratified and globalized, but there’s also this hybridity and interplay between what humans are doing and what the aspirational goals for automation are.

MEYER: Well, that’s a perfect segue to start considering how artificial intelligence can help with this challenge. We’ve just covered a lot of the challenges and complexities that are involved in this.

I think, Sarah, you mentioned the dramatic growth in the scale and the velocity of user-generated content on these platforms. You know, to me, that certainly screams out for some kind of artificial learning—artificial intelligence, machine learning-type of technology to be applied.

Robyn, maybe you could start out by helping us understand, in what ways is AI currently being used in this challenge? And how effective is it?

CAPLAN: OK. So AI and machine learning is currently being used for three different content types in particular. So the first is spam, the second is child pornography, and then the third is what’s referred to as kind of terrorist content.

For the most part, every company is using automation of some sort. The smaller companies will use automation around spam in particular and child pornography. For the larger companies, they’re really using it around these areas.

The smaller companies are limited in their use of automation in content types beyond that. So as companies try to tackle issues like hate speech and disinformation, the vast majority of companies we spoke with said that they do not use automation in those areas, that everything is looked at by a human.

For the larger companies, they say something similar, but we do know that automation is being used to detect these kinds of content. And we know that because of transparency reports that have been put out by companies like Facebook that have said that they’re using detection technology, but it’s with limited success when we have content like hate speech or harassment. Or even, you know, graphic violence, I think they’re actually using automated detection technology and they’re fairly successful with that. We’re seeing it in other—but it’s rarely used to remove content. So what you have is automation being used to detect content and then it’s going to human review.
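The detect-then-review flow described here can be illustrated with a minimal Python sketch. Everything in it is hypothetical: `toy_score` is a stand-in keyword scorer rather than any platform's classifier, and `route_to_review`, `ReviewQueue`, and the threshold value are names invented for illustration. The only point it shows is that automation flags content for a human queue and never removes anything on its own.

```python
from dataclasses import dataclass, field
from typing import Callable, Iterable, List


@dataclass
class ReviewQueue:
    """Hypothetical queue of posts waiting for a human moderator."""
    items: List[str] = field(default_factory=list)


def toy_score(post: str) -> float:
    """Stand-in scorer (an assumption): share of words on a small watch list."""
    watch_list = {"scam", "attack"}
    words = post.lower().split()
    if not words:
        return 0.0
    return sum(1 for w in words if w in watch_list) / len(words)


def route_to_review(posts: Iterable[str],
                    score: Callable[[str], float],
                    threshold: float,
                    queue: ReviewQueue) -> ReviewQueue:
    """Flag posts scoring at or above the threshold; nothing is deleted here."""
    for post in posts:
        if score(post) >= threshold:
            queue.items.append(post)
    return queue


queue = route_to_review(["this is a scam", "nice photo"],
                        toy_score, threshold=0.2, queue=ReviewQueue())
```

In this sketch only the first post reaches the queue; the removal decision itself stays with whoever works through that queue.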

We do see automation for other purposes, though. So for YouTube, this process of demonetization where they’re taking revenue away from some users that they see as not creating content that’s advertiser friendly, you see mass automation in this area. And then what you see is kind of these backwards reviews where creators then can request a manual review after their content has already been demonetized.

And what’s very interesting about this is that creators, even though there is very little chance that they’ll actually get their revenue back because most of the revenue is made in the first twenty-four to forty-eight hours, they’re fairly invested in doing this manual review because they want these models to be better. So they think that that information will then be used to train the models. So companies like YouTube seem to be much more comfortable using automation just to remove revenue than to remove content.

MEYER: Tiffany, what are some of the legal issues involved in companies not only being in a position to regulate speech in this way, but then kind of deploying artificial intelligence technologies in a way that may not be well understood by the users?

LI: So I think the first issue, generally speaking from the United States—I’m sure everyone here in the room knows we have a very strong culture of free speech and we have our First Amendment principles that are highly valued within the states. Now, this is not necessarily the case abroad. Europe, for example, although Europe has, you know, very strong democratic principles and obviously values human rights, like free expression, does not generally have the same free speech culture that we have, the free speech maximalist culture that the United States has.

So what happens, legally speaking, is companies will use technologies, like AI-based technologies, to take down content, sometimes to comply specifically with different laws and regulations. So I’m thinking right now about specific European laws that mandate things like removal of hate speech within twenty-four hours. That kind of very strict requirement often comes with legal consequences and fines.

So when companies are faced with this type of regulation that they may have difficulty complying with, they often turn to automated content takedown systems. And what happens then is you could argue sometimes a user’s speech may be curtailed by these systems. If a company does not have the time to have human moderators review specific flagged pieces of content and instead relies on automated content takedowns, then some argue that free speech for the users online is then limited or restricted. So that’s one issue I think that comes up with legal and regulatory problems with the use of AI and machine learning specifically in content takedowns.

MEYER: What type of accountability should there be? I mean, are there any examples out there that would be instructive?

LI: So that’s another great question. And I think that we were speaking earlier about this model that the founder of my research center has currently been promoting. This is Professor Jack Balkin of Yale Law School, if any of you are interested, and he has been writing about what he calls a triangle model of speech regulation, so three types of regulation of speech.

The first type of regulation is specifically government regulation of speech, which happens through the various legal processes that we already have in place. Right? And we have due process, we have accountability, we know what happens when the government tries to regulate a citizen’s or resident’s speech.

The second type of regulation that occurs in the online space is corporate regulation. So Twitter, for example, can take down a tweet of yours if it violates their terms of service. And we have some sort of accountability. As a user, you can complain to Twitter, you could, you know, pull your money out of investing in Twitter’s stock, if you’re a shareholder you have shareholder rights, and so on.

There’s a third category of speech regulation, though, that’s really interesting, I think, right now. And this category is when governments or state actors try to use private speech regulation to regulate residents’ or citizens’ speech. So what happens is a government agency will report or flag content that they believe is problematic and they won’t take it down through a formal government request system, which has accountability and due process. What they’ll do is, for example, they’ll tell Twitter this tweet is against your terms of service, you should take this down. And at this point, it’s effectively still the government regulating a resident or citizen’s speech, but leveraging the terms of service of a private institution. Now, the problem here is then that the user doesn’t really have due process, there isn’t really accountability or transparency in the way you would expect one to have for sort of government regulation of speech.

So these are the three types of speech regulation that we’re looking at a lot right now. From a legal standpoint, the first two types are legally somewhat agreed upon with what the standards are, but that third type is quite tricky.

MEYER: And I think that, to me—that’s a really fascinating scenario that these companies and users are being put in, but it also points to this challenge, Sarah, in the shifting standards and the challenges of what is takedownable, right? You know, what violates the terms of service and how does that change year to year or even week to week or even day to day or hour to hour, in some cases, and change across contexts? Is that—first of all, I think that’s a challenge for these companies, even aside from artificial intelligence. But then when you add in AI, is that something AI is prepared to deal with?

ROBERTS: Well, in a word, no, I would say. But, you know, I think—I think to your point about the changing nature of what needs to be responded to, that is a fundamental characteristic of the landscape in which these practices are being undertaken, whether it’s via humans, whether it’s the machines that are doing it, or whether it’s, more typically, an interplay between the two.

So any set of policies, whether encoded in an algorithm or whether written down and used for adjudication internally, they’re totally dynamic. And not only are they always in flux just by the nature of the firms themselves, but as we’ve seen more and more, these companies—and again, I’m talking about the big ones for sure—are being called to account with regard to breaking situations around the world, whether it’s sort of political unrest or, in some cases, targeted abuse of vulnerable populations, and so on. So obviously, that requires a certain nimble approach and it requires a mechanism by which to respond and adapt quickly. Of course, going back to the points from my colleagues that they made earlier, that also would presuppose having the adequate, knowledgeable staff to identify and understand those complex situations.

And since we’ve sort of established that often that isn’t even on offer for a variety of reasons, then I think we need to be thoughtful and perhaps even troubled by the notion that when we don’t even have the baseline of having that broad understanding of these issues, as well as the specificities when these incidents occur, how are we going to encode that and then automate that—automate that out? So it becomes—it becomes worrisome in that regard.

And the one other point I wanted to make about, you know, Robyn gave this great overview of sort of, like, what is AI good at, because it is—I think we—we’re, you know, we might seem like skeptics, but we can agree that it is actually good at doing certain things, and she gave those three categories.

I think one thing that I want to clarify there is it’s not—it’s not the nature of the content, as awful as those things—well, spam is bad enough, but child sexual exploitation material certainly and also this terroristic content that often shows extreme violence—it’s not the nature of the content that makes AI good at retrieving it. What makes AI good at retrieving it is the fact that, for better or for worse, much of that information is recirculated, so it’s, like, the same videos over and over again.

And so what the AI is doing is it’s really just a complex matching process, looking for the repetition of this material and taking it out beforehand. And that’s why, as both have said, things like hate speech, which are discrete instances of behavior that are nuanced and difficult to understand, cannot just be automated away.
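The matching process described above can be sketched in a few lines of Python. This is a simplified illustration, not any company's system: it assumes exact SHA-256 fingerprints from the standard `hashlib` module, whereas real deployments would likely need fingerprints that tolerate re-encoding or cropping. The class name `KnownContentMatcher` and the sample byte strings are hypothetical.

```python
import hashlib


def fingerprint(data: bytes) -> str:
    """Exact-match fingerprint (an assumption; tolerant hashes are not modeled)."""
    return hashlib.sha256(data).hexdigest()


class KnownContentMatcher:
    """Hypothetical store of fingerprints for material already reviewed and removed."""

    def __init__(self) -> None:
        self._known = set()

    def register(self, data: bytes) -> None:
        # Called once a human decision to remove the item has been made.
        self._known.add(fingerprint(data))

    def is_known(self, data: bytes) -> bool:
        # True only when this exact content has been seen before.
        return fingerprint(data) in self._known


matcher = KnownContentMatcher()
matcher.register(b"bytes of a previously removed video")

reupload_flagged = matcher.is_known(b"bytes of a previously removed video")
novel_flagged = matcher.is_known(b"a brand-new post no one has reviewed")
```

The re-upload matches and the new post does not, which mirrors the point made here: this kind of lookup catches recirculated material but says nothing about a first-time, context-dependent post such as hate speech.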

CAPLAN: And that’s an incredibly important part.

I also want to point—I also want to note that for the representatives we spoke with, they actually expressed some reservations about the use of automation for content that’s referred to as terroristic. That in many cases, what is happening is that it’s kind of more politically acceptable to take down than leave up and so they’re not quite clear on how many false positives are in those databases. So that is a bit worrisome.

MEYER: So in the areas where AI is not currently being used or is not being seen to be effective, are you aware of any approaches out there that show promise, either in the academic space or research and development? I mean, are there examples out there that we can kind of hang our hat on and say maybe this is a good approach that we should look to?

CAPLAN: I mean, I think to Sarah’s point, any content that is recirculated is good for automation. And in some cases with hate speech and disinformation, we do see this process of recirculation, we see this kind of signal-and-noise dynamic between some publications and then bots or more fake accounts that are being used to recirculate that content. So that’s one area where automation can be used. I think it’s likely that it is being used to some degree to remove the tens of thousands of accounts that some of the major companies are reporting that they’re removing.

MEYER: So you think some major companies are using AI to actually take down accounts?

CAPLAN: For the fake accounts? I think it’s likely, but I can’t—I can’t say for sure, no.

ROBERTS: So I think I guess I’d make two points. The first is about sort of the state of the art and the second is a more philosophical comment.

The first would be to say, in addition to what Robyn described, you know, there are researchers in areas like natural language processing and information science who are looking at mechanisms by which—mechanisms by which a conversation might turn. So in other words, it’s like this predictive approach to understanding the nature of dialogue and conversations and sort of being able to create a flag, for example, when a moderator should step in and examine a particular threat or examine a particular area.

But, you know, the point that I want to make about automation and these computational tools is that when we’re thinking about major firms that reiterate over and over again that their primary identity is one of being a tech firm and their orientation to the world is to problem solve through computation, I think it should stand to reason and we ought to be cautious about the claims that they make of what those tools can do and, further, whether or not they’re beneficial.

Because, you see, when everything is automated or put through an algorithm or a machine-learning tool, accountability becomes all the more difficult. We already have difficulty with accountability and transparency, understanding what the firms do, and we’ve had reference to some of the transparency reports—which I think we can agree are getting better thanks to the work and efforts of civil society advocates and people such as my colleagues who press on that—but the firms wouldn’t do that on their own.

And when the—when the processes of adjudication of content, which, again, is a problematically benign way to think about it—we might call it the regulation of the behavior of all citizens of the world, that’s another thing we might—I mean, you know, six of one, right? But when that is, you know, when that is rendered within proprietary machines and algorithms that we cannot really hold to account or understand, you know, we ought to be cautious.

MEYER: I think that’s a great point. I mean, I kind of framed the question implicitly saying, you know, don’t we want more AI? You know, how are we going to make this better? But I think that’s interesting, yeah, right, yeah—go figure, right? (Laughter.) But I think it’s a great question, you know. Is that desirable, right? Is that the outcome we all want or—I mean, this is a very serious consequential responsibility that these companies have. Do we want AI taking over? Or for some types of things, are humans, human content moderation, is that actually where we want to be?

LI: So I think the first thing to understand is AI is not a magic pill. AI is not going to solve, at this point, anything on content moderation. It’s a useful tool for the categories that Robyn mentioned. I would also add intellectual property infringement.

CAPLAN: Yeah, forgot that one.

LI: That’s a great category where artificial intelligence systems can easily detect content that is infringing on, say, the copyright of the rights holder for, you know, a video from a movie or something like that. That’s a great use case for AI and machine learning.

But for most content issues right now, AI isn’t enough. And I think the danger here isn’t that companies may over-rely on AI. I mean, as a lawyer I think of the law and I think the danger is a lot of regulators think that AI can do everything. So you end up with laws like many of the laws—again, not so much in the U.S., but mostly abroad—where companies have to comply with these very short takedown timeframes. And while those may be well-intentioned, we all mostly generally agree that we don’t want hate speech, extremist speech, and so on online.

The problem is, those very strict, short time requirements mean that a lot of companies will end up over-censoring because they can’t rely on AI. If they rely on AI, the AI will just take down everything. So that’s, I think, one of the issues that I think of legally.

ROBERTS: And if I—if I may, the knock-on effect of what you describe is that, because those tools actually don’t exist and are less than ideal when they do, the companies then go out and they go on a massive hiring spree.

So the case that comes to mind I’m sure for all of us is that of Germany when the German NetzDG law went into—went into effect, and even prior to it, what did—what was the response from the major companies? To set up call centers in Berlin and hire massive amounts of people. And as you can imagine, doing commercial content moderation as a job is not fun. It’s actually you’re exposed to pretty much the worst of the worst over and over again, as a condition of your job you see what people are reporting, which is a queue of garbage usually, to put it mildly. And so there’s these other human costs of, like, massively increasing these labor pools of people who do this work, typically for not a great wage I might add.

CAPLAN: I mean, there’s another cost as well. As these companies try and scale too rapidly, they often end up formalizing their rules very quickly so they can do mass training. And that has consequences for trying to create localized, culturally sensitive rules as well because they’re often relying on these universal rules so they can kind of have the same consistent judgment across locations.

MEYER: That’s great. So I think we’re in a great discussion here. I know I personally have a lot more questions, but I’m conscious of the fact that we have great expertise and questions in the room here. So at this time, I would like to invite members to join our conversation with their questions. A reminder that this meeting is on the record.

If you’d like to ask a question, please wait for the microphone and speak directly into it. Please stand, then state your name and affiliation. Please also limit yourself to one question and keep it concise to allow as many members as possible to speak.

So with that, why don’t we get into it. Please.

Q: Thank you very much. Elizabeth Bodine-Baron from the RAND Corporation.

From a researcher perspective, content moderation done by private companies, the good side, removing content from the casual user, downside, removing it from the general academic world, the research world. How do we work with these companies to allow the people doing the research to, say, in extremist content and developing these AI algorithms and everything else like that, access to the content that is no longer available because it’s being moderated?

MEYER: That’s a good question.

ROBERTS: So, I mean, I’ll just start out by saying I think you’re certainly identifying a predisposition and an orientation to opacity and secrecy that is present in these firms. I mean, obviously, there are carrot-and-stick approaches to doing that.

I ponder a lot the notion of, how do we know what’s not there, right? I mean, again, if it’s removed, how do we apprehend it, how do we understand it? Certainly, a lot of people are concerned about that for a number of reasons, whether it’s civil society advocates, legal scholars, regulators and others.

And so I think there are several approaches. One is to try to suggest that these partnerships might help their business model. That’s one. One might be to mandate that this information be reported. That’s another. And then there’s anything in between.

But I think that even, I would say, in the last two years, the orientation of the firms towards their public responsibility around these issues has really changed. We could point to the 2016 election in the United States, we could point to Brexit, there’s a lot of things that might be—that might be at the root of that. All that to say that pressure has to be brought to bear on the firms.

And I think at this moment—and I hope that you will chime in, too, with your points of view—but my perception is that they’re attempting to get out ahead of more regulation by doing a better job of sharing some of this information through such things as transparency reports, takedown reports and so on. But it’s nascent.

CAPLAN: So there’s two major efforts actually already happening in this area, much to your point, Sarah.

The first is the social data initiative. It’s a partnership between Facebook and the SSRC. And they are making some data available to researchers. What that data is no one knows, they haven’t actually determined what that data is. It’s a matter of asking the right question and seeing if they give it to you. So there’s obviously some bugs to work out, but that is one area that it would be fruitful to go and pose the question and see if you get access to the data—and if you don’t, tell the world.

And then the second was just reported on this week, that French regulators have said that they’re working with Facebook to do some research on content moderation and Facebook seems to be cooperating with that as a way to, like, stave off regulation there.

LI: And I have to mention, generally speaking—so I’ve worked with many of these companies, as have all of you, and these companies generally are made up of people and most people don’t want to support terrorism, for example. So on issues like extremist content, there’s a lot of collaboration inside a sector, as well as with the public sector. So there’s a lot of very close collaboration with law enforcement, with national security, and on international security issues. So I think the issue of public academic access is one thing, right? That’s a little harder sometimes because you want content taken down very quickly.

But if you’re thinking about questioning whether or not, you know, companies take down content too quickly for, say, law enforcement agencies to be able to research and track extremist organizations, there have been a lot of agreements and I think very close work between the agencies and the companies to make sure that any content or user accounts are kept up for as long as necessary.

And sometimes people don’t realize this, so the public is unaware and people are upset, why is this Facebook page for this clearly extremist organization not taken down? And sometimes that’s because the FBI is tracking that extremist organization and we need to know who’s accessing that page. So there is a lot of collaboration on that front. But definitely, public academic research access is another question I think that’s very important.

Q: Good morning. Thank you very much for this interesting conversation. My name is Dan Bartlett from the Department of Defense.

I wanted to ask a question surrounding crisis situations or in extremis situations. Do you find that the social media companies are responding differently, say, to a Parkland incident where you have to do something quickly, potentially, to get in front of it? Is there any sort of, you know, quarantine and then review vice flag and then review later? Have you seen any thought being done in the private sector on those issues?

CAPLAN: So I think, in terms of what we’ve seen, there’s a whole research group at Data & Society that looks at issues of media and manipulation and these crisis periods. And what we’ve seen is basically a period of testing around this.

With Parkland, I do believe there was an SOS signal that was included within the Google search results. And I’m wondering whether or not that was used to quarantine content that was coming out.

We saw one response during the midterm election where Facebook did put together a kind of ad hoc strategy kind of war room to tackle issues like disinformation. So we are seeing these acute responses that I think are still going through an ongoing testing period. I have not seen any kind of major changes in terms of the systemic response.

ROBERTS: I would just say to that—again, gesturing at some of the things we’ve already spoken about on the panel—as you can imagine, because of the competencies that are staffed for, certain things are able to be responded to more quickly whereas, if a crisis were to be unleashed in, say, Myanmar, there is probably not adequate staffing at the firms themselves to even understand the complexities and nuances of the geopolitics going on with regard to, you know, these situations, much less even how to respond or how to appropriately triage it.

So the issues, again, are related to, in part, you know, the American orientation of the firms and their—you know, their significant limitations sometimes in those areas.

MEYER: Well, as a quick follow on to that, I think you make an excellent point about, you know, you don’t know where the next crisis is going to be and to have the experts and the staff on hand, ready to deal with it. Are they trying to be anticipatory, are they trying to look ahead and see what’s coming?

ROBERTS: I mean, as much as one can. I think one of the issues—and, Tiffany, I’ll go back to what you said about regulators’ expectations—I’m not here to apologize for the firms, far from it. I don’t think anyone would ever accuse me of that. But I think we’re asking a lot. We’re asking a lot. OK, look into your crystal ball and find out where the next global crisis will unfurl itself. I’m sure we have people in here for whom that is a full-time job in a different way.

So, you know, again, pipe all of human expressions through a very, actually, a very narrow channel called YouTube or Facebook and call that behavior and expression “content.” We’re really—we’re actually quite limited in terms of maneuverability, in my opinion. And part of that is fundamentally about the business model of these platforms which have, you know, on the one hand greatly succeeded, but on the other hand have really painted the firms into a corner in a way, in addition to the expectations and demands that we now all have on them.

CAPLAN: I mean, the other part of that as well is that when we’re talking about disinformation, there was this understanding a couple of years ago that this was the work of individuals online, people organizing online. And what’s come out over the last couple of years is that this is largely, in many cases, the work of governments or the work of political campaigns, the work of military actors.

So to Sarah’s point, we are asking a lot of these platforms. We’re asking them not just to mediate between themselves and their users or users and users, but between governments and other governments, which is a huge job.

MEYER: Yeah, that’s a great point.

Q: Greg Nojeim, Center for Democracy and Technology.

What’s a good solution to the problem of government flagging? And here’s the scenario: The legislature passes a law that says this is the information that can be censored by the government and it says if the government wants to censor the law—wants to censor the information, it has to give the person the right to go into court to have an adjudication. The government gets around that rule by flagging content and having it taken down as a violation of terms of service. What’s a good solution to that problem to make it so that government can’t do that and is accountable?

ROBERTS: All eyes on Tiffany. (Laughter.)

LI: I will solve it all.

So the first solution to that is transparency, which I know we always say is the solution to everything, but here we literally don’t have the numbers, so we don’t know how often this happens. We know it happens, but we need, as academic researchers, we need the numbers on how often government agencies request that certain things are removed, not for legal mechanisms, but through these terms-of-service flags. How often do agencies or states use NGOs, for example to report things? We need those solid, concrete numbers so we can actually say that this is an issue or that this merits a level of consideration where we should have some sort of legal change or at least some sort of policy change. So I think that’s the first thing.

Secondarily, though, this sort of indirect regulation of speech is, I think—if it—if it gets to the level where it’s widespread enough that we need to—we think we need to change this legally, that would be an interesting area for possible litigation, if not at least policy discussions around how we think that the government should be able or should not be able to do this.

I think especially within the U.S., we have this very strong First Amendment culture and this seems a little contrary, depending on how you view different lines of cases about free speech protections and so on. So I think that’s a really interesting area of law that we might see changing in the future.

But the first step is just to have the numbers. We need the numbers, we need actually the facts on the ground. And then we’ll see what happens from there.

Q: Hi. Bob Boorstin from the Albright Stonebridge Group, formerly of Google.

I just want to complicate something for Tiffany, and that is your third category where you said that government and state actors are using requests to private companies kind of quietly. Look at the motivations for where the governments or rather the sources for where the governments get their information and you’ll discover occasionally that it’s the competitors to the companies that they’re going after.

I guess my question, following on Greg, is, give me three things, one from each of you, that these big companies should be doing, aside from transparency, which we’ve heard of over and over and over again.

LI: OK, so just only one thing from each of us.

CAPLAN: Only one? (Laughter.) All right. Expanding resources—these companies should at least have offices, for the major companies, in every area where they’re operating. And they should be hiring enough people with the language and cultural capacity to be able to moderate content in those regions.

MEYER: Is that feasible for smaller companies, though?

CAPLAN: He said major companies. (Laughter.)

MEYER: Aha, right.

CAPLAN: Smaller companies? No. Smaller companies—developing more formal relationships with academic institutions and NGOs in those regions.

LI: I would say, if just one thing, I would want consistent policies. Right now I think we have a lot of companies who are really trying very hard to manage these issues—extremist content, hate speech, and so on—but what that means is every three months Twitter has a new policy and says, oh, now we’re removing this type of content, now we don’t allow this type of account. And this constantly changing type of policy is, I think, very confusing for users and removes a lot of levels of what we could call due process for people who might have their speech or their user account suspended. So consistency, I think, would be wonderful, at least maybe six months, not just two months.

ROBERTS: I would—I would put a pitch in for the human workers and an improvement in their work lives and status to include things like valuing the work that they do as a form of expertise and then giving—I mean, not to use the word “transparent” again—but to give those workers—bring them into the light, essentially, so that we can value the work that they do.

Just to quote one of the people that I’ve talked to over the years who said, “The internet would be a cesspool if it weren’t for me”—direct quote. I don’t want to swim in one of those, so I appreciate the work that they do.

The other piece that goes to improving working conditions for commercial content moderators actually is a benefit to us all. So, you know, actually ten seconds to review a piece of content and decide whether it’s good or bad is not adequate, it’s not appropriate, but that’s what we’re asking these people to do. No wonder we have a muddle on the other side.

MEYER: Yeah.

Q: Hi, good morning. Brian Katz, international affairs fellow here at CFR.

A question for all the panelists, but sparked by a remark from Tiffany earlier on in her presentation. You had mentioned that social media and tech giants are essentially playing a critical role in the establishment of global norms when it comes to—through the course of their operations and the scope of their operations, really establishing these norms of what is free speech. So this is more of a philosophical question. And obviously, every company is different. But do you think they understand this? Are they grappling with the implications of this role that they’re playing? Are some embracing it out of either some corporate responsibility perspective or dare I say their own prestige and ego? Or are some trying to avoid it from trying to avoid some type of normative role when it comes to curating free speech in society? Thank you.

LI: That’s a great and very difficult question.

ROBERTS: Is that—is that one for a beer later today? (Laughter.)

LI: I mean, I don’t know if any of us can speak on behalf of the companies. I do think, generally, again, companies are made of people, right? They’re made of, of course, the directors and the employees and companies have their own cultures. So we have, I would say, some organizations—previously I worked for Wikimedia which is the foundation that runs Wikipedia, and that organization, for example, has a very strong culture of promoting values of free-access information, free speech online. Organizations and platforms like that really care about those missions.

And even some of the corporate giants that we talk about, sometimes a little dismissively, I think also have some of those values in play, if for no other reason than that primarily the people involved within those companies are coming from the U.S., they were raised with the values of free speech and democracy generally and you see that, I think, brought up very often. Companies, for example, like Twitter often publicly grapple with these questions. And Jack Dorsey often tweets about these questions, about Twitter’s responsibility. People on the trust and safety teams there, the legal team, and the policy teams there often talk very publicly about how they’re trying to deal with their responsibility for protecting democratic values globally.

The flipside of that, though, is, of course, that they are businesses and businesses enjoy being able to operate in multiple markets. So sometimes that means having to deal with conflicting norms and that is a very delicate balance. So, you know, I am not the CEO of any of these companies, so I can’t make these decisions, and it’s easy from a civil society perspective to say, yeah, just go and protect free speech, that’s all that matters. But there are often other competing interests at play.

ROBERTS: Yeah, hear, hear! (Laughter.)

CAPLAN: I would—I would actually complicate this question a little bit and say that most of the companies have been moving with a limited-restriction model of speech for the last long while. So we saw this very, very early on and creating some rules against kind of trademarked content and intellectual property. Then they started moving into other types of content, like harassment or revenge porn. And now they’re moving into issues like hate speech and disinformation.

And when you actually speak to many of the companies, they’re very open about this. They say that, listen, we are moving more towards a community-guidelines approach. One person said to us, you know, we’ve moved away from this public-square model into we’re a community and we have X, Y, and Z rules and that’s how we function. So it’s a move towards kind of a more Rousseauian contract approach.

And the reason for this really varies between companies. So Danielle Citron put out a paper a while ago kind of worried that this was because of censorship creep due to European regulations, that what we’re actually seeing is these companies, it’s much, much easier for them to establish a kind of global rule than it is for them to have a rule in every single country, so they’re taking kind of the least-restrictive and they’re just applying it across the board.

When we spoke to them, they tried to complicate that. They said, you know, this is just because we don’t want people to flee our platforms and we’re trying to keep as many people there as possible, and to do that we know we need to create some restrictions around content. One of the companies just said this was, like, a normal maturation of the company. So I think that that’s—it might be a bit of a false premise that most of these companies are actually just moving towards a limited-restriction model that’s more like the European model of free speech.

MEYER: Just to highlight a point you made there, quickly, and I think properly moderating content could actually be beneficial to the community, draw users, and encourage productive free speech.

CAPLAN: Correct. Sure. I’m Canadian, so this is the—

ROBERTS: Right.

CAPLAN: And also, I was going to say X, Y, and Zed values. (Laughter.)

ROBERTS: I knew you were. I was waiting for it.

CAPLAN: I was, like, Z, yeah.

ROBERTS: I was waiting for it.

CAPLAN: And so for me, that’s not a bad thing. There was a moment where you could start to see that these kind of free speech values adopted by these companies maybe five years ago were actually starting to shift norms in Canada about how we think about speech because it does operate under a limited-restrictions model.

So is it time for these companies to start considering other models for speech? It might be.

MEYER: Adam?

Q: Great panel, thank you very much.

I was wondering if the Chinese companies have any influence on where the U.S. companies are going or on the debate? We know that they employ tens of thousands of moderators to take down content. We know that they’re adopting AI fairly widespread internally. We know they cooperate very, very closely with the government, just recently reports about they were already uploading terms and phrases that the government had given them for content moderation and takedown. So how do we think about the role that the Chinese social media companies are playing? And is that influencing the debate in the U.S. and how the companies are thinking about it in international markets?

LI: And that’s really interesting. So obviously, the free speech environment in China is a little different than it is in the United States. And that’s reflected with the tech industry there and here as well. For those of you who aren’t aware, in China there is effectively an equivalent, a localized national equivalent, not state national, but Chinese equivalent for pretty much every type of technology we have here. There’s a Chinese Facebook, a Chinese Wikipedia, a Chinese Amazon, et cetera. And I think what we see here is a lot of companies from the U.S. and from Europe trying to compete, but not really being able. And I don’t think it’s so much that the Chinese social media companies through their content moderation efforts are influencing U.S. social media companies. But I do think that, generally, this urge to be able to compete with that huge Chinese market, that is influencing U.S. companies.

So, obviously, we’ve seen recently the question of whether or not Google should reengage within China. And that’s, again, why I go back to the main point that companies are made of people. And what we saw there was kind of a conflict within Google, which is still happening right now, about the values, about how Google should be protecting free speech, if it should be protecting free speech, what it means to protect free speech globally and so on.

Is it better to provide some services to a nation or does that create bad international norms, right? Is it better to prioritize being able to support a growing business or is it better to prioritize this sort of principled stance on speech? So we have conflicts even between directors, between executives, and conflicts from employees. And, of course, the public is also involved as well. So I think here what’s really driving possible change or at least conflict for U.S. companies isn’t really wanting to be similar to Chinese companies, but wanting to possibly compete within that Chinese market.

Q: Thank you very much. Tom Dine with the Orient Research Centre.

As I’m listening to all of you, I can picture an ongoing—hello—an ongoing conflict among three academic categories: law school, business school, and political scientists. If we have this panel five years from now, where will that conflict be?

CAPLAN: Good question. I don’t know if I—and I also want to know what category I’m in. (Laughter.)

ROBERTS: I was just going to say, being that I’m not—I don’t—I don’t relate—

CAPLAN: I don’t—I’m not in a business school, but I am an organizational researcher, so—

Q: But you each have got legal departments—

CAPLAN: Right, right, right.

ROBERTS: Right.

Q: —and you have corporate interests.

CAPLAN: Right.

ROBERTS: Right.

CAPLAN: I hope it’s an ongoing battle and debate between these three if that’s the only way that we’re going to start to see these kind of decisions and policies continue to evolve is through having kind of pressure in terms of this normative pressure that we’re placing on companies about making sure that speech rights are being protected, that they’re not overly censoring content, that we’re, you know, tempering that with an understanding of the organizational dynamics of these companies so that we kind of know how to properly regulate them and that we’re establishing laws that aren’t necessarily too tied to the technologies that are being produced right now because those are going to continue to change, but rather the organizational dynamics of these companies and the normative frames that we’d like to preserve.

LI: I think maybe one way to think about your question—so law school, business school, political science—you’re thinking of three types of actors sort of or three types of interests—corporate interests or corporate actors, state interests and state—and state actors in terms of regulation, and then the interests of the international community as a whole.

So I think of that because it’s an interesting place right now for a lot of these tech companies. In regulation, we talk about if we think of them as corporate entities, right, which we know how to regulate, generally speaking, or if we think of them almost akin to nation states due to the power and influence of some of these companies, I think it’s really interesting seeing the way that things have changed. I mean, in many powerful industries, you have the larger industry players interacting with governments on a different level, right?

So what we see now—someone gave this great analogy. It’s no longer, say, the relationship between, for example, Facebook and the EU. It’s not the relationship between a small furniture manufacturer and the state of North Carolina. But it’s the relationship between Belgium and the EU. It’s almost that these companies are so influential that they are almost acting as state actors, they negotiate directly with governments often. So you kind of muddle this law school, business school, political science distinction now. And I’m really curious to see where that’s going to go.

ROBERTS: I think one final comment I’d make, about five years from now, is that I think it behooves us all to reorient in terms of our expectations for solving these issues. In fact, they are—they may not—the specific problems may not be intractable, but on a—on a larger—on a larger stage, this is actually an evolutionary process akin to all other kinds of policy development.

And as Tiffany and Robyn both point out, it is also at the world stage. And so, you know, solutionist orientations, even when I’m sympathetic to the regulatory desires there or AI or any other solution, those are granular. We have to think actually much larger about the impacts and implications over the long term.

MEYER: Great. I think we have time for about one more question from this great discussion.

Sir?

Q: Hi, good morning. Ché Bolden, United States Marine Corps.

A lot of the conversation has been dealing with—dealing with the negative effects of online content. But one of the questions I want to ask, particularly on artificial intelligence, is, what effects, what productive effects do you think that bots and other forms of artificial intelligence can have on the discussion and the development of content going forward?

CAPLAN: I don’t know if I see a positive—(laughter)—

ROBERTS: Well, yeah, I mean, obviously, some of the examples that have been given by panelists are positive. You know, when we think about this issue of differing norms around the world, one thing that the world gets behind typically, at least at the nation state level, is child abuse, right, and intervening upon that. So that’s one great example that we can always hold up.

But I think these issues are so thorny that we are all wary to apply AI as a—as a positive before we’d actually solve the root causes. So social justice and solving inequity issues would be great if we could automate that, but we can’t even not automate that, you know what I mean? So I’m a bit worried and wary about thinking about—

CAPLAN: Yeah, I think I misunderstood the question. Was it automation to take down or automation to distribute content?

Q: I should have used the word “productive,” as opposed to “positive.” However, the presence of bots online is pervasive and most people perceive it as a negative thing. But there are some productive ways to use bots and artificial intelligence in generation and moderation of content.

MEYER: Right. Productive use of bots.

ROBERTS: Oh.

CAPLAN: So in the generation of content, that’s where I was confused. So, actually, so there’s one area where I’m very sad that there is going to—we’re going to see a lot less bots and that’s art, that bots have been used online for lots of reasons, especially on Twitter, to kind of create content that’s really provoking and thought-provoking and amazing and funny. And I think we will start to see some of that go away.

Beyond that, though, I’m not quite certain if I see a productive role. So it would be—it would be kind of beneficial to start removing bots in terms of major—in terms of follower—the ones that are used to increase follower counts, the ones that are used to amplify content. And part of the reason for that is that we want these spaces to be kind of our de facto public spheres. We’ve been treating them like that. And when we kind of—when we enable all of these different ways where content can be kind of falsely amplified, we start to really distort what public opinion is in these spaces.

MEYER: So I’m going to take advantage of your mention of bots as art to exercise the most important role as presider of this meeting, which is to end it just about on time.

This has been a terrific conversation. Please join me in thanking our panelists. (Applause.)

We’re now going to move to a fifteen-minute coffee reception. And the second session will begin at 10 a.m. sharp. Thank you.

(END)
