DALL-E 2, an image-generating artificial intelligence (AI), has captured the public’s attention with stunning portrayals of Godzilla eating Tokyo and photorealistic images of astronauts riding horses in space. The model is the newest iteration of a text-to-image algorithm, an AI model that generates images from text descriptions. OpenAI, the company behind DALL-E 2, used a language model, GPT-3, and a computer vision model, CLIP, to train DALL-E 2 on 650 million images with associated text captions. The integration of these two models made it possible for OpenAI to train DALL-E 2 to generate a vast array of images in many different styles. Despite DALL-E 2’s impressive accomplishments, there are significant issues with how the model portrays people and with the biases it has acquired from its training data.
Existing Issues with DALL-E 2
There were early and frequent warnings that DALL-E 2 would generate racist and sexist images. OpenAI’s “red team”, a group of external experts charged with testing the security and integrity of the model, found recurring biases in DALL-E 2’s creations. Early tests from the red team showed that the model disproportionately generated images of men, oversexualized women, and played into racial stereotypes. When given terms like “flight attendant” or “assistant”, the model would exclusively generate images of women, while terms like “CEO” and “builder” produced images of men. As a result, half of the red team researchers advocated for releasing DALL-E 2 to the public without the ability to create faces.
The problem of discriminatory AI models predates the development of DALL-E 2. External researchers found implicit bias and stereotyping issues in the models used to build DALL-E 2: CLIP and GPT-3 both generated insensitive text and images. One of the primary reasons models like DALL-E 2, GPT-3, and CLIP construct harmful stereotypes is that the datasets used to train these large models are inherently biased, built as they are on data collected from human decisions that reflect societal and historical inequities.
Despite these concerns, OpenAI recently announced that it would begin selling a beta version of DALL-E 2 to a waitlist of one million people. Before the beta launch, the company did announce a software update that made generated images of people twelve times more diverse, and it pledged to continue tweaking the model to address bias as more people use it. However, critics have said that this change may amount to nothing more than a superficial fix, given that it does not address the bias in CLIP and GPT-3, the models used to build DALL-E 2.
Artificial Intelligence and Ethical Concerns
Organizations across the private and public sectors have publicized models that exacerbate existing social and systemic issues within the United States. For example, COMPAS, a machine learning algorithm used in the U.S. criminal justice system, was trained to predict the likelihood of a defendant becoming a recidivist. COMPAS misclassified black defendants as at high risk of reoffending at nearly twice the rate it did white defendants. The use of AI tools has been fraught with the risks of unintentional harm, but some states are now weaponizing AI to target or exclude people based on historical biases, or to augment campaigns of repression.
Perhaps the most prominent example comes from the province of Xinjiang in China. The People’s Republic of China (PRC) has implemented facial recognition technology across cities to monitor ethnic minorities like the Uyghurs, in one of the first known examples of a government using AI specifically for racial profiling. This technology is migrating across borders into other authoritarian states in the region: the Myanmar junta recently bought facial recognition cameras from China-based companies, which are being used to crack down on political dissidents and opponents. It should be noted that European and U.S. companies also provide these technologies to regimes with dubious human rights records. China exports AI-based surveillance technology to over sixty authoritarian regimes, a number that is rapidly increasing in part due to the influence of the Belt and Road Initiative (BRI). The U.S. government needs to confront the growing power of AI and the growing market for AI surveillance products by propagating norms and engaging with allies to reduce the harmful effects of AI both at home and abroad.
Building Norms Across the World
Startups like OpenAI are leading the charge on the technical aspects of reducing bias in AI by pre-processing datasets to eliminate harmful data, and the largest technology companies have publicized codes of ethics for AI development and applications. After receiving major pushback against their facial recognition tools, Microsoft and Facebook announced that they would no longer sell facial recognition technology over concerns about its misuse.
Private sector self-restraint is, however, not enough. The U.S. government needs a broad initiative on how to regulate algorithmic bias. And it is imperative that the United States work with international allies to build norms for developing responsible AI.
There are already some mechanisms for cooperation: the National AI Research Resource Task Force aims to connect experts from government, academia, and the private sector to consider issues related to AI governance and research. The European AI Alliance is an existing forum for countries to discuss the ethical concerns with AI outlined by the European AI Strategy. The Organization for Economic Cooperation and Development has likewise launched an initiative designed to encourage the ethical use of AI. The United States has also joined a Group of Seven AI initiative that aims to counter China’s use of AI to curtail civil liberties. By building norms through these forums, the United States can strengthen existing diplomatic relations with European allies and prevent disputes like those surrounding the General Data Protection Regulation from arising in the development of AI.
Furthermore, the United States should bolster partnerships in diverse regions of the world to address cultural trends that can reinforce bias in AI. The majority of AI research and development occurs in Europe, China, and the United States, which means that many future AI applications will reflect the cultural biases of a few large countries. Existing organizations such as the Quadrilateral Security Dialogue (the Quad), which includes Australia, India, Japan, and the United States, should be leveraged to further cooperation on AI development. The Quad already focuses on technological cooperation, but cooperation on AI development among the three members besides the United States is lacking. Fostering multilateral collaboration through the Quad and partnerships like it can lead to joint research opportunities on mitigating algorithmic social bias in regions beyond Europe and China. There are undoubtedly other ways researchers can reduce bias in AI, but building organizations and norms designed to decrease bias and encouraging AI research in more countries would go a long way toward eliminating some of the problems the field faces.
Pragya Jain is the intern for the Digital and Cyberspace Program at the Council on Foreign Relations.