Teaching Morality to Machines
Jane Zavalishina is the CEO of Yandex Data Factory. Vyacheslav Polonski is a PhD student at the University of Oxford and the CEO of Avantgarde Analytics.
For years, experts have warned against the unanticipated effects of general artificial intelligence (AI) on society. Ray Kurzweil predicts that by 2029 intelligent machines will be able to outsmart human beings. Stephen Hawking argues that “once humans develop full AI, it will take off on its own and redesign itself at an ever-increasing rate.” Elon Musk warns that AI may constitute a “fundamental risk to the existence of human civilization.”
More often than not, these dystopian prophecies have been met with calls for a more ethical implementation of AI systems, on the idea that engineers should somehow imbue autonomous systems with a sense of ethics. According to some AI experts, our future robot overlords can be taught right from wrong, akin to a good Samaritan that will always act justly on its own and help humans in distress.
Although this future is still decades away, today’s narrow AI applications need to be imbued with better moral decision-making. This is particularly necessary when algorithms make decisions about who gets access to loans, who gets promoted at work, or when self-driving cars have to calculate the value of a human life in hazardous situations.
Teaching morality to machines is hard because humans can’t objectively convey morality in a way that makes it easy for a computer to process. In moral dilemmas, humans tend to rely on gut feeling instead of cost-benefit calculations. Machines, on the other hand, need objective metrics that can be clearly measured and optimized. For example, an AI machine can excel in a game with clear rules and boundaries. Alphabet’s DeepMind was able to beat the best human player of Go. Meanwhile, OpenAI amassed “lifetimes” of experiences to beat the best human players at this year’s Dota 2 tournament, a video game competition.
But in real-life situations, optimization problems are vastly more complex. For example, how can a machine be taught to algorithmically maximise fairness or to overcome racial and gender biases in its training data? A machine cannot be taught what is fair unless the human teaching it has a precise conception of what fairness is.
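To make the difficulty concrete, here is a minimal sketch of one possible way to turn "fairness" into a number: the gap in approval rates between groups, often called demographic parity. This is only one of several competing definitions, and choosing it is itself a moral judgment. The function names, group labels, and sample data are all hypothetical.

```python
from collections import defaultdict

def approval_rates(decisions):
    """Return the share of approved applications per group.

    `decisions` is an iterable of (group, approved) pairs.
    The group labels are hypothetical placeholders.
    """
    totals = defaultdict(int)
    approved = defaultdict(int)
    for group, ok in decisions:
        totals[group] += 1
        if ok:
            approved[group] += 1
    return {g: approved[g] / totals[g] for g in totals}

def demographic_parity_gap(decisions):
    """Difference between the highest and lowest group approval rate.

    0.0 means every group is approved at the same rate; larger values
    mean a larger disparity under this particular definition of fairness.
    """
    rates = approval_rates(decisions)
    return max(rates.values()) - min(rates.values())

# Invented loan decisions for two hypothetical groups.
sample = [
    ("A", True), ("A", True), ("A", False), ("A", True),
    ("B", True), ("B", False), ("B", False), ("B", False),
]
rates = approval_rates(sample)
gap = demographic_parity_gap(sample)
print(rates)  # {'A': 0.75, 'B': 0.25}
print(gap)    # 0.5
```

A system optimized to shrink this gap would satisfy one conception of fairness while possibly violating others, which is why the human definition has to come first.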
This has led some authors to worry that a naïve application of algorithms to everyday problems could amplify structural discrimination and reproduce biases in the data on which they are based. In the worst-case scenarios, algorithms could deny services to minorities, impede employment opportunities, or get the wrong political candidate elected.
Based on our experiences in machine learning, there are three ways to make machines more ethical.
First, AI researchers and ethicists need to formulate ethical values as quantifiable parameters. In other words, they need to provide machines with explicit answers and decision rules for any potential ethical dilemmas they might encounter. This would require that humans agree amongst themselves on the most ethical course of action in any given situation, a challenging but not impossible task. For example, Germany’s Ethics Commission on Automated and Connected Driving has recommended that automated cars be programmed to prioritize the protection of human life above all else. In the event of an unavoidable accident, the car should be “prohibited to offset victims against one another.” In other words, a car shouldn’t be able to choose whether to kill one person or many in an unavoidable crash situation.
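As a rough illustration of what an explicit decision rule might look like, the sketch below ranks candidate maneuvers lexicographically: risk to humans is compared first, and property damage only breaks ties. The `Maneuver` fields and the numbers are hypothetical, and the sketch does not attempt to encode the commission's rule against offsetting victims; it only shows how a stated priority can become a comparable quantity.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Maneuver:
    name: str
    human_injury_risk: float  # hypothetical estimate in [0, 1]
    property_damage: float    # hypothetical cost estimate

def choose_maneuver(options):
    """Pick the option with the lowest human risk.

    Property damage is only consulted when two options carry
    exactly the same human risk (lexicographic ordering).
    """
    if not options:
        raise ValueError("at least one maneuver is required")
    return min(options, key=lambda m: (m.human_injury_risk, m.property_damage))

# Invented options for a single hazardous situation.
options = [
    Maneuver("continue", human_injury_risk=0.4, property_damage=0.0),
    Maneuver("brake", human_injury_risk=0.1, property_damage=5000.0),
    Maneuver("swerve", human_injury_risk=0.1, property_damage=200.0),
]
best = choose_maneuver(options)
print(best.name)  # swerve
```

The ordering of the tuple in `key` is where the ethical choice lives: swapping the two fields would produce a car that protects property before people.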
Second, researchers need to collect enough data on explicit ethical measures to appropriately train AI algorithms. Getting appropriate data is challenging, because ethical norms cannot always be clearly standardised. Different situations require different ethical approaches, and some ethical dilemmas may not have a single best course of action. One way of solving this would be to crowdsource potential solutions to moral dilemmas from millions of humans. For instance, MIT’s Moral Machine project shows how crowdsourced data can be used to train machines to make moral decisions in the context of self-driving cars.
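A minimal sketch of how crowdsourced answers might be turned into a training label follows. It takes a simple majority vote and refuses to assign a label when agreement is too low, reflecting the point that some dilemmas have no single best answer. This is not how the Moral Machine project processes its data; the function name, the 60 percent threshold, and the answer labels are assumptions made for illustration.

```python
from collections import Counter

def aggregate_votes(votes, min_agreement=0.6):
    """Return (label, share) for the most common answer.

    If the most common answer is chosen by less than `min_agreement`
    of respondents, the label is None: the dilemma is treated as
    contested rather than forced into a single "correct" answer.
    """
    if not votes:
        raise ValueError("at least one vote is required")
    counts = Counter(votes)
    choice, n = counts.most_common(1)[0]
    share = n / len(votes)
    if share < min_agreement:
        return None, share
    return choice, share

# Invented responses to two hypothetical dilemmas.
clear_case = ["stay_in_lane"] * 7 + ["swerve"] * 3
contested_case = ["stay_in_lane"] * 5 + ["swerve"] * 5

print(aggregate_votes(clear_case))      # ('stay_in_lane', 0.7)
print(aggregate_votes(contested_case))  # (None, 0.5)
```

Where the threshold is set decides how much disagreement the resulting dataset hides, so it is one of the choices that would need to be documented.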
Third, policymakers need to implement guidelines that make AI decisions with respect to ethics more transparent. Simply blaming an algorithm for making a mistake is not acceptable, but demanding full algorithmic transparency may also be technically untenable, because neural networks are simply too complex to be scrutinised by human inspectors. Instead, there should be more transparency on how engineers quantified ethical values before programming them, as well as the outcomes that the AI has produced as a result of these choices. For self-driving cars, for instance, this could imply that detailed logs of all automated decisions are kept at all times to ensure their ethical accountability.
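One way such a log could look is sketched below: each automated decision is stored with its inputs, the rule that was applied, and the resulting action, then exported as one JSON object per line for later review. The class names and record fields are hypothetical and are not drawn from any existing standard or vehicle system.

```python
import json
from dataclasses import dataclass, asdict

@dataclass(frozen=True)
class DecisionRecord:
    timestamp: float   # seconds since some reference point
    inputs: dict       # what the system observed
    rule_applied: str  # which explicit ethical rule fired
    action: str        # what the system did as a result

class DecisionLog:
    """Append-only, in-memory log of automated decisions."""

    def __init__(self):
        self._records = []

    def record(self, timestamp, inputs, rule_applied, action):
        rec = DecisionRecord(timestamp, dict(inputs), rule_applied, action)
        self._records.append(rec)
        return rec

    def to_jsonl(self):
        """Serialize every record as one JSON object per line."""
        return "\n".join(
            json.dumps(asdict(r), sort_keys=True) for r in self._records
        )

# Invented example entry.
log = DecisionLog()
log.record(
    timestamp=12.5,
    inputs={"obstacle": "pedestrian", "speed_kmh": 40},
    rule_applied="protect_human_life",
    action="emergency_brake",
)
exported = log.to_jsonl()
print(exported)
```

Recording the rule alongside the action is the design choice that matters here: it lets a reviewer trace an outcome back to the value judgment that engineers encoded, without having to inspect the model itself.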
Failing to build ethics into an AI system leaves the algorithm to decide what’s best for us. For example, in an unavoidable accident situation, the self-driving car needs to make a decision. But if the car’s designers fail to specify a set of ethical values that could act as decision guides, the AI system may come up with a solution that causes more harm.
Machines cannot be assumed to be inherently capable of behaving morally. Humans must teach them what morality is and how it can be measured and optimized. For AI engineers, this may seem like a daunting task. After all, defining moral values is a challenge mankind has struggled with throughout its history. Nevertheless, the state of AI research requires engineers and ethicists to define morality and quantify it in explicit terms. Engineers cannot build a “Good Samaritan AI” as long as they lack a formula for the Good Samaritan human.