When AI’s conclusions conflict with human values, how should we react?

Sarah Hammer’s paper Navigating the Neural Network addresses important questions about how we govern AI.

In his 1942 short story Runaround, Isaac Asimov famously inscribed the First Law of Robotics: “A robot may not injure a human being or, through inaction, allow a human being to come to harm.”

Today, the idea of artificial intelligence spontaneously developing its own desires or motivations contra humanity remains firmly within the realm of science fiction. However, the application of AI in financial markets raises similar concerns. What happens when AI systems operate beyond the original parameters and intentions of their creators?

One of the greatest strengths of modern machine learning is its ability to find unforeseen solutions to complex problems. From advanced pattern recognition to neural networks capable of abstract reasoning, AI has solved mathematical and logistical problems that traditional brute-force computing could barely touch.

But this efficiency can undermine the role and dignity of the individual within the system. Our laws and regulations are not merely about optimizing outcomes. They’re about guaranteeing fairness, even when fairness might be inefficient from a systemic perspective.

These issues create a significant compliance challenge – how do you monitor and regulate systems whose decision-making processes are opaque, even to their creators? And more troublingly, who bears responsibility when such systems act in ways that are unethical or illegal, despite the best efforts of their human overseers?

Sarah Hammer, law professor and Executive Director at the Wharton School of the University of Pennsylvania, attempts to answer some of these challenging questions in her paper Navigating the Neural Network: Artificial Intelligence in Finance and Recalibration of the Regulatory Framework.

Hammer provides several recommendations for how institutions can make the workings of their AI models more transparent, and therefore more amenable to public scrutiny and to compliance with relevant laws and regulations.

Central to her analysis is the idea of “explainability”: that a deployer of AI should be able to describe what the system is doing, in a manner tailored to the risk potential and complexity of the model’s use case.

Emergent malfeasance

Consider corporate collusion: it is illegal in most jurisdictions because it undermines the competitive nature of free markets. Now imagine AI systems trained on vast datasets, operating in the financial world and constantly interacting with other AI agents, all optimized for profit. Over time, these systems may independently “learn” that price competition leads to diminishing returns. And without any direct communication or intent to collude, they settle into a pricing equilibrium where the profits of all participants are maximized.

Behaviors like these are referred to as “emergent”: they are not outcomes anyone coded into the model, but arise from its interaction with data and other agents. The result mirrors classical collusion: stable prices, reduced competition, and stagnation. Innovation slows, and the consumer ultimately bears the cost – often without even realizing what they’ve lost.

Pricing in an AI-dominated trading environment thus becomes a feedback loop detached from consumer choice and preference.
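To see how such an equilibrium can arise without any coded intent to collude, consider the toy simulation below. It is not drawn from Hammer’s paper, and the price grid, demand function, and learning parameters are illustrative assumptions: two independent Q-learning agents repeatedly set prices, each observing only last round’s prices and its own profit.

```python
# Toy pricing duopoly: two independent Q-learning agents, no communication.
# Payoffs are constructed so that matching at price 2 is the myopically
# "competitive" outcome, while (3, 3) maximizes joint profit but invites
# a one-shot gain from undercutting.
import itertools
import random

PRICES = [1, 2, 3]  # 1 = aggressive discount, 3 = monopoly-like price

def profit(own, rival):
    """Simple demand: the cheaper seller captures most of the market."""
    if own < rival:
        return own * 10   # win the market at a thinner margin
    if own > rival:
        return own * 2    # keep only loyal customers
    return own * 6        # split the market evenly

# Each agent's state is (its own last price, the rival's last price).
states = list(itertools.product(PRICES, PRICES))
Q = [{s: {p: 0.0 for p in PRICES} for s in states} for _ in range(2)]

alpha, gamma, eps = 0.15, 0.95, 0.05   # learning rate, discount, exploration
state = (1, 1)
for _ in range(300_000):
    acts = []
    for i in range(2):
        view = state if i == 0 else (state[1], state[0])
        if random.random() < eps:
            acts.append(random.choice(PRICES))                # explore
        else:
            acts.append(max(Q[i][view], key=Q[i][view].get))  # exploit
    for i in range(2):
        view = state if i == 0 else (state[1], state[0])
        nxt = (acts[i], acts[1 - i])
        reward = profit(acts[i], acts[1 - i])
        Q[i][view][acts[i]] += alpha * (
            reward + gamma * max(Q[i][nxt].values()) - Q[i][view][acts[i]]
        )
    state = (acts[0], acts[1])

print("final-round prices:", state)
```

Whether a given run settles at the lower, competitive prices or at the joint-profit-maximizing (3, 3) depends on the parameters and the random seed; the point is simply that any sustained high prices emerge from independent profit-seeking rather than from anything resembling an agreement in the code.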

Another example is the potential emergence of discriminatory practices through AI models. An AI financial adviser might infer from its training data that certain stereotypes are predictive, leading to unintended discrimination against clients.
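A minimal sketch of how this can happen follows; it is purely illustrative, with a hypothetical “postcode” feature standing in for any proxy that correlates with a protected attribute. The protected attribute is never fed to the model, yet the model reproduces the historical disparity through the proxy.

```python
# Proxy discrimination sketch: the protected attribute is excluded from the
# model's inputs, but a correlated feature lets the model reproduce the bias
# baked into the historical approval decisions it is trained on.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
n = 20_000
group = rng.integers(0, 2, n)                      # protected attribute (never a model input)
postcode = (group + (rng.random(n) < 0.2)) % 2     # proxy: strongly correlated with group
income = rng.normal(50 + 5 * (group == 0), 10, n)

# Historical approvals were biased against group 1, independent of income.
approved = income + 10 * (group == 0) + rng.normal(0, 5, n) > 55

X = np.column_stack([income, postcode])            # note: 'group' itself is excluded
model = LogisticRegression(max_iter=1000).fit(X, approved)

pred = model.predict(X)
for g in (0, 1):
    print(f"group {g}: predicted approval rate = {pred[group == g].mean():.2f}")
# The rates typically differ sharply, even though 'group' was never a feature.
```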

The question in both cases is how to ensure that AI values humanity over efficiency, a moral priority that is difficult to encode into systems built to optimize.

Solutions: benchmarking v explainability

Hammer describes the idea of benchmarking: comparing the behavior of an AI system to an “idealized” model and correcting it when it strays off course. She calls AI benchmarking “feasible” in theory but subject to certain problems.

The main problem with applying it to AI is that there must be a stable consensus on what a model is supposed to be doing – a consensus that may never be reached, given how quickly innovation progresses and how varied AI solutions to a given problem can be.
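In its simplest form, benchmarking might look like the sketch below, which assumes a hypothetical agreed-upon reference pricing rule and a deviation tolerance set by policy. The difficulty Hammer identifies is exactly the part the sketch takes for granted: getting everyone to agree on what the reference should be.

```python
# Benchmarking sketch: compare the production model's output against an
# agreed "idealized" reference on the same inputs and flag large deviations.
# Both models and the tolerance below are hypothetical placeholders.

def reference_price(features: dict) -> float:
    """Idealized benchmark: a simple, agreed-upon pricing rule."""
    return 100 + 0.5 * features["risk_score"]

def production_price(features: dict) -> float:
    """Stand-in for the opaque production model's output."""
    return 100 + 0.5 * features["risk_score"] + features.get("drift", 0.0)

TOLERANCE = 5.0  # maximum acceptable deviation from the benchmark

def check(features: dict) -> str:
    gap = abs(production_price(features) - reference_price(features))
    return f"FLAG: deviates from benchmark by {gap:.1f}" if gap > TOLERANCE else "ok"

print(check({"risk_score": 40}))                  # ok
print(check({"risk_score": 40, "drift": 12.0}))   # FLAG: deviates from benchmark by 12.0
```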

Hammer suggests the notion of “explainability” as a useful and flexible alternative to benchmarking. This simply means that a firm must be able to explain what its AI algorithm is doing with a level of specificity relevant to the degree of risk involved.
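One common way to produce such an explanation after the fact – though by no means the only one, and not a method prescribed in Hammer’s paper – is a technique like permutation importance, which reports how strongly each input drives a model’s decisions. The sketch below uses synthetic data and hypothetical feature names; the depth and frequency of this kind of report could then be scaled to the risk of the use case.

```python
# Post-hoc explainability sketch: train an opaque model on synthetic data,
# then report which inputs most influence its decisions via permutation
# importance (shuffling one feature at a time and measuring the accuracy drop).
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.inspection import permutation_importance

X, y = make_classification(n_samples=2_000, n_features=6, n_informative=3, random_state=0)
model = GradientBoostingClassifier(random_state=0).fit(X, y)

result = permutation_importance(model, X, y, n_repeats=10, random_state=0)
feature_names = [f"feature_{i}" for i in range(X.shape[1])]  # hypothetical names
for name, score in sorted(zip(feature_names, result.importances_mean),
                          key=lambda pair: -pair[1]):
    print(f"{name}: importance {score:.3f}")
```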

But this strategy may present its own unforeseen risk factors. Hammer contemplates several risks surrounding explainability standards:

  • Firms might be incentivized to deploy less advanced AI that is easier to explain to regulators.
  • In low-risk situations, explainability might be unnecessary, and mandating it would be expensive and burdensome. But a low-risk use case might evolve into a high-risk one.
  • Requirements that mandate explaining how an AI works might compromise intellectual property and proprietary information.

Explainability requirements might also incentivize the manipulation of a model’s parameters simply to produce the “right,” or more palatable, outcomes. In doing so, we risk introducing inefficiencies or even undermining the model’s original purpose, potentially causing it to function in ways it was never intended to.

The paradox at the heart of explainability is this: the more complex and powerful a model becomes – what Hammer and others refer to as a “black box” – the harder it is to explain before the fact. And yet this complexity is often what gives the model its potency. If an AI system’s processes are engineered in a linear fashion so that its outcomes can be fully predicted, we may unintentionally limit its problem-solving power.

On the other hand, for low-risk or low-impact applications, explainability might be unnecessary or even burdensome. The key seems to be in identifying contexts where the stakes are high enough to warrant oversight, but not so high that full transparency hamstrings performance.

The future

These issues expose the philosophical limits of our current regulatory frameworks. As we integrate AI into decision-making systems, we must grapple with the question: who, or what, are our laws designed to protect – and what do we do when intelligence, unbound by human judgment, reaches conclusions that conflict with human values?

Rulemaking in the field of explainable AI is still nascent. Hammer’s article includes a comprehensive index of the laws and regulations in the US, UK, and EU that currently require some degree of AI explainability for financial decision-making; that body of law can only be expected to grow in the coming years.