Tackling training bias in machine learning - Embedded.com

Teaching anyone about “fairness” is a laudable goal.

As humans, we may not necessarily agree on what’s fair. It sometimes depends on the context. Teaching kids to be fair — both at home and in school — is fundamental, but it’s easier said than done. With this in mind, how can we, as a society, communicate the nuances of “being fair” to artificial intelligence (AI) systems?

A team of researchers at IBM Research is taking the first crack at this conundrum. IBM is rolling out a toolkit for developers called “AI Fairness 360.” As part of this effort, IBM is offering businesses a new “cloud-based, bias-detection, and mitigation service” that corporations can use to test and verify how AI-driven systems are behaving.

In a phone interview with EE Times, Saska Mojsilovic, a Fellow at IBM Research, told us that scientists and AI practitioners have been far too focused on the accuracy of AI. Typically, the first question that people ask about AI is, “Can machines beat humans?”

But what about fairness? The fairness void in AI has the potential to induce catastrophic consequences in, for example, health care or autonomous vehicles, she said.

What if a dataset used to train a machine is biased? If AI can’t explain how it came to a decision, how could we verify its “rightness”? Can AI reveal whether data has been manipulated during AI processing? Could AI assure us that its data has never been attacked or compromised, including during pre- and post-processing?

In short, is there any such thing as introspective AI? The simple answer: No.

Unless they are transparent to AI users, developers, and practitioners, AI systems cannot gain society’s trust, said Mojsilovic.

Decomposing fairness
A bigger question is how to teach the machine what fairness is. Mojsilovic noted, “Because we are scientists, the first thing we did was to decompose ‘fairness.’ We needed to get our hands around it.” They broke down fairness in terms of metrics, algorithms, and bias practiced in AI implementation.

Kush Varshney, a research scientist at IBM, explained that the team looked at bias and fairness in AI algorithms and AI decision-making. “There is fairness to individuals and there is fairness to groups. We looked at different attributes of groups, ranging from gender to race. Legal and regulatory issues are also considered.” In the end, the team arrived at 30 different metrics to look for bias in datasets, AI models, and algorithms.
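
To make "fairness to groups" concrete, here is a minimal Python sketch of two widely used group-fairness metrics: statistical parity difference and disparate impact. The article does not list the 30 metrics the team measured, so treating these two as representative is an assumption. The function names, the toy loan data, and the labels "A" (privileged) and "B" (unprivileged) are all hypothetical.

```python
from collections import defaultdict


def favorable_rates(records, group_key, outcome_key):
    """Return the share of favorable (truthy) outcomes for each group."""
    totals = defaultdict(int)
    favorable = defaultdict(int)
    for row in records:
        group = row[group_key]
        totals[group] += 1
        if row[outcome_key]:
            favorable[group] += 1
    return {g: favorable[g] / totals[g] for g in totals}


def statistical_parity_difference(rates, unprivileged, privileged):
    """Difference in favorable-outcome rates; 0 means parity."""
    return rates[unprivileged] - rates[privileged]


def disparate_impact(rates, unprivileged, privileged):
    """Ratio of favorable-outcome rates; 1 means parity.

    Assumes the privileged group's rate is nonzero.
    """
    return rates[unprivileged] / rates[privileged]


# Hypothetical toy data: 1 = loan approved, 0 = loan denied.
records = (
    [{"group": "A", "approved": 1}] * 6
    + [{"group": "A", "approved": 0}] * 4
    + [{"group": "B", "approved": 1}] * 3
    + [{"group": "B", "approved": 0}] * 7
)

rates = favorable_rates(records, "group", "approved")
spd = statistical_parity_difference(rates, unprivileged="B", privileged="A")
di = disparate_impact(rates, unprivileged="B", privileged="A")
print(rates, spd, di)
```

In this toy set, group A is approved 60% of the time and group B 30% of the time, so the parity difference is negative and the impact ratio is well below 1. Both are group-level measures; they say nothing about whether similar individuals are treated alike, which is the separate notion of individual fairness that Varshney mentions.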

These findings are incorporated into the AI Fairness 360 toolkit that IBM launched this week. The company described it as “a comprehensive open-source toolkit of metrics to check for unwanted bias in datasets and machine-learning models.”

Mitigating bias throughout the AI lifecycle (Source: IBM)

Although many scientists are already working to spot discrimination in AI algorithms, Mojsilovic said that IBM’s approach differs by including algorithms not just to find bias but also to remove it.
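
As an illustration of what debiasing can look like in practice, the sketch below implements reweighing, a well-known pre-processing technique that assigns each training example a weight so that group membership and outcome become statistically independent in the weighted data. The article does not say which debiasing algorithms IBM ships, so choosing reweighing here is an assumption, and the function names and toy data are hypothetical.

```python
from collections import Counter


def reweighing_weights(groups, labels):
    """Weight per (group, label) pair: P(group) * P(label) / P(group, label)."""
    n = len(groups)
    group_counts = Counter(groups)
    label_counts = Counter(labels)
    pair_counts = Counter(zip(groups, labels))
    return {
        (g, y): (group_counts[g] * label_counts[y]) / (n * pair_counts[(g, y)])
        for (g, y) in pair_counts
    }


def weighted_favorable_rate(groups, labels, weights, group, favorable=1):
    """Favorable-outcome rate for one group, counting each example by its weight."""
    pairs = [(g, y) for g, y in zip(groups, labels) if g == group]
    total = sum(weights[p] for p in pairs)
    good = sum(weights[p] for p in pairs if p[1] == favorable)
    return good / total


# Hypothetical toy data: group A is approved 6 of 10 times, group B 3 of 10.
groups = ["A"] * 10 + ["B"] * 10
labels = [1] * 6 + [0] * 4 + [1] * 3 + [0] * 7

weights = reweighing_weights(groups, labels)
rate_a = weighted_favorable_rate(groups, labels, weights, "A")
rate_b = weighted_favorable_rate(groups, labels, weights, "B")
print(weights, rate_a, rate_b)
```

After reweighing, both groups show the same weighted approval rate, equal to the overall rate of 9 in 20. A model trained with these sample weights sees a dataset in which the outcome no longer correlates with group membership, though that alone does not guarantee the trained model is fair.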

On a basic level, you’ve got to ask: Computer scientists defining fairness? Isn’t this a task normally assigned to social scientists? Aware of this incongruity, IBM made it clear that neither Mojsilovic nor Varshney is working in a vacuum. They brought in a host of scholars and institutes. Varshney participated in the Uehiro-Carnegie-Oxford Ethics Conference sponsored by the Carnegie Council for Ethics in International Affairs. Mojsilovic participated in an AI Workshop in Berkeley, California, sponsored by the UC Berkeley Law School.

Is an algorithm neutral?
Social scientists have been pointing out the issue of AI bias for some time.

Young Mie Kim, a professor in the School of Journalism and Mass Communication at the University of Wisconsin-Madison, explained, “AI discrimination (or AI bias) can happen when it implicitly or explicitly reinforces existing unequal social orders and biases (e.g., gender, race, age, social/economic status, etc.).” Examples range from sampling errors (e.g., under-representation of certain demographics due to inappropriate sampling methods or difficulties in sampling) to human biases in machine training (modeling). Kim argued that AI bias exists even with “strategic decisions” in design or modeling, such as political advertising algorithms.
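
The sampling-error case can be checked before any model is trained. The sketch below compares each group's share of a training sample with its share of a reference population and flags groups that fall short. It is an illustration only: the age brackets, the population shares, and the 0.8 cutoff are hypothetical assumptions, not figures from Kim's study.

```python
from collections import Counter


def representation_ratios(sample_groups, population_shares):
    """Ratio of each group's sample share to its reference population share."""
    n = len(sample_groups)
    counts = Counter(sample_groups)
    return {
        group: (counts.get(group, 0) / n) / share
        for group, share in population_shares.items()
    }


def under_represented(ratios, threshold=0.8):
    """Groups whose ratio falls below the (assumed) threshold, sorted by name."""
    return sorted(g for g, r in ratios.items() if r < threshold)


# Hypothetical sample of 100 records and assumed population shares.
sample = ["18-34"] * 50 + ["35-64"] * 45 + ["65+"] * 5
population = {"18-34": 0.30, "35-64": 0.50, "65+": 0.20}

ratios = representation_ratios(sample, population)
flagged = under_represented(ratios)
print(ratios, flagged)
```

Here the oldest bracket makes up 5% of the sample but 20% of the assumed population, so it is flagged. A check like this catches only the sampling problem; it cannot detect the design-stage and modeling-stage biases Kim describes.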

In her recent study entitled “Algorithmic Opportunity: Digital Advertising and Inequality of Political Involvement,” Kim demonstrated how inequality can be reinforced in algorithm-based decisions.

The technical community might argue that “an algorithm is neutral” or can be “educated” (trained). Kim noted, “That does not acknowledge that biases enter at any stage in algorithm development.”
