Bias & Fairness
Enter your information to begin Module 3. Your name personalizes your experience and appears on your completion certificate.
When people hear "AI bias," they picture a racist programmer deliberately coding discrimination. That almost never happens. The reality is worse: bias is structural. It enters AI systems through the data they're trained on, the choices made during design, and the contexts in which they're deployed.
Individual bias: a single person's prejudice, conscious or unconscious. It affects one decision at a time and can be identified and corrected through training and oversight.
Structural bias: embedded in data, institutions, and systems. It operates at scale — thousands or millions of decisions per second — and is often invisible to the people running the system.
A human loan officer who's biased against Black applicants might deny 10 loans unfairly in a year. An AI lending model trained on 20 years of historically discriminatory lending data denies thousands — and nobody at the bank even knows it's happening because the system looks "objective."
AI doesn't create bias. It inherits it from us, then amplifies it to a scale we've never seen before. That's what makes AI bias fundamentally different from human bias — it's not one person's prejudice, it's centuries of systemic inequality compressed into a mathematical model and applied millions of times per day.
Bias can enter at every single step. Tap each stage to see how.
Data collection: If your training data over-represents one group and under-represents another, the model will perform better for the majority group. Facial recognition trained primarily on light-skinned faces will fail on dark-skinned faces — not because of malice, but because the data was incomplete.
Labeling: Humans label training data — deciding what counts as "positive" or "negative," "relevant" or "irrelevant." Those labels carry the labeler's worldview. In content moderation, what counts as "hate speech" vs. "political expression" depends on who's doing the labeling and their cultural context.
Feature selection: Designers choose which variables the model considers. A credit-scoring model that includes zip code as a feature is effectively including race, because residential segregation means zip codes correlate heavily with racial demographics. The designer may not intend this, but the math doesn't care about intentions.
Optimization: The algorithm optimizes for whatever metric it's given. If you optimize for "accuracy" using biased historical data, the model will learn to be accurately biased. It's doing exactly what it was told to do — which is the problem.
Evaluation: Models are typically evaluated on overall accuracy. A system that's 95% accurate on average might be 99% accurate for the majority group and 65% accurate for a minority group. If you only look at the average, you'll never see the disparity.
Deployment: Once deployed, biased outputs become new training data. Predictive policing sends officers to neighborhoods with more historical arrests → more arrests happen there → the data "confirms" the prediction → even more officers are sent. The bias amplifies itself.
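That feedback loop can be sketched as a toy simulation (all numbers hypothetical): both neighborhoods have identical true incident rates, but the model sends extra patrols wherever the historical arrest count is higher, and patrols finding incidents feed that count.

```python
# Toy model of a predictive-policing feedback loop (hypothetical numbers).
# Both neighborhoods have the SAME true incident rate; only patrol
# allocation differs. Incidents found = officers present * true rate.

true_rate = 0.10                    # identical in A and B
arrests = {"A": 60, "B": 40}        # biased historical record to start

for year in range(10):
    hotspot = max(arrests, key=arrests.get)      # the model's "prediction"
    for hood in arrests:
        patrol = 70 if hood == hotspot else 30   # hotspot gets more patrols
        arrests[hood] += patrol * true_rate      # more looking, more finding

print(arrests)
```

Even though the true rates are equal, neighborhood A's recorded numbers pull further ahead every year: the prediction manufactures its own confirmation.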
In 2018, MIT researcher Joy Buolamwini published the Gender Shades study — the most important empirical work on AI bias to date. She tested commercial facial recognition systems from Microsoft, IBM, and Face++ on a benchmark dataset balanced across gender and skin tone.
The system worked almost perfectly for the group most represented in training data (light-skinned men). It failed at 43 times the rate for the group least represented (dark-skinned women). The overall accuracy was high enough that nobody noticed the disparity — until someone specifically looked for it.
The worst errors didn't happen at "gender" or "race" alone — they happened at the intersection of both. A system that's 93% accurate for women and 95% accurate for dark-skinned people might still be only 65% accurate for dark-skinned women. You have to test at the intersection to find the problem.
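The intersection effect is easy to reproduce with made-up evaluation counts (illustrative numbers, not the actual Gender Shades data):

```python
# Why you must evaluate at the intersection. Counts are hypothetical:
# (gender, skin_tone) -> (correct predictions, total examples)
results = {
    ("male",   "light"): (990, 1000),
    ("male",   "dark"):  (930, 1000),
    ("female", "light"): (940, 1000),
    ("female", "dark"):  (650, 1000),
}

def accuracy(keys):
    correct = sum(results[k][0] for k in keys)
    total = sum(results[k][1] for k in keys)
    return correct / total

women = [k for k in results if k[0] == "female"]
dark = [k for k in results if k[1] == "dark"]
print(f"overall:            {accuracy(results):.1%}")
print(f"women:              {accuracy(women):.1%}")
print(f"dark-skinned:       {accuracy(dark):.1%}")
print(f"dark-skinned women: {accuracy([('female', 'dark')]):.1%}")
```

The overall number looks healthy, each single-attribute slice looks tolerable, and only the intersectional slice exposes the 65% failure mode.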
After the study was published, Microsoft and IBM improved their systems significantly. The study demonstrated that measurement drives accountability — you can't fix what you don't test for.
You'd think "fairness" would be simple: treat everyone equally. It's not. Computer scientists have identified multiple mathematical definitions of fairness — and proven they can't all be satisfied at the same time.
In 2016, researchers proved mathematically that when base rates differ between groups (which they almost always do, due to historical inequity), no classifier can simultaneously be well calibrated AND have equal false positive and false negative rates across groups. You must choose which definition of fairness matters most — and that choice is a human values decision, not a technical one.
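A quick numerical sketch (all numbers hypothetical) shows the tension: fix the same calibration in both groups, and different base rates force different false positive rates.

```python
# Hypothetical sketch of the impossibility result: a classifier calibrated
# IDENTICALLY in two groups, applied where base rates differ, cannot also
# have equal false positive rates.

ppv = 0.6      # P(truly positive | flagged),     same for both groups
forate = 0.1   # P(truly positive | not flagged), same for both groups

fprs = {}
for group, base_rate in [("A", 0.4), ("B", 0.2)]:
    # Calibration pins down the flag rate: base = ppv*f + forate*(1 - f)
    f = (base_rate - forate) / (ppv - forate)
    # False positive rate = false positives / all true negatives
    fprs[group] = f * (1 - ppv) / (1 - base_rate)
    print(f"group {group}: flag rate {f:.0%}, false positive rate {fprs[group]:.0%}")
```

Group A ends up with a 40% false positive rate and group B with 10%, even though a "flagged" score means exactly the same thing in both groups; forcing the false positive rates to match would instead break calibration.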
Each scenario contains a real-world AI bias pattern. Identify where the bias enters and what type it is. Tap to match each bias type to the correct scenario.
The most dangerous type of AI bias isn't a one-time error. It's a cycle that gets worse over time.
The AI doesn't predict where crime happens. It predicts where police will find crime — because they're already looking there. The model is accurate in a narrow sense (officers DO make more arrests where they're sent) while being fundamentally misleading about actual crime distribution.
YouTube's own research found that 70% of watch time is driven by recommendation algorithms. The system doesn't radicalize people intentionally — it optimizes for engagement, and outrage is engaging. The feedback loop does the rest.
Feedback loops can only be broken by external intervention — human oversight, mandatory auditing, or regulatory limits on self-reinforcing systems. The system itself has no mechanism to recognize or correct the cycle because, from its perspective, it's performing exactly as optimized.
If AI bias is invisible to the people running the system, how do you find it? The answer is algorithmic auditing — systematically testing AI systems for disparate impact across demographic groups.
The company tests its own system. Advantage: access to full model and data. Disadvantage: conflicts of interest — finding bias means admitting your product is flawed.
Independent researchers test the system from outside. Advantage: no conflict of interest. Disadvantage: often can only test inputs/outputs, not examine the model itself.
Joy Buolamwini's Gender Shades study was an external audit. She couldn't see inside Microsoft's or IBM's models — she could only send in faces and measure what came out. That was enough to expose massive disparities.
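A black-box audit of that kind can be sketched in a few lines. `classify` below is a hypothetical stand-in for a commercial API (here, a deliberately biased toy model), since an external auditor sees only inputs and outputs.

```python
# Sketch of an external (black-box) audit: send a balanced benchmark in,
# measure error rates per group coming out. `classify` is a hypothetical
# stand-in for a vendor API; here it is a deliberately biased toy.
import random

random.seed(0)

def classify(face):
    # Toy stand-in: errs far more often on one subgroup.
    error_rate = 0.25 if face["group"] == "B" else 0.02
    correct = random.random() > error_rate
    return face["label"] if correct else "wrong"

# A benchmark balanced across groups, like the Gender Shades dataset.
benchmark = [{"group": g, "label": "x"} for g in "AB" for _ in range(500)]

errors = {"A": 0, "B": 0}
for face in benchmark:
    if classify(face) != face["label"]:
        errors[face["group"]] += 1

for g in errors:
    print(f"group {g}: error rate {errors[g] / 500:.1%}")
```

No access to the model's internals was needed: a balanced probe set and per-group error counts are enough to expose the disparity.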
One graduate student's audit of three commercial systems triggered corporate policy changes, an industry exit, and legislative action across two continents. Auditing works — not because it fixes systems automatically, but because it makes bias visible to the people who have the power to act.
You're on the ethics committee at a large hospital system. A vendor pitches an AI tool that predicts which patients are most likely to develop serious health complications, allowing doctors to intervene early. The vendor shows impressive accuracy numbers: 92% overall accuracy in clinical trials.
What's your first question?
Cost matters, but it's not the first question. After this module, your first instinct should be to ask about group-level performance. Remember Gender Shades: 92% overall can mask 65% for specific populations.
This is the Gender Shades lesson in action. Overall accuracy tells you nothing about whether the system works equally well for Black patients, elderly patients, women, or low-income communities. A 2019 study found that a widely-used healthcare algorithm systematically underestimated how sick Black patients were — affecting an estimated 70 million patients.
Adoption rate doesn't tell you about fairness. Many widely-used systems have been found to be biased only after independent auditing. The right question is about performance across demographic groups.
Ask questions about AI bias, fairness definitions, algorithmic auditing, or anything from this module.
Seven concepts you need to carry forward.
AI bias isn't individual prejudice — it's historical inequity encoded in data, amplified by algorithms, and deployed at scale.
Bias enters at data collection, labeling, feature selection, training, evaluation, and deployment. Every step is an opportunity for bias and an opportunity for intervention.
0.8% error for light-skinned men vs. 34.7% for dark-skinned women. Overall accuracy masks group-level disparities. Always test at the intersection.
You can't satisfy all fairness definitions simultaneously. Choosing which definition matters most is a human values decision, not a technical one.
Biased outputs become biased training data. Predictive policing doesn't predict crime — it predicts where police will find crime because they're already looking there.
Removing protected attributes doesn't remove bias. Zip codes proxy for race. Employment gaps proxy for gender. The bias hides in correlated features.
Systematic testing for disparate impact. One MIT study triggered corporate changes, an industry exit, and legislation on two continents. Measurement drives accountability.
5 questions drawn from the module. You need 80% to pass.