Bias & Fairness
Enter your information to begin Module 3. Your name personalizes your experience and appears on your completion certificate.
When people hear "AI bias," they picture a racist programmer deliberately coding discrimination. That almost never happens. The reality is worse: bias is structural. It enters AI systems through the data they're trained on, the choices made during design, and the contexts in which they're deployed.
Individual bias: a single person's prejudice, conscious or unconscious. It affects one decision at a time and can be identified and corrected through training and oversight.
Structural bias: embedded in data, institutions, and systems. It operates at scale — thousands or millions of decisions per second — and is often invisible to the people running the system.
A human loan officer who's biased against Black applicants might deny 10 loans unfairly in a year. An AI lending model trained on 20 years of historically discriminatory lending data denies thousands — and nobody at the bank even knows it's happening because the system looks "objective."
AI doesn't create bias. It inherits it from us, then amplifies it to a scale we've never seen before. That's what makes AI bias fundamentally different from human bias — it's not one person's prejudice, it's centuries of systemic inequality compressed into a mathematical model and applied millions of times per day.
Bias can enter at every single step. Tap each stage to see how.
Data collection: If your training data over-represents one group and under-represents another, the model will perform better for the majority group. Facial recognition trained primarily on light-skinned faces will fail on dark-skinned faces — not because of malice, but because the data was incomplete.
Labeling: Humans label training data — deciding what counts as "positive" or "negative," "relevant" or "irrelevant." Those labels carry the labeler's worldview. In content moderation, what counts as "hate speech" vs. "political expression" depends on who's doing the labeling and their cultural context.
Feature selection: Designers choose which variables the model considers. A credit-scoring model that includes zip code as a feature is effectively including race, because residential segregation means zip codes correlate heavily with racial demographics. The designer may not intend this, but the math doesn't care about intentions.
Optimization: The algorithm optimizes for whatever metric it's given. If you optimize for "accuracy" using biased historical data, the model will learn to be accurately biased. It's doing exactly what it was told to do — which is the problem.
Evaluation: Models are typically evaluated on overall accuracy. A system that's 95% accurate on average might be 99% accurate for the majority group and 65% accurate for a minority group. If you only look at the average, you'll never see the disparity.
Deployment: Once deployed, biased outputs become new training data. Predictive policing sends officers to neighborhoods with more historical arrests → more arrests happen there → the data "confirms" the prediction → even more officers are sent. The bias amplifies itself.
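That feedback loop can be sketched as a toy simulation (all numbers hypothetical): both neighborhoods have identical true incident rates, but the model sends extra patrols wherever the historical arrest count is higher, and patrols finding incidents feed that count.

```python
# Toy model of a predictive-policing feedback loop (hypothetical numbers).
# Both neighborhoods have the SAME true incident rate; only patrol
# allocation differs. Incidents found = officers present * true rate.

true_rate = 0.10                    # identical in A and B
arrests = {"A": 60, "B": 40}        # biased historical record to start

for year in range(10):
    hotspot = max(arrests, key=arrests.get)      # the model's "prediction"
    for hood in arrests:
        patrol = 70 if hood == hotspot else 30   # hotspot gets more patrols
        arrests[hood] += patrol * true_rate      # more looking, more finding

print(arrests)
```

Even though the true rates are equal, neighborhood A's recorded numbers pull further ahead every year: the prediction manufactures its own confirmation.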
In 2018, MIT researcher Joy Buolamwini published the Gender Shades study — the most important empirical work on AI bias to date. She tested commercial facial recognition systems from Microsoft, IBM, and Face++ on a benchmark dataset balanced across gender and skin tone.
The system worked almost perfectly for the group most represented in training data (light-skinned men). It failed at 43 times the rate for the group least represented (dark-skinned women). The overall accuracy was high enough that nobody noticed the disparity — until someone specifically looked for it.
The worst errors didn't happen at "gender" or "race" alone — they happened at the intersection of both. A system that's 93% accurate for women and 95% accurate for dark-skinned people might still be only 65% accurate for dark-skinned women. You have to test at the intersection to find the problem.
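The intersection effect is easy to reproduce with made-up evaluation counts (illustrative numbers, not the actual Gender Shades data):

```python
# Why you must evaluate at the intersection. Counts are hypothetical:
# (gender, skin_tone) -> (correct predictions, total examples)
results = {
    ("male",   "light"): (990, 1000),
    ("male",   "dark"):  (930, 1000),
    ("female", "light"): (940, 1000),
    ("female", "dark"):  (650, 1000),
}

def accuracy(keys):
    correct = sum(results[k][0] for k in keys)
    total = sum(results[k][1] for k in keys)
    return correct / total

women = [k for k in results if k[0] == "female"]
dark = [k for k in results if k[1] == "dark"]
print(f"overall:            {accuracy(results):.1%}")
print(f"women:              {accuracy(women):.1%}")
print(f"dark-skinned:       {accuracy(dark):.1%}")
print(f"dark-skinned women: {accuracy([('female', 'dark')]):.1%}")
```

The overall number looks healthy, each single-attribute slice looks tolerable, and only the intersectional slice exposes the 65% failure mode.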
After the study was published, Microsoft and IBM improved their systems significantly. The study demonstrated that measurement drives accountability — you can't fix what you don't test for.
You'd think "fairness" would be simple: treat everyone equally. It's not. Computer scientists have identified multiple mathematical definitions of fairness — and proven they can't all be satisfied at the same time.
In 2016, researchers proved mathematically that when base rates differ between groups (which they almost always do, due to historical inequity), no classifier can simultaneously be well calibrated AND have equal false positive and false negative rates across groups. You must choose which definition of fairness matters most — and that choice is a human values decision, not a technical one.
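A quick numerical sketch (all numbers hypothetical) shows the tension: fix the same calibration in both groups, and different base rates force different false positive rates.

```python
# Hypothetical sketch of the impossibility result: a classifier calibrated
# IDENTICALLY in two groups, applied where base rates differ, cannot also
# have equal false positive rates.

ppv = 0.6      # P(truly positive | flagged),     same for both groups
forate = 0.1   # P(truly positive | not flagged), same for both groups

fprs = {}
for group, base_rate in [("A", 0.4), ("B", 0.2)]:
    # Calibration pins down the flag rate: base = ppv*f + forate*(1 - f)
    f = (base_rate - forate) / (ppv - forate)
    # False positive rate = false positives / all true negatives
    fprs[group] = f * (1 - ppv) / (1 - base_rate)
    print(f"group {group}: flag rate {f:.0%}, false positive rate {fprs[group]:.0%}")
```

Group A ends up with a 40% false positive rate and group B with 10%, even though a "flagged" score means exactly the same thing in both groups; forcing the false positive rates to match would instead break calibration.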
Each scenario contains a real-world AI bias pattern. Identify where the bias enters and what type it is. Tap to match each bias type to the correct scenario.
The most dangerous type of AI bias isn't a one-time error. It's a cycle that gets worse over time.
The AI doesn't predict where crime happens. It predicts where police will find crime — because they're already looking there. The model is accurate in a narrow sense (officers DO make more arrests where they're sent) while being fundamentally misleading about actual crime distribution.
YouTube's own research found that 70% of watch time is driven by recommendation algorithms. The system doesn't radicalize people intentionally — it optimizes for engagement, and outrage is engaging. The feedback loop does the rest.
Feedback loops can only be broken by external intervention — human oversight, mandatory auditing, or regulatory limits on self-reinforcing systems. The system itself has no mechanism to recognize or correct the cycle because, from its perspective, it's performing exactly as optimized.
If AI bias is invisible to the people running the system, how do you find it? The answer is algorithmic auditing — systematically testing AI systems for disparate impact across demographic groups.
The company tests its own system. Advantage: access to full model and data. Disadvantage: conflicts of interest — finding bias means admitting your product is flawed.
Independent researchers test the system from outside. Advantage: no conflict of interest. Disadvantage: often can only test inputs/outputs, not examine the model itself.
Joy Buolamwini's Gender Shades study was an external audit. She couldn't see inside Microsoft's or IBM's models — she could only send in faces and measure what came out. That was enough to expose massive disparities.
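A black-box audit of that kind can be sketched in a few lines. `classify` below is a hypothetical stand-in for a commercial API (here, a deliberately biased toy model), since an external auditor sees only inputs and outputs.

```python
# Sketch of an external (black-box) audit: send a balanced benchmark in,
# measure error rates per group coming out. `classify` is a hypothetical
# stand-in for a vendor API; here it is a deliberately biased toy.
import random

random.seed(0)

def classify(face):
    # Toy stand-in: errs far more often on one subgroup.
    error_rate = 0.25 if face["group"] == "B" else 0.02
    correct = random.random() > error_rate
    return face["label"] if correct else "wrong"

# A benchmark balanced across groups, like the Gender Shades dataset.
benchmark = [{"group": g, "label": "x"} for g in "AB" for _ in range(500)]

errors = {"A": 0, "B": 0}
for face in benchmark:
    if classify(face) != face["label"]:
        errors[face["group"]] += 1

for g in errors:
    print(f"group {g}: error rate {errors[g] / 500:.1%}")
```

No access to the model's internals was needed: a balanced probe set and per-group error counts are enough to expose the disparity.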
One graduate student's audit of three commercial systems triggered corporate policy changes, an industry exit, and legislative action across two continents. Auditing works — not because it fixes systems automatically, but because it makes bias visible to the people who have the power to act.
You're on the ethics committee at a large hospital system. A vendor pitches an AI tool that predicts which patients are most likely to develop serious health complications, allowing doctors to intervene early. The vendor shows impressive accuracy numbers: 92% overall accuracy in clinical trials.
What's your first question?
Cost matters, but it's not the first question. After this module, your first instinct should be to ask about group-level performance. Remember Gender Shades: 92% overall can mask 65% for specific populations.
This is the Gender Shades lesson in action. Overall accuracy tells you nothing about whether the system works equally well for Black patients, elderly patients, women, or low-income communities. A 2019 study found that a widely-used healthcare algorithm systematically underestimated how sick Black patients were — affecting an estimated 70 million patients.
Adoption rate doesn't tell you about fairness. Many widely-used systems have been found to be biased only after independent auditing. The right question is about performance across demographic groups.
Ask questions about AI bias, fairness definitions, algorithmic auditing, or anything from this module.
Seven concepts you need to carry forward.
AI bias isn't individual prejudice — it's historical inequity encoded in data, amplified by algorithms, and deployed at scale.
Bias enters at data collection, labeling, feature selection, training, evaluation, and deployment. Every step is an opportunity for bias and an opportunity for intervention.
0.8% error for light-skinned men vs. 34.7% for dark-skinned women. Overall accuracy masks group-level disparities. Always test at the intersection.
You can't satisfy all fairness definitions simultaneously. Choosing which definition matters most is a human values decision, not a technical one.
Biased outputs become biased training data. Predictive policing doesn't predict crime — it predicts where police will find crime because they're already looking there.
Removing protected attributes doesn't remove bias. Zip codes proxy for race. Employment gaps proxy for gender. The bias hides in correlated features.
Systematic testing for disparate impact. One MIT study triggered corporate changes, an industry exit, and legislation on two continents. Measurement drives accountability.
5 questions drawn from the module. You need 80% to pass.