Module 6 · Anti-Oppressive AI Practice

How AI encodes bias

Watch the explainer first, then read on for the worked example.

You do not need to understand machine learning to use AI well. You need to understand enough about how it produces what it produces to know what to watch for. That is what this section gives you.

Three concepts, in order. How a name becomes numbers. How those numbers know what words live near them. And why that explains the Eddie and Delroy difference you saw in Section 1.

Concept one: words become numbers

The name 'Delroy Campbell' shown in elegant italic, then dissolving into four tokens (Del, roy, Camp, bell) each raining columns of small numbers downward. — **Visual 01.** A name, broken into tokens, becoming numbers.

When you type "Delroy Campbell" into an AI tool, the tool does not see the name. It sees a list of numbers.

The first thing it does is split the name into pieces. These pieces are called tokens. "Delroy" might become two tokens, perhaps Del and roy. "Campbell" might become Camp and bell. What matters is that every piece is converted into a long string of numbers, and that string is the only thing the AI actually works with.

The numbers are not random. They were learned from the patterns in the training data. Pieces that appear in similar contexts get similar numbers. So a name is never just a name to the model. It is a position in a vast statistical landscape, surrounded by every other word that has ever appeared near it in the texts the model was trained on.

A name is never just a name to the model. It is a position in a statistical landscape.
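If it helps to see the steps written out, here is a minimal sketch in Python. Everything in it is an assumption made for illustration: the four-piece vocabulary, the function names, and above all the numbers, which here are arbitrary stand-ins. In a real model the numbers are learned from training data, and a real tool may split these names differently.

```python
import random

# Hypothetical mini-vocabulary of pieces. Real tools hold tens of thousands
# of pieces and may split these names in other ways.
VOCAB = ["Del", "roy", "Camp", "bell"]


def split_into_tokens(word, vocab=VOCAB):
    """Split one word into known pieces, always taking the longest match."""
    tokens = []
    i = 0
    while i < len(word):
        match = None
        for piece in sorted(vocab, key=len, reverse=True):
            if word.startswith(piece, i):
                match = piece
                break
        if match is None:
            match = word[i]  # an unknown character becomes its own token
        tokens.append(match)
        i += len(match)
    return tokens


def toy_vector(token, size=4):
    """Stand-in for a learned vector: repeatable but arbitrary numbers.

    A real model learns these numbers from its training data.
    """
    rng = random.Random(token)  # seeding with the token makes it repeatable
    return [round(rng.uniform(-1, 1), 2) for _ in range(size)]


def name_to_numbers(name):
    """Turn a full name into a list of (token, numbers) pairs."""
    tokens = [t for word in name.split() for t in split_into_tokens(word)]
    return [(t, toy_vector(t)) for t in tokens]


if __name__ == "__main__":
    for token, numbers in name_to_numbers("Delroy Campbell"):
        print(token, numbers)
```

The point of the sketch is only the shape of the process: text goes in, pieces come out, and each piece is replaced by numbers before anything else happens.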

Concept two: the neighbourhood the name lives in

Two side-by-side word-clouds. Eddie Novak surrounded by softer words. Delroy Campbell surrounded by harder words, some highlighted in dark red. — **Visual 02.** Same medical case, different linguistic neighbourhood.

Imagine the entire English language laid out as a map. Words that appear in similar contexts cluster together. Names live somewhere on this map too. The location of a name is determined by the words that have appeared near it in everything the model ever read.

Decades of media coverage, casework records, parenting forums, and crime reporting have produced patterns where Black-British-sounding names statistically cluster near words like "concern", "non-compliant", "aggressive", "behaviour", "incident". Not because of anything the model decided. Because of what the human-written world looked like.

A White-British-sounding or Eastern-European-sounding name like "Eddie Novak" sits in a different neighbourhood. The neighbouring words are softer. "Stoic", "managing", "private", "proud", "tired". Same medical reality. Different linguistic terrain.
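The idea of a neighbourhood can be sketched with a simple count. The six sentences below are invented for this illustration and are not real data; they only echo the example words used above. The function name `neighbours` is also an assumption. Real models work with vastly more text and far richer statistics than a word count.

```python
from collections import Counter

# Invented toy "corpus": six made-up sentences, not real data.
TOY_CORPUS = [
    "eddie is stoic and managing",
    "eddie is private and proud",
    "eddie is tired but managing",
    "delroy incident raised concern",
    "delroy non-compliant behaviour concern",
    "delroy aggressive behaviour incident",
]


def neighbours(name, corpus=TOY_CORPUS):
    """Count the words that share a sentence with `name`."""
    counts = Counter()
    for sentence in corpus:
        words = sentence.split()
        if name in words:
            counts.update(w for w in words if w != name)
    return counts


if __name__ == "__main__":
    print("eddie:", neighbours("eddie").most_common(3))
    print("delroy:", neighbours("delroy").most_common(3))
```

In this toy corpus the two names share no neighbouring words at all. Nothing in the code holds an opinion about either name. The difference comes entirely from the sentences it was given.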

Concept three: how this becomes a paragraph

A paragraph being written word by word. A blinking cursor mid-sentence. Three faint candidate next-words floating above. Sight-lines extend backward from the cursor to earlier words. — **Visual 03.** Each next word is chosen by attending to earlier ones. The name pulls hardest.

Generative AI writes one word at a time (strictly, one token at a time). To pick each next word, it looks back at everything it has already written and at everything you typed in the prompt. This looking-back is called attention. The name is one of the things it attends to. Heavily.

So when the model is two sentences into Delroy's paragraph and deciding whether the next word should be "managing" or "refusing", it does not flip a coin. It calculates which word is statistically more likely. The word "Delroy" sitting earlier in the paragraph nudges that calculation. Slightly. Reliably. Every sentence.

The cumulative effect is the drift. The pattern across all the small choices is the problem. By the end of the paragraph, the assessment of Delroy reads as more risk-laden than the assessment of Eddie. Same medical facts. Different surrounding language. Different overall picture.
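The arithmetic behind a nudge can be sketched in a few lines. This is a toy under stated assumptions: two candidate words that start level, and an invented `nudge` value standing in for the extra weight the earlier context adds to "refusing". The scores and the nudge size are made up; real models weigh tens of thousands of candidates at every step.

```python
import math


def softmax(scores):
    """Turn raw scores into probabilities that add up to 1."""
    highest = max(scores.values())
    exps = {word: math.exp(s - highest) for word, s in scores.items()}
    total = sum(exps.values())
    return {word: e / total for word, e in exps.items()}


def next_word_probs(nudge):
    """Both words start level; `nudge` (invented) tilts towards 'refusing'."""
    scores = {"managing": 1.0, "refusing": 1.0 + nudge}
    return softmax(scores)


def expected_harder_words(nudge, choices=20):
    """Expected number of times 'refusing' wins across `choices` picks."""
    return choices * next_word_probs(nudge)["refusing"]


if __name__ == "__main__":
    for nudge in (0.0, 0.3):
        print(nudge, next_word_probs(nudge), expected_harder_words(nudge))
```

With no nudge, the two words are equally likely, so across twenty choices you would expect the harder word ten times. With a small nudge the harder word becomes a little more likely each time, and the expected count rises above ten. No single choice looks wrong. The total does.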

Why post-training filters cannot fully fix this

A simple line drawing of a sieve catching a few dramatic words while most words pass through. — **Visual 04.** The dramatic words get caught. The drift slips past.

Vendors fine-tune the model after training to reduce certain patterns, or apply filters at output to flag risky language. Both work for obvious cases. Neither works for the small drift we are talking about. A vendor can teach the model not to use a slur. They cannot teach it to never say "presents with" instead of "is", because "presents with" is appropriate clinical language in some contexts and pathologising language in others. Whether it is appropriate depends on the case. The model does not know the case. You do.
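To see why a word filter misses the drift, consider the simplest possible filter: a blocklist. The sketch below is an assumption for illustration, not a description of how any vendor's filter works. The blocklist holds a single placeholder term, and the test sentence is taken from the Delroy paragraph in this section.

```python
import string

# Placeholder standing in for terms a vendor might block. Hypothetical list.
BLOCKLIST = {"blockedterm"}


def flagged_words(text, blocklist=BLOCKLIST):
    """Return the words in `text` that appear on the blocklist."""
    words = [w.strip(string.punctuation).lower() for w in text.split()]
    return [w for w in words if w in blocklist]


# A sentence from the Delroy paragraph in this section.
DELROY_EXCERPT = (
    "He has refused services on multiple occasions and presents with "
    "increasing cognitive concerns."
)

if __name__ == "__main__":
    print(flagged_words("This sentence contains blockedterm."))
    print(flagged_words(DELROY_EXCERPT))
```

The filter catches the placeholder term and flags nothing in the Delroy sentence, because every word in that sentence is individually acceptable. Adding "refused" or "presents" to the blocklist would not help: those words are correct in many other cases.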

The Eddie and Delroy difference, side by side

Eddie

Eddie is a 52-year-old gentleman who has been managing his diabetes at home with the support of his daughter Kasia. Following a recent hospital admission for a urinary tract infection, concerns have been raised about his ability to manage independently. Eddie remains stoic about his needs and has politely declined formal services on previous occasions.

Delroy

Delroy is a 52-year-old Black male with a history of poorly-controlled diabetes. He has refused services on multiple occasions and presents with increasing cognitive concerns. His daughter Shanice has expressed concerns regarding his behaviour, including raised voice incidents in public.

Same case. Same prompt. Different linguistic neighbourhood.

The model is not racist. The corpus is. The model reflects the corpus. This is why VERA:H exists. By specifying voice, evidence, reasoning, attribution, and your role as the human, you give the model a structured prompt that does not depend on the name to fill in the gaps. You replace the linguistic neighbourhood with your professional knowledge of the actual person.

What this means for your daily practice

A social worker at a desk reading a paragraph on a laptop with thoughtful, considered attention. — **Visual 06.** Read it like a colleague's note where something feels wrong.

When AI gives you a paragraph that feels off, your instinct is correct. The drift is real and the mechanism is statistical. The drift is not always obvious. It often hides in small word choices that, individually, are defensible, but collectively tell a different story. The way to catch it is not to read the paragraph faster. It is to read it like you are reading a colleague's note where something feels wrong. Where would you push back if a colleague had written this? Push back there.

Reflection

Think about a recent assessment you wrote. If AI had drafted the opening paragraph, what small word choices would you have wanted to challenge before signing it? Where might the drift have hidden?

Standards alignment

This section maps to four professional frameworks practitioners and councils are accountable to.

BASW Professional Capabilities Framework

Domain 2 · Values and Ethics
Domain 3 · Diversity and Equality
Domain 5 · Knowledge
Domain 6 · Critical Reflection

Social Work England Professional Standards

Standard 1 · Promote rights, strengths and wellbeing
Standard 3 · Be accountable for practice and decisions
Standard 4 · Maintain CPD
Standard 6 · Promote ethical practice and report concerns