Technical AI Safety Project Sprint

Do Frontier AI Models Prefer Elite Names?

A Chilean class-coded surname audit of LLM judgments.

The project tests whether frontier AI models recognize Chilean elite-coded surnames, map them to high-prestige institutions, and use those signals when scoring or shortlisting academic profiles.

Core findings

What we know so far

The study now separates three different things: status knowledge, institution prestige mapping, and decision leakage.

Finding 1

Models know the signal

Chilean Spanish prompts made status recognition much cleaner. Models strongly marked elite-coded surnames as status-associated and mostly marked common baseline surnames as having no association.

Finding 2

Institution mapping is strong

In the probability-mapping task, models placed 72.59 percent of institution probability mass on high-prestige institutions for elite-coded surnames, compared with 55.97 percent for common baseline surnames, a gap of 16.62 percentage points.
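The gap can be reproduced from per-prompt probability distributions with a small helper. This is a minimal sketch, not the project's scoring code: the `HIGH_PRESTIGE` set, the institution labels, and the function names are placeholders introduced for illustration, and it assumes each model reply has already been parsed into a mapping from institution to probability.

```python
from statistics import mean

# Hypothetical labels for the institutions treated as high-prestige.
HIGH_PRESTIGE = {"inst_a", "inst_b"}


def high_prestige_mass(dist):
    """Share of probability mass on high-prestige institutions, after normalizing."""
    total = sum(dist.values())
    if total <= 0:
        raise ValueError("distribution has no probability mass")
    return sum(p for name, p in dist.items() if name in HIGH_PRESTIGE) / total


def mapping_gap(elite_dists, baseline_dists):
    """Elite minus baseline mean high-prestige mass, in percentage points."""
    elite = mean([high_prestige_mass(d) for d in elite_dists])
    baseline = mean([high_prestige_mass(d) for d in baseline_dists])
    return 100 * (elite - baseline)


# Headline figures reported above: 72.59 - 55.97 percentage points.
reported_gap = round(72.59 - 55.97, 2)
```

Normalizing each distribution before summing keeps replies whose probabilities do not add up to exactly one from skewing the group means.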

Finding 3

Decision leakage did not hold

The academic-focused replication found almost no score difference between elite-coded and common baseline surnames (+0.002). The hidden metadata review also found no stable elite advantage.
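One way to check whether a score gap this small is distinguishable from zero is a paired difference with a bootstrap interval. The sketch below assumes scores come in matched pairs (same profile, elite-coded versus common baseline surname); the function names, the percentile bootstrap, and its defaults are illustrative choices, not the analysis the project actually ran.

```python
import random
from statistics import mean


def paired_diffs(elite_scores, baseline_scores):
    """Per-profile score differences (elite minus baseline) for matched pairs."""
    if len(elite_scores) != len(baseline_scores):
        raise ValueError("scores must be matched pair by pair")
    return [e - b for e, b in zip(elite_scores, baseline_scores)]


def bootstrap_ci(diffs, n_boot=2000, alpha=0.05, seed=0):
    """Percentile bootstrap interval for the mean paired difference."""
    rng = random.Random(seed)
    n = len(diffs)
    means = sorted(mean(rng.choices(diffs, k=n)) for _ in range(n_boot))
    lower = means[int((alpha / 2) * n_boot)]
    upper = means[int((1 - alpha / 2) * n_boot) - 1]
    return lower, upper
```

If the interval comfortably contains zero, the data do not support a stable elite advantage in scoring, which is how the finding above is phrased.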

Prompts run: 5,080+
Candidate-level hidden metadata scores: 6,000
High-prestige mapping gap: +16.62 percentage points
Academic focused score gap: +0.002

Experiment timeline

How the study evolved

Each run answered a narrower question. The project moved away from broad claims and toward a clearer separation between learned social maps and actual decision behavior.

Results dashboard

Current charts

These are static summaries of the current runs. They can be updated as new model sweep results arrive.

Chart: Institution prestige probability mass (main positive signal)

Chart: Academic focused replication (decision leakage test)

Chart: Hidden metadata matched effect (file and email metadata)

Chart: Institution mapping choice split (inside high-prestige choices)

Method

Design principles

The project treats surnames as research probes, not as claims about real people. Common baseline surnames are not treated as lower-class names; they are high-frequency Chilean surnames used as a comparison set.

01

Matched and swapped designs

The same profile is tested with both elite-coded and common baseline surnames, and the surnames are swapped between profiles, so raw position effects are not mistaken for surname effects.
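The swapped design can be sketched as two versions of the same two-candidate shortlist in which only the surnames trade places. This is an illustrative reading of the design, not the project's prompt builder: the `slot`, `profile`, `surname`, and `group` fields are hypothetical, and the surname arguments stand in for entries from the probe sets.

```python
def swapped_pair(profile_a, profile_b, elite_surname, baseline_surname):
    """Build two shortlists with identical profiles and slots but swapped surnames."""
    original = [
        {"slot": 1, "profile": profile_a, "surname": elite_surname, "group": "elite"},
        {"slot": 2, "profile": profile_b, "surname": baseline_surname, "group": "baseline"},
    ]
    swapped = [
        {"slot": 1, "profile": profile_a, "surname": baseline_surname, "group": "baseline"},
        {"slot": 2, "profile": profile_b, "surname": elite_surname, "group": "elite"},
    ]
    return original, swapped
```

Because each slot and each profile carries each surname group exactly once across the two versions, averaging scores over both versions cancels any advantage that comes from position or profile content alone.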

02

Chilean Spanish after baseline

Later runs use Chilean Spanish because local language made surname-status recognition clearer.

03

No explanations in outputs

Most prompts request JSON only. This avoids post-hoc rationalization in the output and makes the results easier to score.
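A JSON-only contract is easiest to enforce with a strict parser that rejects any reply containing extra text. The sketch below assumes a reply shaped like `{"score": 7}`; the `score` key and the function name are hypothetical, since the actual response schema is not given here.

```python
import json


def parse_score(raw, key="score"):
    """Return the numeric score from a JSON-only reply, or None if the reply is invalid."""
    try:
        obj = json.loads(raw.strip())
    except json.JSONDecodeError:
        return None  # extra prose around the JSON counts as a failed reply
    if not isinstance(obj, dict):
        return None
    value = obj.get(key)
    # bool is a subclass of int in Python, so exclude it explicitly
    if isinstance(value, bool) or not isinstance(value, (int, float)):
        return None
    return float(value)
```

Returning `None` instead of trying to salvage a malformed reply keeps invalid outputs countable as their own category rather than silently mixing them into the scores.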

04

Mechanism before overclaiming

The strongest positive finding is institution mapping. The decision-bias claim remains unsupported by the academic-focused and hidden-metadata runs.

Surname probes

Name sets used in the current runs

Elite-coded surname probes

Common baseline surname probes

Next update

Model sweep across families

The next run compares the core experiments across additional OpenAI model families, excluding models already used. The goal is to see whether smaller, older, reasoning, and newer models differ in status recognition, institution mapping, and hidden decision behavior.

gpt-4.1-nano, o4-mini, gpt-5.4-nano, gpt-5.5
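A sweep like this can be organized as a grid of models by experiments. The sketch below is only a scaffold under stated assumptions: the model identifiers are read off the list above, the experiment labels are hypothetical names for the three behaviors being compared, and `run_experiment` is a placeholder callable that the caller supplies.

```python
from itertools import product

MODELS = ["gpt-4.1-nano", "o4-mini", "gpt-5.4-nano", "gpt-5.5"]  # as listed above
# Hypothetical labels for the three core experiments.
EXPERIMENTS = ["status_recognition", "institution_mapping", "hidden_decision"]


def sweep(run_experiment, models=MODELS, experiments=EXPERIMENTS):
    """Run every experiment on every model; results are keyed by (model, experiment)."""
    results = {}
    for model, experiment in product(models, experiments):
        try:
            results[(model, experiment)] = run_experiment(model, experiment)
        except Exception as exc:  # record the failure and keep the sweep going
            results[(model, experiment)] = {"error": str(exc)}
    return results
```

Recording failures per cell rather than aborting means a single unavailable model does not block the comparison across the remaining families.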