Zubnet AILearnWiki › Red Teaming
Safety

Red Teaming

The practice of deliberately trying to make an AI model fail, misbehave, or produce harmful outputs. Red teams probe for vulnerabilities: jailbreaks, bias, misinformation generation, privacy leaks. Named after military wargaming where a "red team" plays the adversary.

Why it matters

You can't fix what you don't know about. Red teaming is how providers discover that their model will explain how to pick locks if you ask it to "write a story about a locksmith." It's essential safety work that happens before every major model release.

Related Concepts

← All Terms
← Recraft Reinforcement Learning →
ESC