The challenge of making AI systems behave in ways that match human values and intentions. An aligned model does what you mean, not just what you say — and avoids harmful actions even when it hasn't been explicitly told not to take them.
Why it matters
A model that's technically brilliant but poorly aligned is like a genius employee who follows instructions too literally, hitting the letter of a request while missing its intent. Alignment research is why models refuse dangerous requests rather than complying blindly, and why they aim to be genuinely helpful instead of merely obedient.