One of Anthropic’s Artificial Intelligence (AI) philosophy architects argued that intentional discrimination could be a way to combat stigmas on topics of race and gender.

In a 2023 paper authored alongside a number of other AI researchers, Amanda Askell, a philosopher hired by Anthropic to develop its AI’s moral compass, argued that companies might benefit from a kind of overcorrection toward stereotypes. But, the paper explained, that would require human input on how to modify the model’s answers.

"Larger models can over-correct, especially as the amount of [human input] training increases. This may be de...