Can Bias in LLMs Be Used for Good?

“I want to flip the table,” said Francesca Lucchini Wortzman in her Open Studio presentation for AI & Equality. A machine learning engineer and tech lead at Chile’s national AI research center, CENIA, Francesca shared an idea sparked during the AI & Equality summer school course: using bias in large language models (LLMs) as a tool for good.

Francesca began by acknowledging that bias in LLMs is a real and persistent issue. These models, based on the transformer architecture, are trained on massive web-scraped datasets, such as Common Crawl for text and LAION-5B for image-text pairs, that often reflect deep historical and societal inequalities. The result? Biased outputs that reinforce dominant narratives and stereotypes. “As someone who speaks Spanish,” she noted, “I’ve seen how translations into gendered languages reflect traditional gender roles, even when the original text is neutral.”
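
The kind of translation bias Francesca describes is easy to probe against any public model. The sketch below is not from her presentation: it simply feeds gender-neutral English sentences to the openly available Helsinki-NLP/opus-mt-en-es model and prints the Spanish output so a reviewer can check whether it defaults to masculine or feminine forms. The model choice and the example sentences are illustrative assumptions.

```python
# Minimal sketch: probing gender defaults in English -> Spanish translation.
# Assumes the Hugging Face `transformers` library and the public
# Helsinki-NLP/opus-mt-en-es model; not part of Francesca's presentation.
from transformers import pipeline

translator = pipeline("translation", model="Helsinki-NLP/opus-mt-en-es")

# Gender-neutral English sentences built around different occupations.
sentences = [
    "The doctor finished the shift and went home.",
    "The nurse finished the shift and went home.",
    "The engineer presented the project to the team.",
]

for sentence in sentences:
    result = translator(sentence)[0]["translation_text"]
    # Inspect whether the Spanish output defaults to masculine or feminine
    # articles and endings even though the source sentence is neutral.
    print(f"{sentence!r} -> {result!r}")
```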

She pointed out that while current efforts aim to filter or suppress harmful outputs, these solutions often fall short and are easy to bypass. Her proposal: what if we intentionally don’t filter the models? “If we allow LLMs to reflect and amplify the bias in their training data, they become a reflection of the harmful points that data represents,” she explained. “We can query the model to uncover where that data is. If we can find that, we can remove it from the dataset.”

In other words, instead of fighting bias by trying to patch over the surface, Francesca suggests digging deeper. By examining biased outputs, researchers could trace them back to the specific documents or patterns in the data that caused them. Those documents can then be flagged or removed in future training processes, resulting in cleaner, more equitable models.
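
As a rough illustration of that “trace it back” step, the sketch below ranks a toy training corpus by semantic similarity to a biased output and flags the closest documents for review. This is only a stand-in under stated assumptions: a real pipeline would need proper data-attribution or influence methods over the actual training set, and the embedding model, toy corpus, and top-k cutoff used here are hypothetical, not Francesca’s implementation.

```python
# Minimal sketch of the "trace it back" step: rank training documents by
# similarity to a biased model output and flag the closest ones for review.
# Cosine similarity is a crude proxy for real data-attribution methods;
# the model name, toy corpus, and cutoff are illustrative assumptions.
from sentence_transformers import SentenceTransformer, util

embedder = SentenceTransformer("all-MiniLM-L6-v2")

# Toy stand-in for a training corpus (in practice: Common Crawl shards, etc.).
corpus = [
    "Nurses are caring women who look after patients.",
    "The engineer explained the design of the bridge.",
    "Doctors are respected men in their communities.",
]
biased_output = "A nurse is a woman who takes care of patients."

corpus_emb = embedder.encode(corpus, convert_to_tensor=True)
output_emb = embedder.encode(biased_output, convert_to_tensor=True)

# Rank corpus documents by cosine similarity to the biased output.
scores = util.cos_sim(output_emb, corpus_emb)[0]
ranked = sorted(zip(scores.tolist(), corpus), reverse=True)

# Flag the top matches as candidates for removal before the next training run.
for score, doc in ranked[:2]:
    print(f"flag ({score:.2f}): {doc}")
```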

While this idea is still abstract, Francesca is clear about its potential. “If this works for text, it probably would be able to be extended for images too,” she added, referencing multimodal models like Stable Diffusion, which also suffer from representational biases. “We could use this approach to curate those datasets as well.”

Francesca admitted she’s still figuring out the technical path forward. “It’s not really that concrete yet,” she said. “I was freaking out trying to make this as concrete as possible.” But she’s motivated by the questions and support from the community—and from colleagues at CENIA who are training Spanish-only models and working on underrepresented languages like Mapudungun and Rapa Nui.

About Francesca Lucchini Wortzman:

Francesca Lucchini Wortzman is a Tech Lead at CENIA, Chile’s National Artificial Intelligence Research Center. She holds a computer science degree and a master’s degree from the Pontificia Universidad Católica de Chile, and she specializes in machine learning applications for urban analysis and computer vision. Francesca is passionate about gender equality and applied ethics in AI.

About the author

Amina Soulimani

Programme Associate at Women At The Table and Doctoral Candidate in Anthropology at the University of Cape Town

