BEGIN:VCALENDAR
VERSION:2.0
PRODID:-//AI and Equality - ECPv6.13.2.1//NONSGML v1.0//EN
CALSCALE:GREGORIAN
METHOD:PUBLISH
X-ORIGINAL-URL:https://aiequalitytoolbox.com
X-WR-CALDESC:Events for AI and Equality
REFRESH-INTERVAL;VALUE=DURATION:PT1H
X-Robots-Tag:noindex
X-PUBLISHED-TTL:PT1H
BEGIN:VTIMEZONE
TZID:Europe/London
BEGIN:DAYLIGHT
TZOFFSETFROM:+0000
TZOFFSETTO:+0100
TZNAME:BST
DTSTART:20260329T010000
END:DAYLIGHT
BEGIN:STANDARD
TZOFFSETFROM:+0100
TZOFFSETTO:+0000
TZNAME:GMT
DTSTART:20261025T020000
END:STANDARD
END:VTIMEZONE
BEGIN:VEVENT
DTSTART;TZID=Europe/London:20260625T150000
DTEND;TZID=Europe/London:20260625T160000
DTSTAMP:20260420T192053Z
CREATED:20260218T104436Z
LAST-MODIFIED:20260218T104436Z
UID:10000031-1782399600-1782403200@aiequalitytoolbox.com
SUMMARY:Testing AI Safety: Why Current Guardrails Fail to Stop Social Bias with Anna-Maria Gueorguieva | AI & Equality Pub-Talk
DESCRIPTION:Access paper: https://arxiv.org/abs/2512.19238\nHow do large language models understand the lived experiences of stigmatized groups\, and when does this understanding differ from the human perspective? Can this lead to bias\, and if so\, do our existing safety tools help mitigate such bias? This work investigated open-source language models for bias against 93 stigmatized groups\, finding that certain stigmatized identities (especially those humans deem ‘threatening’\, such as having HIV or a criminal record) are subject to significantly more bias than others. To attempt to remedy this\, we test guardrail models: models from leading technology companies that are meant to identify discriminatory or bias-eliciting inputs and mitigate harmful outputs. This talk will report on our findings\, identifying where existing guardrail models fail and discussing technical and legal solutions.\n\nAbout the speaker:\nAnna-Maria Gueorguieva is a PhD student at the University of Washington Information School and holds a B.A. in Data Science and Legal Studies from UC Berkeley. Her research focuses on AI evaluations for social and political impacts and on AI regulation. Her work lies at the intersection of empirical methods for investigating AI usage and behavior and the AI regulations needed to limit and remedy harm.\n\nRegister here via our community on Circle
URL:https://aiequalitytoolbox.com/event/testing-ai-safety-why-current-guardrails-fail-to-stop-social-bias-with-anna-maria-gueorgiueva-ai-equality-pub-talk/
ATTACH;FMTTYPE=image/png:https://aiequalitytoolbox.com/wp-content/uploads/2026/02/6.png
END:VEVENT
END:VCALENDAR