Agent for Anti-Discriminatory Language (AAL) A Human-Centered AI Approach to Discriminatory Speech in Italian / M.A. Tamborini, F. Mohammadi, P. Ceravolo, S. Maghool, M.E. D'Amico, C. Siccardi. In: IEEE-CH International Conference on Cyber Humanities. [s.l.]: Institute of Electrical and Electronics Engineers (IEEE), 2026 Jan 20. ISBN 979-8-3315-1435-8. pp. 1-6. Conference: International Conference on Cyber Humanities (IEEE-CH), Florence, 2025. DOI: 10.1109/ieee-ch65308.2025.11279699.
Agent for Anti-Discriminatory Language (AAL) A Human-Centered AI Approach to Discriminatory Speech in Italian
M.A. Tamborini (first author); F. Mohammadi (second author); P. Ceravolo; S. Maghool; M.E. D'Amico (penultimate author); C. Siccardi (last author)
2026
Abstract
This paper presents the Agent for Anti-Discriminatory Language (AAL), a legally informed large language model designed to detect discriminatory, stereotypical, and intersectional language in Italian social media. By embedding anti-discrimination legal principles from European and Italian law into the model's behavior through instruction tuning and expert-guided active learning, we investigate the feasibility of normatively aligned classification in high-stakes digital discourse. Using an interdisciplinary approach, we integrate structured legal knowledge, expert-annotated examples, and staged model feedback. Evaluation results demonstrate high precision in identifying overt hate speech and stereotypes, as well as the ability to generate legal justifications. However, challenges remain in identifying implicit and intersectional bias, particularly when lexical cues are weak or the social context is complex. We discuss the implications for trustworthy AI, calibrating model confidence, and integrating legal reasoning into multilingual generative systems.

| File | Access | Type | License | Size | Format |
|---|---|---|---|---|---|
| Agent_for_Anti-Discriminatory_Language_AAL_A_Human-Centered_AI_Approach_to_Discriminatory_Speech_in_Italian.pdf | Restricted access | Publisher's version/PDF | No license | 1.52 MB | Adobe PDF |




