Library

Synthetic data: Potential and Challenges for AI & Equality

About the article

This piece examines the growing role of synthetic data in AI that promotes equality. Kypraiou highlights advantages such as privacy and customization, and also underscores challenges such as bias and the difficulty of creating reliable data. The article stresses the importance of a human rights-based approach to ensuring fairness, and concludes by calling for ethical practices and transparency in the use of synthetic data for a more equitable future in data science.

About the author

Sofia Kypraiou, a data scientist with a specialization in ethics and human rights, earned her MSc in Data Science from the École Polytechnique Fédérale de Lausanne (EPFL). She holds a BSc in Computer Science from the National and Kapodistrian University of Athens.

She developed the technical components of the workshop “<AI & Equality>: A Human Rights Toolbox” as part of her MSc thesis at EPFL, and she works with Women At The Table, in collaboration with the Office of the United Nations High Commissioner for Human Rights (OHCHR). This workshop merges the domains of data science and human rights through a critical analysis methodology.

She has delivered the “<AI & Equality>: A Human Rights Toolbox” workshop at various universities and art festivals, contributing to the ongoing dialogue at the intersection of these domains.

Recommended resources

→ ‘Ethical Principles’. UK Statistics Authority, https://uksa.statisticsauthority.gov.uk/the-authority-board/committees/national-statisticians-advisory-committees-and-panels/national-statisticians-data-ethics-advisory-committee/ethical-principles/. Accessed 9 Oct. 2023.

→ Shumailov, Ilia, et al. The Curse of Recursion: Training on Generated Data Makes Models Forget. arXiv, 31 May 2023. arXiv.org, https://doi.org/10.48550/arXiv.2305.17493.

→ Talby, David. ‘Council Post: The Dangers Of Using Synthetic Patient Data To Build Healthcare AI Models’. Forbes, https://www.forbes.com/sites/forbestechcouncil/2023/05/26/the-dangers-of-using-synthetic-patient-data-to-build-healthcare-ai-models/. Accessed 11 Oct. 2023.

→ Lorenz, P., K. Perset and J. Berryhill (2023), “Initial policy considerations for generative artificial intelligence”, OECD Artificial Intelligence Papers, No. 1, OECD Publishing, Paris, https://doi.org/10.1787/fae2d1e6-en.

→ Belgodere, Brian, et al. Auditing and Generating Synthetic Data with Controllable Trust Trade-Offs. arXiv, 2 May 2023. arXiv.org, https://doi.org/10.48550/arXiv.2304.10819.

→ ‘PAI’s Responsible Practices for Synthetic Media’. Partnership on AI – Synthetic Media, https://syntheticmedia.partnershiponai.org/. Accessed 11 Oct. 2023.

→ Government of Canada, Statistics Canada. Unlocking the Power of Data Synthesis with the Starter Guide on Synthetic Data for Official Statistics. 1 Mar. 2023, https://www.statcan.gc.ca/en/data-science/network/synthetic-data.

Tools:

→ The Synthetic Data Vault. Put Synthetic Data to Work! https://sdv.dev/. Accessed 11 Oct. 2023.

Join our community

We are committed to advancing human rights-based approaches in AI, and we encourage anyone interested in learning more from a global perspective to explore and contribute to our community.