Ethical impact

Exploitative labor in dataset training

Behind the polished outputs of AI lies a troubling reality: racialized and exploitative labor. Many AI datasets are refined by workers in the Global South who are paid low wages to sift through disturbing and traumatic content—flagging racist, sexist, and offensive material to make AI “safe” for users. This invisible labor is essential, yet undervalued and often harmful to those performing it.

Moreover, AI training processes can reinforce gendered biases. As revealed in the Excavating AI project by Kate Crawford and Trevor Paglen, many datasets operate on the assumption that only binary gender identities exist, erasing the lived experiences of non-binary and gender-diverse individuals.

Extractive use of user labor and intellectual property

AI systems like ChatGPT rely heavily on user interactions to improve performance. Features like thumbs-up/down feedback are a form of unpaid crowdsourced labor, adding commercial value without compensation. This extractive model benefits corporations, while users unknowingly contribute to product development.

There are also growing concerns around intellectual property and copyright. AI-generated content is often built from existing works—text, images, and ideas—without proper attribution or consent. This raises ethical questions about ownership, originality, and the rights of creators.

Privacy at risk

AI models can inadvertently expose private or sensitive information. Even when not intentionally collected, personal data can enter training sets through data leaks or public posts. As DeepMind researchers note, this can lead to privacy violations with real-world consequences.

Risks include:

Data breaches from vulnerable hosting platforms
Unintended disclosures through misinterpreted inputs
Third-party access via integrated services

Designed for the privileged

AI systems are often built to serve those with the most power and privilege. From the devices they run on to the languages they support, LLMs cater to affluent, tech-savvy users. Meanwhile, the communities most affected by AI’s environmental and social harms are rarely consulted or considered in its design.

Access discrimination: Who gets left behind?

Access to AI is not universal. As LLMs become increasingly privatized and commodified, access is shaped by:

Geopolitical inequalities in internet and device availability
Gender disparities, especially affecting women and girls
Barriers for disabled users, due to poor accessibility design
Censorship in authoritarian regimes
Freemium models, where quality depends on ability to pay

DeepMind, Google's AI research lab, has warned of “disparate access to benefits due to hardware, software, and skill constraints.” As AI becomes a gatekeeper to opportunity, these divides will only deepen.

Feedback loops of inequality

LLMs learn from their users—but if early users are disproportionately privileged, the models will reflect and reinforce those perspectives. This creates a feedback loop where:

Marginalized voices are excluded
Biases are amplified
Barriers to access are reproduced

Without intentional design for equity, AI will continue to reproduce ableist, gendered, genocidal, racist, and classist harms.

Western-centric data = biased knowledge

Most LLMs are trained on Western, English-language datasets, embedding dominant cultural norms into their outputs. This leads to:

Erasure of non-Western histories and knowledge systems
Reinforcement of colonial, capitalist, and patriarchal ideologies
Recolonization of digital spaces through biased knowledge reproduction

As AI becomes a tool for education, governance, and communication, these biases have real-world consequences.

Toxic norms and malicious uses

LLMs can be used to spread disinformation, facilitate fraud, and amplify harmful stereotypes. DeepMind has identified risks, including:

Exploitation of user trust
Promotion of gender and ethnic stereotypes
Support for surveillance, censorship, and cyberattacks

Even well-intentioned uses can result in harm when models are trained on toxic cultural norms.

Systemic harms across society

The ethical risks of LLMs are not abstract—they manifest in:

Public policy shaped by biased data
Education that erases marginalized histories
Employment that automates low-wage jobs
Housing and healthcare decisions mediated by flawed algorithms
Dating and social platforms that reinforce exclusion

These harms are compounding, not isolated. As AI becomes more powerful, so too does its potential to widen existing divides.

Toward ethical AI

The ethical impact of AI is a reflection of our cultural values. Will we:

Prioritize profit over people?
Use surveillance and control to manage harm?
Defer responsibility to market forces?

Or will we choose to center justice, equity, and care in how we design and deploy AI?

To build a more just AI future, we must:

Ensure equitable access across geography, gender, and ability
Diversify training data and center marginalized voices
Protect privacy and intellectual property
Hold developers accountable for harm
Resist the commodification of knowledge and labor

AI should not be a tool of oppression—it should be a tool for liberation.

Considerations of Large Language Models

“Some Harm Considerations of Large Language Models (LLMs)” by Rebecca Sweetman is licensed under CC BY-NC-SA 4.0 International.