From Vulnerable Data Subjects to Vulnerabilizing Data Practices: Navigating the Protection Paradox in AI-Based Analyses of Platformized Lives

arXiv cs.CV / 4/20/2026


Key Points

  • The paper argues that “vulnerability” should not be treated as a fixed trait of data subjects, but as something actively produced by data practices within platformized life.
  • It highlights a “protection paradox,” where attempts to protect vulnerable people using data-driven AI can unintentionally increase computational exposure, enable reductionism, and facilitate extraction.
  • Using an AI for Social Good case about using computer vision to quantify child presence in monetized YouTube family vlogs for regulatory advocacy, the authors show how ethical risks emerge from specific pipeline choices.
  • The paper proposes a reflexive ethics protocol covering four pipeline junctures—dataset design, operationalization, inference, and dissemination—and maps ethical tensions to concrete technical questions and prompts.
  • The protocol is organized around four cross-cutting “vulnerabilizing” factors: exposure, monetization, narrative fixing, and algorithmic optimization, guiding researchers toward more ethically robust decisions.

Abstract

This paper traces a conceptual shift from understanding vulnerability as a static, essentialized property of data subjects to examining how it is actively enacted through data practices. Unlike reflexive ethical frameworks focused on missing or counter-data, we address the condition of abundance inherent to platformized life: a context where a near-inexhaustible mass of data points already exists, shifting the ethical challenge to the researcher's choices in operating upon this existing mass. We argue that the ethical integrity of data science depends not just on who is studied, but on how technical pipelines transform "vulnerable" individuals into data subjects whose vulnerability can be further precarized. We develop this argument through an AI for Social Good (AI4SG) case: a journalist's request to use computer vision to quantify child presence in monetized YouTube 'family vlogs' for regulatory advocacy. This case reveals a "protection paradox": data-driven efforts to protect vulnerable subjects can inadvertently impose new forms of computational exposure, reductionism, and extraction. Using this request as a point of departure, we perform a methodological deconstruction of the AI pipeline to show how granular technical decisions are ethically constitutive. We contribute a reflexive ethics protocol that translates these insights into a roadmap for research ethics surrounding platformized data subjects. Organized around four critical junctures (dataset design, operationalization, inference, and dissemination), the protocol identifies technical questions and ethical tensions where well-intentioned work can slide into renewed extraction or exposure. For every decision point, the protocol offers specific prompts to navigate four cross-cutting vulnerabilizing factors: exposure, monetization, narrative fixing, and algorithmic optimization. Rather than uncritically...
