The Clearview AI story still feels like one of the cleanest examples of the consent gap in applied AI.
The issue is not simply that photos were public. A birthday photo, profile picture, or local event image is posted for a social context. Turning that same image into a biometric lookup system for police is a purpose transformation: different audience, different risk model, different power relationship, and usually no notice or recourse.
A few grounding points:
- The NYT reported in 2020 that Clearview's system was built on more than 3 billion images scraped from Facebook, YouTube, Venmo, and other sites: https://www.nytimes.com/2020/01/18/technology/clearview-privacy-facial-recognition.html
- The Dutch data protection authority fined Clearview €30.5 million (about $33 million) in 2024 over an "illegal database" built by automatically harvesting photos and converting them into biometric codes: https://www.forbes.com/sites/roberthart/2024/09/03/clearview-ai-controversial-facial-recognition-firm-fined-33-million-for-illegal-database/
- Reporting in 2023 put the database at 30 billion images and described law-enforcement use at large scale: https://www.businessinsider.com/clearview-scraped-30-billion-images-facebook-police-facial-recogntion-database-2023-4
The engineering question I keep coming back to: should "publicly accessible" ever be treated as blanket permission to create biometric infrastructure?
My instinct is no. At minimum, this class of system needs product and legal boundaries around:
- purpose limitation: social publication should not silently become identity search
- auditability: every search should be logged, reviewable, and tied to a lawful process
- dataset provenance: operators should be able to prove where biometric templates came from
- deletion and appeal: people need a way to challenge inclusion and misuse
- scope limits: investigative convenience is not the same as democratic authorization
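To make a few of those boundaries concrete, here is a rough Python sketch of what purpose limitation, auditability, provenance, and deletion could look like at the API layer. Everything in it is hypothetical and made up for illustration: the `Template`, `AuditLog`, and `SearchGateway` names, the `consented_to` labels, and the idea that candidate IDs arrive from some upstream face matcher. It is not how Clearview or any real system works, and a non-empty `legal_reference` string is only a stand-in for actually validating lawful process.

```python
import hashlib
import json
from dataclasses import dataclass, field
from datetime import datetime, timezone

GENESIS = "0" * 64  # placeholder "previous hash" for the first log entry


@dataclass(frozen=True)
class Template:
    template_id: str
    source_url: str    # dataset provenance: where the image was collected
    consented_to: str  # purpose the subject agreed to (hypothetical labels)


@dataclass
class AuditLog:
    """Append-only log; each entry hashes the previous one so edits are detectable."""
    entries: list = field(default_factory=list)

    @staticmethod
    def _digest(prev_hash: str, record: dict) -> str:
        body = json.dumps(record, sort_keys=True)
        return hashlib.sha256((prev_hash + body).encode()).hexdigest()

    def append(self, **record) -> None:
        record["at"] = datetime.now(timezone.utc).isoformat()
        prev_hash = self.entries[-1]["hash"] if self.entries else GENESIS
        self.entries.append({"record": record, "prev": prev_hash,
                             "hash": self._digest(prev_hash, record)})

    def verify(self) -> bool:
        # Recompute the chain from the start; any altered record breaks it.
        prev_hash = GENESIS
        for entry in self.entries:
            expected = self._digest(prev_hash, entry["record"])
            if entry["prev"] != prev_hash or entry["hash"] != expected:
                return False
            prev_hash = entry["hash"]
        return True


class SearchGateway:
    def __init__(self, templates, log: AuditLog):
        self.templates = {t.template_id: t for t in templates}
        self.log = log

    def delete(self, template_id: str, reason: str) -> None:
        # Deletion and appeal: remove the template and record why.
        self.templates.pop(template_id, None)
        self.log.append(event="deletion", template_id=template_id, reason=reason)

    def search(self, requester: str, legal_reference: str, candidate_ids: list) -> list:
        # Auditability: denied attempts are logged too, not just successful ones.
        if not legal_reference:
            self.log.append(event="search_denied", requester=requester)
            raise PermissionError("search requires a legal reference")
        # Purpose limitation: only templates collected for identity search are returned.
        hits = [self.templates[c] for c in candidate_ids
                if c in self.templates
                and self.templates[c].consented_to == "identity_search"]
        self.log.append(event="search", requester=requester,
                        legal_reference=legal_reference,
                        returned=[t.template_id for t in hits])
        return hits
```

The design choice worth arguing about is that the purpose check sits in the same code path as the search itself, so a template scraped for "social_sharing" simply never comes back, and every call, including refused ones, lands in a log that `verify()` can check for tampering. Scope limits and real legal validation are policy questions that a sketch like this cannot settle.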
Curious where people draw the line. Is the right boundary at scraping, biometric conversion, commercial sale, law-enforcement access, or some combination of the four?