Cloud: shared image hash DB — requires explicit opt-in consent design #7

Open
opened 2026-03-26 22:29:46 -07:00 by pyr0ball · 0 comments
Owner

Background

The phash (perceptual hash) image DB detects duplicate, marketing, and stock photos reused by scammers. It is currently per-user; sharing it across cloud users would improve detection quality over time.
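
For context, a minimal sketch of how a perceptual-hash lookup of this kind typically works: two fingerprints are compared by counting differing bits (Hamming distance), and a small distance suggests the same or a near-identical image. The 64-bit fingerprint size, the `max_distance=8` threshold, and the function names are assumptions for illustration; the ticket does not specify how the existing DB matches hashes.

```python
from typing import Iterable, List


def hamming_distance(a: int, b: int) -> int:
    """Count the bits that differ between two integer fingerprints."""
    return bin(a ^ b).count("1")


def find_matches(candidate: int, known: Iterable[int], max_distance: int = 8) -> List[int]:
    """Return known fingerprints within max_distance bits of the candidate.

    max_distance is a hypothetical threshold, not a value from the project.
    """
    return [h for h in known if hamming_distance(candidate, h) <= max_distance]
```

A larger pool of `known` fingerprints gives more chances to catch a reused photo, which is the motivation for sharing across users.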

Why this needs its own ticket

Unlike seller data (public eBay profiles) or scammer reports (user explicitly acts), adding image fingerprints to a shared pool is a passive data processing activity derived from user search behavior — even though the source images are public eBay listings.

  • A user searching for "laptop" does not implicitly consent to those listing images being fingerprinted into a community corpus
  • GDPR/CCPA require a lawful basis and disclosure even for publicly available data being repurposed into new derived databases
  • Default-off is the correct posture per the CF privacy-by-architecture principle

Proposed consent flow

  1. Opt-in toggle in Settings: "Help improve scam detection — contribute image fingerprints to the community database"
  2. Plain-language explanation before enabling:
    • What is stored: a short numeric fingerprint (not the image itself, not your search query)
    • What it is used for: detecting stock photos and duplicate listings reused by scammers
    • What is not stored: the image, the listing URL, your username, or any search history
    • Who can see it: no one — it is used only for automated comparison, never browsed or exported
  3. Consent recorded with timestamp; can be revoked (existing contributed hashes anonymized/deleted on request)
  4. Cloud-only feature — self-hosters control their own DB
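
The consent record in step 3 could be modeled roughly as follows. This is a sketch under stated assumptions: the class name `HashSharingConsent`, its fields, and the choice to keep the grant timestamp after revocation are all hypothetical, not part of the existing codebase.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Optional


@dataclass
class HashSharingConsent:
    """Hypothetical per-user consent record; default state is 'not opted in'."""
    user_id: str
    granted_at: Optional[datetime] = None
    revoked_at: Optional[datetime] = None

    def grant(self) -> None:
        # Record when the user opted in and clear any earlier revocation.
        self.granted_at = datetime.now(timezone.utc)
        self.revoked_at = None

    def revoke(self) -> None:
        # Only meaningful if consent was granted at some point.
        if self.granted_at is not None:
            self.revoked_at = datetime.now(timezone.utc)

    @property
    def active(self) -> bool:
        """True only while consent is granted and not revoked."""
        return self.granted_at is not None and self.revoked_at is None
```

Revocation here only flips the flag; deleting or anonymizing hashes that were already contributed would need a separate step that this sketch does not cover.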

Implementation notes

  • SNIPE_CONTRIBUTE_IMAGE_HASHES=true env or per-user DB flag
  • On opt-in: newly computed phashes written to shared.db in addition to per-user DB
  • Existing per-user hashes NOT retroactively shared — only hashes computed after opt-in
  • Consent UI: Settings page expander, same pattern as other tier/feature gates
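
The write path described in the notes above might look something like the sketch below. The table name `image_hashes`, the hex-text storage format, and the helper names are assumptions; only `SNIPE_CONTRIBUTE_IMAGE_HASHES` comes from this ticket. How the env flag and the per-user flag combine is left to the caller, since the ticket does not settle it.

```python
import sqlite3
from typing import Mapping


def env_flag_enabled(environ: Mapping[str, str]) -> bool:
    """True if SNIPE_CONTRIBUTE_IMAGE_HASHES is set to 'true' (case-insensitive)."""
    return environ.get("SNIPE_CONTRIBUTE_IMAGE_HASHES", "").strip().lower() == "true"


def init_db(conn: sqlite3.Connection) -> None:
    # Hypothetical schema: fingerprint only, stored as hex text so that
    # unsigned 64-bit values fit. No URL, username, or query is stored.
    conn.execute("CREATE TABLE IF NOT EXISTS image_hashes (phash TEXT PRIMARY KEY)")
    conn.commit()


def record_phash(user_db: sqlite3.Connection, shared_db: sqlite3.Connection,
                 phash: int, contribute: bool) -> None:
    """Always write to the per-user DB; write to the shared DB only when opted in."""
    row = (format(phash, "016x"),)
    user_db.execute("INSERT OR IGNORE INTO image_hashes (phash) VALUES (?)", row)
    user_db.commit()
    if contribute:
        shared_db.execute("INSERT OR IGNORE INTO image_hashes (phash) VALUES (?)", row)
        shared_db.commit()
```

Because the shared write happens only at computation time, hashes recorded before opt-in never reach the shared DB, which matches the "not retroactively shared" note.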
Label: backlog
Reference: Circuit-Forge/snipe#7