Cloud: shared seller/scammer DB across users (public data, no re-scraping) #6

Open
opened 2026-03-26 22:28:59 -07:00 by pyr0ball · 0 comments
Owner

Problem

In the cloud deployment, each user session operates against an isolated DB. As a result, seller enrichment (account age, category history) is re-scraped for every user even though the data is entirely public, and a scammer report filed by one user never benefits anyone else.

Proposal

Split the DB into two tiers:

Shared DB (single instance, all sessions)

  • sellers — public eBay seller profiles (account age, feedback, category history)
  • scammer_blocklist — community-reported bad actors (consent at report time; user explicitly submits)
  • market_comps — query-keyed median prices (public data, cache shared = fewer Browse API calls)

Per-user DB (isolated per session/account)

  • listings — search results, staging tracking, price history
  • saved_searches — user bookmarks
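A minimal sketch of how the two tiers could be wired together, assuming both DBs are SQLite files and the shared one is attached to each per-user connection. The column names, the `open_user_session` helper, and the `shared` schema alias are hypothetical; only the five table names come from this proposal.

```python
import sqlite3
from contextlib import closing

# Shared tier: public seller data, community blocklist, market comps.
# All columns are illustrative assumptions, not the project's real schema.
SHARED_SCHEMA = """
CREATE TABLE IF NOT EXISTS sellers (
    seller_platform_id TEXT PRIMARY KEY,
    account_age_days   INTEGER,
    feedback_score     INTEGER,
    category_history   TEXT,            -- assumed to be JSON-encoded
    fetched_at         REAL NOT NULL    -- Unix timestamp
);
CREATE TABLE IF NOT EXISTS scammer_blocklist (
    seller_platform_id TEXT NOT NULL,
    reporter_hash      TEXT NOT NULL,
    reason             TEXT NOT NULL,
    reported_at        REAL NOT NULL,
    PRIMARY KEY (seller_platform_id, reporter_hash)
);
CREATE TABLE IF NOT EXISTS market_comps (
    query        TEXT PRIMARY KEY,
    median_price REAL NOT NULL,
    fetched_at   REAL NOT NULL
);
"""

# Per-user tier: anything derived from what this user searched for.
USER_SCHEMA = """
CREATE TABLE IF NOT EXISTS listings (
    listing_id         TEXT PRIMARY KEY,
    seller_platform_id TEXT,
    price              REAL
);
CREATE TABLE IF NOT EXISTS saved_searches (
    id    INTEGER PRIMARY KEY,
    query TEXT NOT NULL
);
"""


def open_user_session(user_db_path: str, shared_db_path: str) -> sqlite3.Connection:
    """Open the per-user DB and attach the shared DB under the alias `shared`."""
    # Make sure the shared tables exist before any session attaches the file.
    with closing(sqlite3.connect(shared_db_path)) as shared:
        shared.executescript(SHARED_SCHEMA)

    conn = sqlite3.connect(user_db_path)
    conn.executescript(USER_SCHEMA)
    # ATTACH must run outside a transaction; nothing is pending at this point.
    conn.execute("ATTACH DATABASE ? AS shared", (shared_db_path,))
    return conn
```

With this layout a query can join `listings` against `shared.sellers` on one connection, while two users' `listings` tables never share a file.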

Privacy posture

  • Shared DB contains only public eBay data + anonymized scammer reports
  • Scammer submissions: user_id (hashed) + seller_platform_id + flag reason — no search history, no PII
  • Consent at report time (user explicitly acts) — no passive data collection
  • CLOUD_DATA_ROOT already exists for per-user data; shared DB at a fixed path (e.g. /devl/snipe-cloud-data/shared.db)
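The shape of a scammer submission described above can be sketched as follows. The proposal only says the user_id is hashed; using a keyed HMAC-SHA256 with a server-side secret is an assumption made here, chosen because an unkeyed hash of a small ID space can be reversed by enumeration. `ScammerReport`, `hash_user_id`, and `build_report` are hypothetical names.

```python
import hashlib
import hmac
from dataclasses import dataclass


@dataclass(frozen=True)
class ScammerReport:
    """The only fields that reach the shared DB: no search history, no PII."""
    reporter_hash: str
    seller_platform_id: str
    reason: str


def hash_user_id(user_id: str, secret: bytes) -> str:
    # Keyed hash (assumption): the same user always maps to the same value,
    # but the mapping cannot be recomputed without the server-side secret.
    return hmac.new(secret, user_id.encode("utf-8"), hashlib.sha256).hexdigest()


def build_report(user_id: str, seller_platform_id: str, reason: str,
                 secret: bytes) -> ScammerReport:
    """Build a report row; called only when the user explicitly submits one."""
    reason = reason.strip()
    if not reason:
        raise ValueError("a flag reason is required")
    return ScammerReport(
        reporter_hash=hash_user_id(user_id, secret),
        seller_platform_id=seller_platform_id,
        reason=reason,
    )
```

Keeping the hash stable per user would let the shared DB deduplicate repeat reports from one account without storing the account itself.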

Image hash DB — deliberately excluded

The phash/perceptual hash image DB requires a separate consent design. Even though source images are public eBay listings, adding fingerprints to a shared pool is a data processing activity derived from user search behavior. This requires explicit opt-in ("Help improve scam detection by contributing image fingerprints") with clear disclosure. Tracked separately.

Benefits

  • Seller enrichment amortized across all users — first searcher pays the Playwright cost, everyone after gets it free
  • Community scammer DB grows with usage
  • Fewer Browse API calls for market comps
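The amortization in the first bullet amounts to a read-through cache in front of the scraper. A rough sketch, assuming a SQLite `sellers` table with illustrative columns: `get_seller` and the `scrape` callable (standing in for the Playwright step) are hypothetical, and the `max_age_s` staleness window is an added assumption that this proposal does not specify.

```python
import sqlite3
import time
from typing import Callable, Optional

# Illustrative subset of the shared `sellers` table (columns are assumptions).
SELLERS_DDL = """
CREATE TABLE IF NOT EXISTS sellers (
    seller_platform_id TEXT PRIMARY KEY,
    account_age_days   INTEGER,
    feedback_score     INTEGER,
    fetched_at         REAL NOT NULL
)
"""


def get_seller(conn: sqlite3.Connection, seller_id: str,
               scrape: Callable[[str], dict],
               max_age_s: float = 7 * 24 * 3600.0,
               now: Optional[float] = None) -> dict:
    """Return a seller profile, scraping only on a miss or a stale row."""
    now = time.time() if now is None else now
    row = conn.execute(
        "SELECT account_age_days, feedback_score, fetched_at "
        "FROM sellers WHERE seller_platform_id = ?",
        (seller_id,),
    ).fetchone()
    if row is not None and now - row[2] <= max_age_s:
        # Cache hit: some earlier session already paid the scraping cost.
        return {"account_age_days": row[0], "feedback_score": row[1]}

    # Miss or stale: scrape once and store the result for every later session.
    profile = scrape(seller_id)
    conn.execute(
        "INSERT OR REPLACE INTO sellers "
        "(seller_platform_id, account_age_days, feedback_score, fetched_at) "
        "VALUES (?, ?, ?, ?)",
        (seller_id, profile["account_age_days"], profile["feedback_score"], now),
    )
    conn.commit()
    return {
        "account_age_days": profile["account_age_days"],
        "feedback_score": profile["feedback_score"],
    }
```

The same pattern would apply to `market_comps`, with the query string as the key and the Browse API call in place of `scrape`.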
pyr0ball added this to the Public Launch milestone 2026-04-04 16:33:19 -07:00

Reference: Circuit-Forge/snipe#6