feat: contribute seller trust signals to community DB for fine-tuning #32

Closed

opened 2026-04-12 16:17:38 -07:00 by pyr0ball · 0 comments

Overview

Feed resolved seller trust outcomes (confirmed scam, confirmed legitimate) from the shared scammer DB into the cf-core community DB as seller_trust signals. These signals form the fine-tuning corpus for a seller trust classifier, which improves trust scoring for all Snipe users.

Background

Snipe already maintains a shared scammer DB with seller feature vectors and trust scores. When an outcome is confirmed (a user reports a scam, or a seller is resolved as legitimate over time), the labeled signal becomes training data. The cf-core community module (Kiwi shared meal plan design, 2026-04-12) provides the CommunitySignal base model; cf-orch handles routing to the fine-tuning queue.

Signal schema (`seller_trust`)

{
  "signal_type": "seller_trust",
  "label": "scam" | "legitimate" | "uncertain",
  "features": {
    "platform": "ebay" | "ct_bids" | str,
    "account_age_days": int,
    "feedback_score": int | None,
    "positive_feedback_pct": float | None,
    "photo_similarity_score": float,   # 0.0–1.0 (stolen photo signal)
    "price_outlier_z": float,           # z-score vs category median
    "listing_count": int,
    "duplicate_listing_count": int,
    "new_account": bool,
    "no_returns_policy": bool,
    "payment_methods": list[str],
  },
  "outcome": "confirmed_scam" | "confirmed_legit" | "no_response" | None,
  "outcome_source": "user_report" | "platform_action" | "resolved_sale" | None
}

No buyer PII. Seller identifier is hashed (platform + seller_id → SHA-256). Feature vectors only.
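
A minimal sketch of how a signal could be built from a scammer DB record, assuming the record is a plain dict that carries `platform` and `seller_id`. The function names, the `:` separator in the hash input, and the `seller_hash` field are assumptions for illustration; the schema above does not say where the hashed identifier is stored.

```python
import hashlib
from typing import Any, Dict, Optional

# Feature names copied from the schema above; any other record key is dropped.
FEATURE_KEYS = (
    "platform", "account_age_days", "feedback_score", "positive_feedback_pct",
    "photo_similarity_score", "price_outlier_z", "listing_count",
    "duplicate_listing_count", "new_account", "no_returns_policy",
    "payment_methods",
)


def hash_seller(platform: str, seller_id: str) -> str:
    """SHA-256 hex digest over platform + seller_id (':' separator is an assumption)."""
    return hashlib.sha256(f"{platform}:{seller_id}".encode("utf-8")).hexdigest()


def build_seller_trust_signal(
    record: Dict[str, Any],
    label: str,
    outcome: Optional[str] = None,
    outcome_source: Optional[str] = None,
) -> Dict[str, Any]:
    """Build a seller_trust payload; the raw seller_id never enters the result."""
    # Missing features become None in this sketch; real validation may differ.
    features = {key: record.get(key) for key in FEATURE_KEYS}
    return {
        "signal_type": "seller_trust",
        "label": label,
        "features": features,
        "outcome": outcome,
        "outcome_source": outcome_source,
        # Hypothetical field name: the issue does not name the hash field.
        "seller_hash": hash_seller(record["platform"], record["seller_id"]),
    }


record = {"platform": "ebay", "seller_id": "abc123", "account_age_days": 4,
          "new_account": True, "payment_methods": ["wire"]}
signal = build_seller_trust_signal(record, "scam", "confirmed_scam", "user_report")
```

Copying features through an allowlist rather than deleting known-bad keys means a new identifying column in the scammer DB cannot leak into the payload by default.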

Contribution triggers

User reports a seller as scam after a transaction
Platform removes a seller listing (scraped signal)
Seller accumulates sufficient confirmed-legitimate transaction history (time-decay threshold)
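
The issue does not define the time-decay threshold in the third trigger. One possible reading, sketched here purely as an assumption, is an exponentially decayed count of confirmed-legitimate transactions compared against a cutoff. The half-life, the threshold value, and both function names are hypothetical.

```python
import math
from typing import Iterable


def decayed_legit_score(ages_days: Iterable[float], half_life_days: float = 90.0) -> float:
    """Sum one exponentially decayed weight per confirmed-legitimate sale.

    A sale made today counts as 1.0; a sale half_life_days ago counts as 0.5.
    """
    decay = math.log(2) / half_life_days
    return sum(math.exp(-decay * age) for age in ages_days)


def meets_legit_threshold(
    ages_days: Iterable[float],
    threshold: float = 5.0,          # hypothetical cutoff
    half_life_days: float = 90.0,    # hypothetical half-life
) -> bool:
    """True when recent legitimate history is strong enough to emit a signal."""
    return decayed_legit_score(ages_days, half_life_days) >= threshold
```

Under these assumed defaults, five sales completed today reach the threshold, while five sales from 90 days ago score only 2.5 and do not.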

Tier and consent

Scam reports: always contributed (anti-fraud, no opt-out; this is safety infrastructure)
Positive legitimacy signals: opt-in in Settings, off by default
Plain-language disclosure in onboarding: "Snipe pools scam reports across users to improve detection for everyone."
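
The consent rules above reduce to a small gate that runs before submission. This is a sketch, not the Snipe implementation: `should_contribute` and the `legit_opt_in` parameter are hypothetical names, and treating "uncertain" labels like legitimacy signals (opt-in required) is an assumption, since the issue only covers scam and legitimate labels.

```python
def should_contribute(label: str, legit_opt_in: bool = False) -> bool:
    """Decide whether a seller_trust signal may leave the device.

    Scam reports are always contributed. Every other label follows the
    Settings opt-in, which is off by default.
    """
    if label == "scam":
        return True
    # Assumption: "uncertain" is gated the same way as "legitimate".
    return legit_opt_in
```

Defaulting `legit_opt_in` to `False` keeps the off-by-default behavior even if a caller forgets to pass the setting.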

Integration points

CommunitySignal from cf-core.community
Submits to cf-orch POST /ingest/signals
Avocet labeling UI can flag uncertain signals for human review before entering fine-tuning
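
As a rough illustration of the submission path, the sketch below builds (but does not send) a JSON request for the `/ingest/signals` endpoint using only the standard library. The base URL, the JSON content type, and both helper names are assumptions; the issue specifies only the HTTP method and path, and it does not describe how signals are handed to Avocet.

```python
import json
import urllib.request
from typing import Any, Dict

# Hypothetical host; the issue gives only the path and method.
CF_ORCH_BASE_URL = "http://cf-orch.example.internal"


def build_ingest_request(
    signal: Dict[str, Any], base_url: str = CF_ORCH_BASE_URL
) -> urllib.request.Request:
    """Wrap a signal dict in a POST request for cf-orch's /ingest/signals."""
    body = json.dumps(signal).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url}/ingest/signals",
        data=body,
        headers={"Content-Type": "application/json"},  # assumed content type
        method="POST",
    )


def needs_human_review(signal: Dict[str, Any]) -> bool:
    """Assumed rule: signals labeled 'uncertain' are candidates for Avocet review."""
    return signal.get("label") == "uncertain"


request = build_ingest_request({"signal_type": "seller_trust", "label": "scam"})
# urllib.request.urlopen(request) would send it; omitted so the sketch has no side effects.
```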

Acceptance criteria

seller_trust signal constructed and submitted on confirmed scam report
Signal schema validated against CommunitySignal base model (no extra fields)
No raw (unhashed) seller identifier or buyer PII in transmitted signal
Opt-in toggle for legitimacy signals in Settings
Integration test: report a scam, verify signal in community DB with label=scam
Unit test: signal construction strips the raw seller identifier and keeps only the SHA-256 hash of platform + seller_id
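
The "no extra fields" and "no identifier" criteria could be checked with a helper along these lines. It is a sketch under stated assumptions: `seller_hash` is a hypothetical name for the hashed identifier field, the raw identifier key names are guesses, and the real check would presumably go through the `CommunitySignal` base model rather than a hand-written allowlist.

```python
from typing import Any, Dict, List

# Top-level keys from the seller_trust schema, plus a hypothetical hash field.
ALLOWED_TOP_LEVEL = {
    "signal_type", "label", "features", "outcome", "outcome_source", "seller_hash",
}
# Hypothetical names for fields that must never be transmitted.
RAW_IDENTIFIER_KEYS = {"seller_id", "seller_name", "buyer_id", "buyer_email"}


def find_violations(signal: Dict[str, Any]) -> List[str]:
    """Return human-readable problems; an empty list means the signal is clean."""
    problems = []
    for key in sorted(set(signal) - ALLOWED_TOP_LEVEL):
        problems.append(f"extra field: {key}")
    features = signal.get("features", {})
    for key in sorted(RAW_IDENTIFIER_KEYS & (set(signal) | set(features))):
        problems.append(f"raw identifier present: {key}")
    return problems


def test_clean_signal_passes() -> None:
    signal = {"signal_type": "seller_trust", "label": "scam",
              "features": {"platform": "ebay"}, "outcome": "confirmed_scam",
              "outcome_source": "user_report", "seller_hash": "0" * 64}
    assert find_violations(signal) == []


def test_raw_identifier_is_rejected() -> None:
    signal = {"signal_type": "seller_trust", "label": "scam",
              "features": {"platform": "ebay", "seller_id": "abc123"}}
    assert find_violations(signal) == ["raw identifier present: seller_id"]
```

The two test functions follow pytest naming so a runner would collect them without extra setup.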

Related

cf-core community module (2026-04-12)
Circuit-Forge/circuitforge-orch — cf-ingest service ticket
Circuit-Forge/avocet — cross-product labeling ticket
Existing shared scammer DB (Snipe shared_store pattern)
