snipe/CLAUDE.md
pyr0ball bc930ac1b9 feat(snipe): eBay trust scoring MVP — search, filters, enrichment, comps
Core trust scoring:
- Five metadata signals (account age, feedback count/ratio, price vs market,
  category history), composited 0–100
- CV-based price signal suppression for heterogeneous search results
  (e.g. mixed laptop generations won't false-positive suspicious_price)
- Expanded scratch/dent title detection: evasive redirects, functional problem
  phrases, DIY/repair indicators
- Hard filters: new_account, established_bad_actor
- Soft flags: low_feedback, suspicious_price, duplicate_photo, scratch_dent,
  long_on_market, significant_price_drop

Search & filtering:
- Browse API adapter (up to 200 items/page) + Playwright scraper fallback
- OR-group query expansion for comprehensive variant coverage
- Must-include (AND/ANY/groups), must-exclude, category, price range filters
- Saved searches with full filter round-trip via URL params

Seller enrichment:
- Background BTF /itm/ scraping for account age (Kasada-safe headed Chromium)
- On-demand enrichment: POST /api/enrich + ListingCard ↻ button
- Category history derived from Browse API categories field (free, no extra calls)
- Shopping API GetUserProfile inline enrichment for API adapter

Market comps:
- eBay Marketplace Insights API with Browse API fallback (catches 403 + 404)
- Comps prioritised in ThreadPoolExecutor (submitted first)

Infrastructure:
- Staging DB fields: times_seen, first_seen_at, price_at_first_seen, category_name
- Migrations 004 (staging tracking) + 005 (listing category)
- eBay webhook handler stub
- Cloud compose stack (compose.cloud.yml)
- Vue frontend: search store, saved searches store, ListingCard, filter sidebar

Docs:
- README fully rewritten to reflect MVP status + full feature documentation
- Roadmap table linked to all 13 Forgejo issues
2026-03-26 23:37:09 -07:00

# Snipe — Developer Context
> eBay listing monitor with seller trust scoring and auction sniping.
## Stack
| Layer | Tech | Port |
|-------|------|------|
| Frontend | Vue 3 + Pinia + UnoCSS + Vite (nginx) | 8509 |
| API | FastAPI (uvicorn, `network_mode: host`) | 8510 |
| Scraper | Playwright + playwright-stealth + Xvfb | — |
| DB | SQLite (`data/snipe.db`) | — |
| Core | circuitforge-core (editable install) | — |
## CLI
```bash
./manage.sh start|stop|restart|status|logs|open|build|test
```
## Docker
```bash
docker compose up -d # start
docker compose build api web # rebuild after Python/Vue changes
docker compose logs -f api # tail API logs
```
`compose.override.yml` bind-mounts `./tests` and `./app` for hot reload.
nginx proxies `/api/` → `172.17.0.1:8510` (Docker bridge IP — api uses host networking).
## Critical Gotchas
**Kasada bot protection:** eBay blocks `requests`, `curl_cffi`, headless Playwright, and all `/usr/` and `/fdbk/` seller profile pages. Only headed Chromium via Xvfb passes. `/itm/` listing pages DO load and contain a BTF (below the fold) seller card with "Joined {Mon} {Year}" — use this for account age enrichment.
**Xvfb display counter:** Module-level `itertools.cycle(range(200, 300))` issues unique display numbers (`:200`–`:299`) per `_get()` call to prevent lock file collisions when multiple Playwright sessions run in parallel.
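
The display counter described above can be sketched roughly as follows. The names `next_display`, `_display_counter`, and `_display_lock` are hypothetical, and the lock is an added assumption to make the hand-out explicit under concurrency; the real `_get()` may do this differently.

```python
import itertools
import threading

# Module-level counter: yields 200..299, then wraps back to 200.
_display_counter = itertools.cycle(range(200, 300))
# Assumption: a lock guards next() so parallel callers never interleave.
_display_lock = threading.Lock()


def next_display() -> str:
    """Return the next X display name, e.g. ':200'."""
    with _display_lock:
        return f":{next(_display_counter)}"
```

Because the counter wraps, numbers are only unique within a window of 100 consecutive calls, which is enough as long as fewer than 100 Xvfb sessions are alive at once.
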
**HTML cache:** 5-minute in-memory cache keyed by full URL. Prevents duplicate 15s Playwright scrapes within one session. Cleared on restart.
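
A minimal sketch of a URL-keyed, time-limited in-memory cache like the one described above. `cached_fetch`, `_html_cache`, and the injectable `now` clock are hypothetical names for illustration, not the scraper's actual API.

```python
import time
from typing import Callable, Dict, Optional, Tuple

CACHE_TTL_SECONDS = 5 * 60  # 5-minute lifetime, as described above

# url -> (expiry timestamp, html); lives only in process memory,
# so it is empty again after a restart.
_html_cache: Dict[str, Tuple[float, str]] = {}


def cached_fetch(
    url: str,
    fetch: Callable[[str], str],
    now: Optional[Callable[[], float]] = None,
) -> str:
    """Return cached HTML for url; call fetch(url) only on a miss or expiry."""
    clock = now or time.monotonic
    entry = _html_cache.get(url)
    if entry is not None and entry[0] > clock():
        return entry[1]  # still fresh: skip the slow scrape
    html = fetch(url)
    _html_cache[url] = (clock() + CACHE_TTL_SECONDS, html)
    return html
```

Keying by the full URL means two searches that differ only in a query parameter are cached separately.
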
**SQLite thread safety:** Each concurrent thread (search + comps run in parallel) must have its own `Store` instance. By default, a connection returned by `sqlite3.connect()` can only be used in the thread that created it. See `api/main.py`.
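
The one-`Store`-per-thread rule can be illustrated with the sketch below. This `Store` class is a hypothetical stand-in with a made-up `seen` table, not the project's real `app/db/store.py`; the point is only that each worker opens and closes its own connection.

```python
import os
import sqlite3
import tempfile
from concurrent.futures import ThreadPoolExecutor


class Store:
    """Hypothetical stand-in: one SQLite connection per instance."""

    def __init__(self, db_path: str) -> None:
        # Created in the calling thread and only ever used there.
        self.conn = sqlite3.connect(db_path)
        self.conn.execute("CREATE TABLE IF NOT EXISTS seen (task TEXT)")

    def record(self, task: str) -> None:
        with self.conn:  # commits on success, rolls back on error
            self.conn.execute("INSERT INTO seen (task) VALUES (?)", (task,))

    def close(self) -> None:
        self.conn.close()


def run_task(db_path: str, task: str) -> str:
    store = Store(db_path)  # fresh Store per worker thread, never shared
    try:
        store.record(task)
    finally:
        store.close()
    return task


def run_parallel(db_path: str) -> list:
    with ThreadPoolExecutor(max_workers=2) as pool:
        futures = [pool.submit(run_task, db_path, t) for t in ("comps", "search")]
        return [f.result() for f in futures]
```

Sharing a single `Store` across both workers would instead raise `sqlite3.ProgrammingError` with the default `check_same_thread=True`.
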
**nginx rebuild gotcha:** nginx config is baked into the image at build time. After editing `docker/web/nginx.conf`, always `docker compose build web`.
**Playwright imports are lazy:** `sync_playwright` and `Stealth` import inside `_get()` — not at module level — so the pure parsing functions (`scrape_listings`, `scrape_sellers`) can be imported on the host without Docker's browser stack installed.
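
A rough sketch of the lazy-import pattern described above. `scrape_titles` is a hypothetical stand-in for the real parsing functions, and this `_get()` is simplified: it omits the Stealth, Xvfb, and caching steps the actual scraper uses.

```python
import re
from typing import List


def scrape_titles(html: str) -> List[str]:
    """Pure parsing: works on any host, no browser stack required."""
    return re.findall(r"<h3[^>]*>(.*?)</h3>", html, flags=re.S)


def _get(url: str) -> str:
    """Fetch a page with headed Chromium; Playwright is imported at call time."""
    # Imported here rather than at module top, so importing this module
    # (and testing scrape_titles) never requires Playwright to be installed.
    from playwright.sync_api import sync_playwright

    with sync_playwright() as p:
        browser = p.chromium.launch(headless=False)
        try:
            page = browser.new_page()
            page.goto(url)
            return page.content()
        finally:
            browser.close()
```

With this layout, a host-side test can import the module and call `scrape_titles` on fixture HTML; an `ImportError` would only surface if `_get()` itself is called without Playwright present.
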
## DB Migrations
Auto-applied by `Store.__init__()` via `circuitforge_core.db.run_migrations`.
Migration files: `app/db/migrations/001_init.sql`, `002_buying_format.sql`, `003_nullable_account_age.sql`
## Tests
```bash
# Host (no Docker needed — pure parsing tests)
conda run -n job-seeker python -m pytest tests/ -v --ignore=tests/test_integration.py
# In container
./manage.sh test
```
48 tests. Scraper tests run on host thanks to lazy Playwright imports.
## Trust Scoring Architecture
```
TrustScorer
├── MetadataScorer → 5 signals × 0–20 = 0–100 composite
│ account_age, feedback_count, feedback_ratio,
│ price_vs_market (vs sold comps), category_history
├── PhotoScorer → phash dedup (free); vision analysis (paid stub)
└── Aggregator → composite score, red flags, hard filters
```
**Red flag sentinel gotcha:** `signal_scores` uses `None` for missing data; `clean` dict substitutes `None → 0` for arithmetic. Always check `signal_scores.get("key")` (not `clean["key"]`) when gating hard-filter flags — otherwise absent data fires false positives.
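
The composite arithmetic and the sentinel gotcha can be sketched together as below. The `aggregate` function and the rule "an `account_age` score of 0 raises `new_account`" are assumptions for illustration only; the real thresholds live in `app/trust/aggregator.py`.

```python
from typing import Dict, List, Optional, Tuple

SIGNALS = (
    "account_age",
    "feedback_count",
    "feedback_ratio",
    "price_vs_market",
    "category_history",
)


def aggregate(signal_scores: Dict[str, Optional[int]]) -> Tuple[int, List[str]]:
    """Return (composite 0-100, flags) from per-signal scores of 0-20 or None."""
    # None means "no data"; substitute 0 only for the arithmetic.
    clean = {name: signal_scores.get(name) or 0 for name in SIGNALS}
    composite = sum(clean.values())

    flags: List[str] = []
    # Gate on the raw value: missing account age must not look like a new account.
    account_age = signal_scores.get("account_age")
    if account_age is not None and account_age == 0:  # assumed rule
        flags.append("new_account")
    return composite, flags
```

Checking `clean["account_age"] == 0` instead would flag every seller whose account age has not been enriched yet, which is the false positive the gotcha warns about.
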
## Key Files
| File | Purpose |
|------|---------|
| `api/main.py` | FastAPI endpoint — parallel search+comps, serialization |
| `app/platforms/ebay/scraper.py` | Playwright scraper, HTML cache, page parser |
| `app/trust/aggregator.py` | Composite score, red flags, hard filters |
| `app/trust/metadata.py` | 5 metadata signals |
| `app/db/store.py` | SQLite read/write (batch methods) |
| `web/src/views/SearchView.vue` | Filter sidebar + results layout |
| `web/src/stores/search.ts` | Pinia store — API calls, result state |
| `web/src/components/ListingCard.vue` | Listing card + auction dim style |
| `web/src/assets/theme.css` | Central theme (CSS custom properties) |
## Pending Work
- **Seller enrichment** — The BTF `/itm/` scrape (for `account_age_days`) and the `_ssn` search page scrape (for `category_history_json`) are both implemented and run together in one background daemon thread (`_trigger_scraper_enrichment`). For the API adapter, `enrich_sellers_shopping_api()` fills `account_age_days` inline via Shopping API `GetUserProfile` (app-level Bearer token, no user OAuth). Because scraper enrichment runs in the background, full scores appear on the second search. To jump the queue, use on-demand enrichment: `POST /api/enrich` or the ↻ button on ListingCard.
- **SSE/WebSocket live score push** — currently enriched data only appears on re-search. Future: background enrichment threads emit events via SSE or WebSocket; frontend updates scores live without re-search. Tracked: Circuit-Forge/snipe#1.
- **"Connect eBay Account" OAuth** — Trading API `GetUser` returns `RegistrationDate` + `sellerFeedbackSummary.feedbackByCategory` for any public seller, but only with a **User Access Token** (OAuth Authorization Code flow). Defer until eBay OAuth is generalized across the menagerie. Tracked: Circuit-Forge/snipe#2.
- **Scammer database + batch eBay Trust & Safety reporting** — local blocklist, batch report deep-links, CF community blocklist (cloud). Tracked: Circuit-Forge/snipe#4.
- **Snipe scheduling** — configurable bid-time offset, human approval gate