Compare commits
3 commits
89d3862f62
...
d7c8a8bca6
| Author | SHA1 | Date | |
|---|---|---|---|
| d7c8a8bca6 | |||
| 108f63b4f2 | |||
| bccedb1fe5 |
7 changed files with 403 additions and 543 deletions
358
README.md
358
README.md
|
|
@ -1,29 +1,81 @@
|
||||||
# Snipe — Auction Sniping & Listing Intelligence
|
<!-- Logo coming soon — replace docs/snipe-logo.svg when final icon ships -->
|
||||||
|
<div align="center">
|
||||||
|
<img src="docs/snipe-logo.svg" alt="Snipe logo" width="120" />
|
||||||
|
|
||||||
> *Part of the Circuit Forge LLC "AI for the tasks you hate most" suite.*
|
# Snipe
|
||||||
|
|
||||||
**Status:** Active — eBay listing intelligence MVP complete; Mercari search + trust scoring live. Auction sniping engine and additional platforms are next.
|
**Auction intelligence and sniping for people who don't trust the platform.**
|
||||||
|
|
||||||
**[Documentation](https://docs.circuitforge.tech/snipe/)** · [circuitforge.tech](https://circuitforge.tech)
|
[](LICENSE)
|
||||||
|
[]()
|
||||||
|
[](https://git.opensourcesolarpunk.com/Circuit-Forge/snipe)
|
||||||
|
[](https://docs.circuitforge.tech/snipe)
|
||||||
|
|
||||||
## Quick install (self-hosted)
|
*Part of the Circuit Forge LLC suite — "AI for the tasks the system made hard on purpose."*
|
||||||
|
</div>
|
||||||
|
|
||||||
**Requirements:** Docker with Compose plugin, Git. No API keys needed to get started.
|
---
|
||||||
|
|
||||||
|

|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## Why Snipe?
|
||||||
|
|
||||||
|
Auction platforms are designed to make you act fast and trust blindly. The closing countdown, the hidden price history, the new-account seller with one feedback — all of it is structured against the buyer.
|
||||||
|
|
||||||
|
Snipe inverts that. Before you place a bid, you get a trust score built from five independently sourced signals: seller account age, feedback volume, feedback ratio, price versus recent completed sales, and category history. A hard-coded red flag for new accounts or bad actors overrides the composite. Soft flags surface buried damage disclosures, duplicate photos, and listings that have been sitting unsold for weeks. When the listing is priced well below market, you see a STEAL badge — sourced from eBay Marketplace Insights, not from the seller's description.
|
||||||
|
|
||||||
|
The sniping engine — precise last-second bid submission with NTP (network time protocol) synchronization and soft-close handling — is next on the roadmap. The intelligence layer is live now.
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## Features
|
||||||
|
|
||||||
|
### Listing intelligence (live)
|
||||||
|
|
||||||
|
- **Trust scoring** — five-signal composite score (0–100) per listing: account age, feedback count, feedback ratio, price vs. market, category history
|
||||||
|
- **Red flag detection** — hard flags for new accounts and established bad actors; soft flags for damage keywords, evasive language, duplicate photos, long-on-market listings, and significant price drops
|
||||||
|
- **Price vs. market** — listing price compared against completed-sale medians via eBay Marketplace Insights API (Browse API fallback)
|
||||||
|
- **Keyword filtering** — must-include (AND / ANY / OR-groups), must-exclude, category, price range; OR-groups expand into multiple targeted queries so eBay relevance doesn't silently drop variants
|
||||||
|
- **Saved searches** — one-click re-run that restores all filter settings
|
||||||
|
- **Background enrichment** — seller account age scraped via Playwright + Xvfb (Kasada/Cloudflare-safe headed Chromium); on-demand re-score per listing without re-searching
|
||||||
|
- **LLM query builder** — describe what you want in plain language; an LLM builds the search terms (paid tier)
|
||||||
|
- **Vision photo assessment** — condition scoring from listing photos via moondream2 locally or Claude vision (paid/cloud); VRAM-aware scheduling via circuitforge-core task scheduler
|
||||||
|
- **Affiliate link builder** — eBay Partner Network wrapping with user BYOK support and per-retailer disclosure
|
||||||
|
|
||||||
|
### Platforms
|
||||||
|
|
||||||
|
| Platform | Search | Trust scoring | Completed-sale comps |
|
||||||
|
|----------|--------|---------------|----------------------|
|
||||||
|
| **eBay** | Browse API + Playwright fallback | All 5 signals | Marketplace Insights + Browse fallback |
|
||||||
|
| **Mercari** | Playwright scraper | 3/5 signals (partial) | Phase 3 |
|
||||||
|
| CT Bids, HiBid, AuctionZip, Invaluable, GovPlanet, Bidsquare, Proxibid | Planned | Planned | Planned |
|
||||||
|
|
||||||
|
### Auction sniping engine (roadmap)
|
||||||
|
|
||||||
|
- NTP-synchronized last-second bid submission
|
||||||
|
- Soft-close detection and strategy adjustment
|
||||||
|
- Proxy bid ladder with configurable max
|
||||||
|
- Human approval gate before any bid executes
|
||||||
|
- Post-win workflow: payment routing, shipping coordination, provenance documentation
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## Quick Start
|
||||||
|
|
||||||
|
**Requirements:** Docker with Compose plugin, Git. No API keys required to get started.
|
||||||
|
|
||||||
```bash
|
```bash
|
||||||
# One-line install — clones to ~/snipe by default
|
# One-line install — clones to ~/snipe by default
|
||||||
bash <(curl -fsSL https://git.opensourcesolarpunk.com/Circuit-Forge/snipe/raw/branch/main/install.sh)
|
bash <(curl -fsSL https://git.opensourcesolarpunk.com/Circuit-Forge/snipe/raw/branch/main/install.sh)
|
||||||
|
|
||||||
# Or clone manually and run the script:
|
|
||||||
git clone https://git.opensourcesolarpunk.com/Circuit-Forge/snipe.git
|
|
||||||
bash snipe/install.sh
|
|
||||||
```
|
```
|
||||||
|
|
||||||
Then open **http://localhost:8509**.
|
Then open **http://localhost:8509**.
|
||||||
|
|
||||||
### Manual setup (if you prefer)
|
### Manual setup
|
||||||
|
|
||||||
Snipe's API image is built from a parent context that includes `circuitforge-core`. Both repos must sit as siblings in the same directory:
|
Snipe's API image builds from a parent context that includes `circuitforge-core`. Both repos must sit as siblings:
|
||||||
|
|
||||||
```
|
```
|
||||||
workspace/
|
workspace/
|
||||||
|
|
@ -36,286 +88,86 @@ mkdir snipe-workspace && cd snipe-workspace
|
||||||
git clone https://git.opensourcesolarpunk.com/Circuit-Forge/snipe.git
|
git clone https://git.opensourcesolarpunk.com/Circuit-Forge/snipe.git
|
||||||
git clone https://git.opensourcesolarpunk.com/Circuit-Forge/circuitforge-core.git
|
git clone https://git.opensourcesolarpunk.com/Circuit-Forge/circuitforge-core.git
|
||||||
cd snipe
|
cd snipe
|
||||||
cp .env.example .env # edit if you have eBay API credentials (optional)
|
cp .env.example .env # add eBay API credentials if you have them (optional)
|
||||||
./manage.sh start
|
./manage.sh start
|
||||||
```
|
```
|
||||||
|
|
||||||
### Optional: eBay API credentials
|
### Optional: eBay API credentials
|
||||||
|
|
||||||
Snipe works without any credentials using its Playwright scraper fallback. Adding eBay API credentials unlocks faster searches and inline seller account age (no extra scrape needed):
|
Snipe works without credentials using its Playwright scraper fallback. Adding credentials unlocks faster searches and inline seller account age without an extra scrape:
|
||||||
|
|
||||||
1. Register at [developer.ebay.com](https://developer.ebay.com/my/keys)
|
1. Register at [developer.ebay.com](https://developer.ebay.com/my/keys)
|
||||||
2. Copy your Production **App ID** and **Cert ID** into `.env`
|
2. Copy your Production **App ID** and **Cert ID** into `.env`
|
||||||
3. Restart: `./manage.sh restart`
|
3. `./manage.sh restart`
|
||||||
|
|
||||||
---
|
---
|
||||||
|
|
||||||
## What it does
|
## Tiers
|
||||||
|
|
||||||
Snipe has two layers that work together:
|
| Tier | What you get |
|
||||||
|
|------|-------------|
|
||||||
|
| **Free** | eBay + Mercari search, full trust scoring, keyword filtering, saved searches — local LLM only |
|
||||||
|
| **Paid** | LLM query builder, background saved-search monitoring with alerts, cloud LLM option |
|
||||||
|
| **Premium** | Vision photo condition assessment, fine-tuned trust models, multi-user |
|
||||||
|
| **Ultra** | Human-in-the-loop operator — handles CAPTCHAs, phone calls, anything automation can't |
|
||||||
|
|
||||||
**Layer 1 — Listing intelligence (MVP, implemented)**
|
License key format: `CFG-SNPE-XXXX-XXXX-XXXX`
|
||||||
Before you bid, Snipe tells you whether a listing is worth your time. It fetches eBay listings, scores each seller's trustworthiness across five signals, flags suspicious pricing relative to completed sales, and surfaces red flags like new accounts, cosmetic damage buried in titles, and listings that have been sitting unsold for weeks.
|
|
||||||
|
|
||||||
**Layer 2 — Auction sniping (roadmap)**
|
|
||||||
Snipe manages the bid itself: monitors listings across platforms, schedules last-second bids, handles soft-close extensions, and guides you through the post-win logistics (payment routing, shipping coordination, provenance documentation for antiques).
|
|
||||||
|
|
||||||
The name is the origin of the word "sniping" — common snipes are notoriously elusive birds, secretive and camouflaged, that flush suddenly from cover. Shooting one required extreme patience, stillness, and a precise last-second shot. That's the auction strategy.
|
|
||||||
|
|
||||||
---
|
---
|
||||||
|
|
||||||
## Screenshots
|
|
||||||
|
|
||||||
**Landing page — no account required**
|
|
||||||

|
|
||||||
|
|
||||||
**Search results with trust scores**
|
|
||||||

|
|
||||||
|
|
||||||
**STEAL badge — price significantly below market**
|
|
||||||

|
|
||||||
|
|
||||||
> Red flag and Triple Red screenshots coming — captured opportunistically from real scammy listings.
|
|
||||||
|
|
||||||
---
|
|
||||||
|
|
||||||
## Implemented: Listing Intelligence
|
|
||||||
|
|
||||||
### Supported platforms
|
|
||||||
|
|
||||||
| Platform | Search | Trust scoring | Completed-sales comps |
|
|
||||||
|----------|--------|---------------|-----------------------|
|
|
||||||
| **eBay** | ✅ Browse API + Playwright fallback | ✅ All 5 signals | ✅ Marketplace Insights + Browse fallback |
|
|
||||||
| **Mercari** | ✅ Playwright scraper | ✅ Partial (3/5 signals) | ⏳ Phase 3 |
|
|
||||||
|
|
||||||
Switch between platforms via the tab picker in the search UI. All platforms share the same Playwright + Xvfb scraping stack (Cloudflare/Kasada-safe headed Chromium).
|
|
||||||
|
|
||||||
### eBay Listing Intelligence
|
|
||||||
|
|
||||||
### Search & filtering
|
|
||||||
- Full-text eBay search via Browse API (with Playwright scraper fallback when no API credentials configured)
|
|
||||||
- Price range, must-include keywords (AND / ANY / OR-groups mode), must-exclude terms, eBay category filter
|
|
||||||
- OR-group mode expands keyword combinations into multiple targeted queries and deduplicates results — eBay relevance won't silently drop variants
|
|
||||||
- Pages-to-fetch control: each Browse API page returns up to 200 listings
|
|
||||||
- Saved searches with one-click re-run that restores all filter settings
|
|
||||||
|
|
||||||
### Seller trust scoring
|
|
||||||
Five signals, each scored 0–20, composited to 0–100:
|
|
||||||
|
|
||||||
| Signal | What it measures |
|
|
||||||
|--------|-----------------|
|
|
||||||
| `account_age` | Days since eBay account registration |
|
|
||||||
| `feedback_count` | Total feedback received |
|
|
||||||
| `feedback_ratio` | Positive feedback percentage |
|
|
||||||
| `price_vs_market` | Listing price vs. median of recent completed sales |
|
|
||||||
| `category_history` | Whether seller has history selling in this category |
|
|
||||||
|
|
||||||
Scores are marked **partial** when signals are unavailable (e.g. account age not yet enriched). Partial scores are displayed with a visual indicator rather than penalizing the seller for missing data.
|
|
||||||
|
|
||||||
### Red flags
|
|
||||||
Hard filters that override the composite score:
|
|
||||||
- `new_account` — account registered within 7 days
|
|
||||||
- `established_bad_actor` — feedback ratio < 80% with 20+ reviews
|
|
||||||
|
|
||||||
Soft flags surfaced as warnings:
|
|
||||||
- `account_under_30_days` — account under 30 days old
|
|
||||||
- `low_feedback_count` — fewer than 10 reviews
|
|
||||||
- `suspicious_price` — listing price below 50% of market median *(suppressed automatically when the search returns a heterogeneous price distribution — e.g. mixed laptop generations — to prevent false positives)*
|
|
||||||
- `duplicate_photo` — same image found on another listing (perceptual hash)
|
|
||||||
- `scratch_dent_mentioned` — title keywords indicating cosmetic damage, functional problems, or evasive language (see below)
|
|
||||||
- `long_on_market` — listing has been seen 5+ times over 14+ days without selling
|
|
||||||
- `significant_price_drop` — current price more than 20% below first-seen price
|
|
||||||
|
|
||||||
### Scratch & dent title detection
|
|
||||||
Scans listing titles for signals the item may have undisclosed damage or problems:
|
|
||||||
- **Explicit damage**: scratch, scuff, dent, crack, chip, blemish, worn
|
|
||||||
- **Condition catch-alls**: as is, for parts, parts only, spares or repair
|
|
||||||
- **Evasive redirects**: "see description", "read description", "see photos for" (seller hiding damage detail in listing body)
|
|
||||||
- **Functional problems**: "not working", "stopped working", "no power", "dead on arrival", "powers on but", "faulty", "broken screen/hinge/port"
|
|
||||||
- **DIY/repair listings**: "needs repair", "needs tlc", "project laptop", "for repair", "sold as is"
|
|
||||||
|
|
||||||
### Seller enrichment
|
|
||||||
- **Inline (API adapter)**: account age filled from Browse API `registrationDate` field
|
|
||||||
- **Background (scraper)**: `/itm/` listing pages scraped for seller "Joined" date via Playwright + Xvfb (Kasada-safe headed Chromium)
|
|
||||||
- **On-demand**: ↻ button on any listing card triggers `POST /api/enrich` — runs enrichment and re-scores without waiting for a second search
|
|
||||||
- **Category history**: derived from the seller's accumulated listing data (Browse API `categories` field); improves with every search, no extra API calls
|
|
||||||
|
|
||||||
### Affiliate link builder
|
|
||||||
|
|
||||||
Listing cards surface eBay affiliate-wrapped URLs. Uses `circuitforge_core.affiliates.wrap_url` — resolution order: user opted out → plain URL; user has BYOK affiliate ID → their ID; CF env var set (`EBAY_AFFILIATE_ID`) → CF's ID; otherwise plain URL. Users can configure their own eBay Partner Network ID or opt out entirely in Settings.
|
|
||||||
|
|
||||||
Disclosure tooltip appears on first encounter per-session and on each wrapped link (per-retailer copy from `get_disclosure_text`).
|
|
||||||
|
|
||||||
### Feedback FAB
|
|
||||||
|
|
||||||
In-app feedback button (bottom-right FAB) opens a modal: title, description, optional screenshot. Posts to the CF feedback endpoint. Status probed on load; FAB hidden if endpoint unreachable.
|
|
||||||
|
|
||||||
### Vision task scheduling
|
|
||||||
|
|
||||||
Photo condition assessment tasks queued through `circuitforge_core.tasks.TaskScheduler` — VRAM-aware slot management shared with any other LLM workloads on the same host. Runs moondream2 locally (free tier) or Claude vision (paid/cloud). Results stored per-listing and update the trust score card.
|
|
||||||
|
|
||||||
### Market price comparison
|
|
||||||
Completed sales fetched via eBay Marketplace Insights API (with Browse API fallback for app tiers that don't have Insights access). Median stored per query hash, used to score `price_vs_market` across all listings in a search.
|
|
||||||
|
|
||||||
### Adapters
|
|
||||||
| Adapter | When used | Signals available |
|
|
||||||
|---------|-----------|-------------------|
|
|
||||||
| Browse API (`api`) | eBay API credentials configured | All signals; account age inline |
|
|
||||||
| Playwright scraper (`scraper`) | No credentials / forced | All signals except account age (async BTF enrichment) |
|
|
||||||
| `auto` (default) | — | API if credentials present, scraper otherwise |
|
|
||||||
|
|
||||||
### Mercari Listing Intelligence
|
|
||||||
|
|
||||||
Search Mercari US via headed Chromium + playwright-stealth, bypassing Cloudflare Turnstile. Uses the same `BrowserPool` as the eBay scraper.
|
|
||||||
|
|
||||||
**Trust signal coverage:**
|
|
||||||
|
|
||||||
| Signal | Source | Available |
|
|
||||||
|--------|--------|-----------|
|
|
||||||
| `feedback_count` | `NumSales` on listing page | ✅ |
|
|
||||||
| `feedback_ratio` | `ReviewStarsWrapper[data-stars]` ÷ 5 | ✅ |
|
|
||||||
| `price_vs_market` | Computed from comps (Phase 3) | ⏳ |
|
|
||||||
| `account_age_days` | Seller profile page (not yet fetched) | ❌ |
|
|
||||||
| `category_history` | Not exposed in Mercari HTML | ❌ |
|
|
||||||
|
|
||||||
All Mercari scores are marked **partial** (`score_is_partial=True`) because account age and category history are unavailable. The trust scorer handles partial scores correctly — missing signals don't penalise the seller.
|
|
||||||
|
|
||||||
**Design note:** `seller_platform_id` stores the Mercari `product_id` (e.g. `m86032668393`) rather than the seller username, because seller identity isn't available from search results HTML. `get_seller()` resolves the product ID by fetching the individual listing page.
|
|
||||||
|
|
||||||
---
|
|
||||||
|
|
||||||
## Stack
|
|
||||||
|
|
||||||
| Layer | Tech | Port |
|
|
||||||
|-------|------|------|
|
|
||||||
| Frontend | Vue 3 + Pinia + UnoCSS + Vite (nginx) | 8509 |
|
|
||||||
| API | FastAPI (uvicorn) | 8510 |
|
|
||||||
| Scraper | Playwright + playwright-stealth + Xvfb | — |
|
|
||||||
| DB | SQLite (`data/snipe.db`) | — |
|
|
||||||
| Core | circuitforge-core (editable install) | — |
|
|
||||||
|
|
||||||
## Running
|
## Running
|
||||||
|
|
||||||
```bash
|
```bash
|
||||||
./manage.sh start # start all services
|
./manage.sh start # start all services
|
||||||
./manage.sh stop # stop
|
./manage.sh stop # stop
|
||||||
|
./manage.sh restart # restart
|
||||||
./manage.sh logs # tail logs
|
./manage.sh logs # tail logs
|
||||||
./manage.sh open # open in browser
|
./manage.sh open # open in browser
|
||||||
```
|
```
|
||||||
|
|
||||||
Cloud stack (shared DB, multi-user):
|
---
|
||||||
```bash
|
|
||||||
docker compose -f compose.cloud.yml -p snipe-cloud up -d
|
## Stack
|
||||||
docker compose -f compose.cloud.yml -p snipe-cloud build api # after Python changes
|
|
||||||
```
|
| Layer | Technology | Port |
|
||||||
|
|-------|-----------|------|
|
||||||
|
| Frontend | Vue 3 + Pinia + UnoCSS + Vite (served via nginx) | 8509 |
|
||||||
|
| API | FastAPI (uvicorn) | 8510 |
|
||||||
|
| Scraper | Playwright + playwright-stealth + Xvfb (Kasada/Cloudflare-safe headed Chromium) | — |
|
||||||
|
| Database | SQLite (`data/snipe.db`) | — |
|
||||||
|
| Core | circuitforge-core (editable install) | — |
|
||||||
|
|
||||||
|
The scraper stack uses headed Chromium via Xvfb (X virtual framebuffer) with playwright-stealth for all platform access. Headless and `requests`-based approaches are blocked by eBay and Mercari.
|
||||||
|
|
||||||
---
|
---
|
||||||
|
|
||||||
## Roadmap
|
## Documentation
|
||||||
|
|
||||||
### Intelligence features
|
Full documentation at **[docs.circuitforge.tech/snipe](https://docs.circuitforge.tech/snipe)** — setup guide, trust scoring algorithm, platform adapter reference, API docs, and self-hosting notes.
|
||||||
|
|
||||||
| Issue | Feature |
|
|
||||||
|-------|---------|
|
|
||||||
| [#5](https://git.opensourcesolarpunk.com/Circuit-Forge/snipe/issues/5) | UPC/product lookup → LLM-crafted search terms (paid tier) |
|
|
||||||
| [#12](https://git.opensourcesolarpunk.com/Circuit-Forge/snipe/issues/12) | Background saved-search monitoring with configurable alerts |
|
|
||||||
| [#21](https://git.opensourcesolarpunk.com/Circuit-Forge/snipe/issues/21) | Vision classification pipeline — condition scoring, listing quality, fraud signals |
|
|
||||||
| [#43](https://git.opensourcesolarpunk.com/Circuit-Forge/snipe/issues/43) | Wire photo analysis task to cf-orch (VRAM-aware scheduling) |
|
|
||||||
| [#51](https://git.opensourcesolarpunk.com/Circuit-Forge/snipe/issues/51) | Reranker: semantic filter before trust scoring |
|
|
||||||
| [#52](https://git.opensourcesolarpunk.com/Circuit-Forge/snipe/issues/52) | Trust score fix: exclude buyer-only feedback from `feedback_count` |
|
|
||||||
| [#41](https://git.opensourcesolarpunk.com/Circuit-Forge/snipe/issues/41) | Additional theme variants — solarized, high-contrast, colorblind-safe |
|
|
||||||
|
|
||||||
### Platform expansion
|
|
||||||
|
|
||||||
| Issue | Feature |
|
|
||||||
|-------|---------|
|
|
||||||
| ✅ shipped | Mercari US — search + partial trust scoring |
|
|
||||||
| [#53](https://git.opensourcesolarpunk.com/Circuit-Forge/snipe/issues/53) | BrowserPool thread-safety — eliminate per-request cold-start (~10s) |
|
|
||||||
| [#10](https://git.opensourcesolarpunk.com/Circuit-Forge/snipe/issues/10) | CT Bids, HiBid, AuctionZip, Invaluable, GovPlanet, Bidsquare, Proxibid |
|
|
||||||
| [#46](https://git.opensourcesolarpunk.com/Circuit-Forge/snipe/issues/46) | Broadcast trust score verdicts to Fediverse communities via ActivityPub |
|
|
||||||
|
|
||||||
### Cloud / infrastructure
|
|
||||||
|
|
||||||
| Issue | Feature |
|
|
||||||
|-------|---------|
|
|
||||||
| [#7](https://git.opensourcesolarpunk.com/Circuit-Forge/snipe/issues/7) | Shared image hash DB — requires explicit opt-in consent (CF privacy-by-architecture) |
|
|
||||||
| [#45](https://git.opensourcesolarpunk.com/Circuit-Forge/snipe/issues/45) | Migrate shared seller/comps DB from SQLite to Postgres |
|
|
||||||
|
|
||||||
### Auction sniping engine
|
|
||||||
|
|
||||||
| Issue | Feature |
|
|
||||||
|-------|---------|
|
|
||||||
| [#9](https://git.opensourcesolarpunk.com/Circuit-Forge/snipe/issues/9) | Bid scheduling + snipe execution (NTP-synchronized, soft-close handling, human approval gate) |
|
|
||||||
| [#13](https://git.opensourcesolarpunk.com/Circuit-Forge/snipe/issues/13) | Post-win workflow: payment routing, shipping coordination, provenance documentation |
|
|
||||||
|
|
||||||
### Already shipped
|
|
||||||
|
|
||||||
| Issue | Feature |
|
|
||||||
|-------|---------|
|
|
||||||
| [#1](https://git.opensourcesolarpunk.com/Circuit-Forge/snipe/issues/1) | SSE live score push — enriched data appears without re-search |
|
|
||||||
| [#2](https://git.opensourcesolarpunk.com/Circuit-Forge/snipe/issues/2) | eBay OAuth for full trust score access via Trading API |
|
|
||||||
| [#4](https://git.opensourcesolarpunk.com/Circuit-Forge/snipe/issues/4) | Community blocklist + batch eBay Trust & Safety reporting |
|
|
||||||
| [#6](https://git.opensourcesolarpunk.com/Circuit-Forge/snipe/issues/6) | Shared seller/scammer/comps DB across cloud users |
|
|
||||||
| [#8](https://git.opensourcesolarpunk.com/Circuit-Forge/snipe/issues/8) | "Triple Red" easter egg |
|
|
||||||
| [#11](https://git.opensourcesolarpunk.com/Circuit-Forge/snipe/issues/11) | Vision-based photo condition assessment — moondream2 / Claude vision |
|
|
||||||
| [#27](https://git.opensourcesolarpunk.com/Circuit-Forge/snipe/issues/27) | MCP server for Snipe search and scoring |
|
|
||||||
| [#29](https://git.opensourcesolarpunk.com/Circuit-Forge/snipe/issues/29) | LLM query builder — describe what to find, AI builds the search |
|
|
||||||
| [#47](https://git.opensourcesolarpunk.com/Circuit-Forge/snipe/issues/47) | Browser pool — pre-warm Chromium to cut scrape cold-start |
|
|
||||||
| [#48](https://git.opensourcesolarpunk.com/Circuit-Forge/snipe/issues/48) | Search result caching — skip redundant scrapes for repeated queries |
|
|
||||||
| [#49](https://git.opensourcesolarpunk.com/Circuit-Forge/snipe/issues/49) | Async search endpoint — return job ID immediately, scrape in background |
|
|
||||||
| [#50](https://git.opensourcesolarpunk.com/Circuit-Forge/snipe/issues/50) | Currency preference — display prices in user's preferred currency |
|
|
||||||
|
|
||||||
---
|
---
|
||||||
|
|
||||||
## Primary platforms (full vision)
|
## Forgejo-primary
|
||||||
|
|
||||||
- **eBay** — general + collectibles *(search + trust scoring: implemented)*
|
Snipe is developed and maintained on Forgejo at [git.opensourcesolarpunk.com/Circuit-Forge/snipe](https://git.opensourcesolarpunk.com/Circuit-Forge/snipe). GitHub and Codeberg are read-only mirrors. File issues and submit pull requests on Forgejo.
|
||||||
- **Mercari** — US resale marketplace *(search + partial trust scoring: implemented; comps Phase 3)*
|
|
||||||
- **CT Bids** — Connecticut state surplus and municipal auctions
|
|
||||||
- **GovPlanet / IronPlanet** — government surplus equipment
|
|
||||||
- **AuctionZip** — antique auction house aggregator (1,000+ houses)
|
|
||||||
- **Invaluable / LiveAuctioneers** — fine art and antiques
|
|
||||||
- **Bidsquare** — antiques and collectibles
|
|
||||||
- **HiBid** — estate auctions
|
|
||||||
- **Proxibid** — industrial and collector auctions
|
|
||||||
|
|
||||||
## Why auctions are hard
|
---
|
||||||
|
|
||||||
Online auctions are frustrating because:
|
## Contributing
|
||||||
- Winning requires being present at the exact closing moment — sometimes 2 AM
|
|
||||||
- Platforms vary wildly: some allow proxy bids, some don't; closing times extend on activity
|
|
||||||
- Scammers exploit auction urgency — new accounts, stolen photos, pressure to pay outside platform
|
|
||||||
- Price history is hidden — you don't know if an item is underpriced or a trap
|
|
||||||
- Sellers hide damage in descriptions rather than titles to avoid automated filters
|
|
||||||
- Shipping logistics for large / fragile antiques require coordination with the auction house
|
|
||||||
- Provenance documentation is inconsistent across auction houses
|
|
||||||
|
|
||||||
## Bidding strategy engine (planned)
|
Bug reports and feature requests: open an issue on Forgejo. The discovery pipeline (scrapers, adapters, signal extraction) is MIT-licensed — pull requests welcome. AI trust-scoring features are BSL 1.1 — contributions are accepted but the license terms apply.
|
||||||
|
|
||||||
- **Hard snipe**: submit bid N seconds before close (default: 8s)
|
---
|
||||||
- **Soft-close handling**: detect if platform extends on last-minute bids; adjust strategy
|
|
||||||
- **Proxy ladder**: set max and let the engine bid in increments, reserve snipe for final window
|
|
||||||
- **Reserve detection**: identify likely reserve price from bid history patterns
|
|
||||||
- **Comparable sales**: pull recent auction results for same/similar items across platforms
|
|
||||||
|
|
||||||
## Post-win workflow (planned)
|
## License
|
||||||
|
|
||||||
1. Payment method routing (platform-specific: CC, wire, check)
|
Snipe uses a dual license:
|
||||||
2. Shipping quote requests to approved carriers (freight / large items via uShip; parcel via FedEx/UPS)
|
|
||||||
3. Condition report request from auction house
|
|
||||||
4. Provenance packet generation (for antiques / fine art resale or insurance)
|
|
||||||
5. Add to inventory (for dealers / collectors tracking portfolio value)
|
|
||||||
|
|
||||||
## Product code (license key)
|
| Component | License |
|
||||||
|
|-----------|---------|
|
||||||
|
| Discovery pipeline — scrapers, platform adapters, search, keyword filtering | [MIT](LICENSE-MIT) |
|
||||||
|
| LLM trust-scoring, query builder, vision assessment, AI features | [BSL 1.1](LICENSE-BSL) — free for personal non-commercial self-hosting; commercial use requires a paid license; converts to MIT after 4 years |
|
||||||
|
|
||||||
`CFG-SNPE-XXXX-XXXX-XXXX`
|
Privacy · Safety · Accessibility — co-equal, non-negotiable.
|
||||||
|
|
||||||
## Tech notes
|
[circuitforge.tech](https://circuitforge.tech)
|
||||||
|
|
||||||
- Shared `circuitforge-core` scaffold (DB, LLM router, tier system, config)
|
|
||||||
- Platform adapters: eBay (Browse API + scraper) and Mercari (scraper); AuctionZip, Invaluable, HiBid, CT Bids planned (Playwright + API where available)
|
|
||||||
- Bid execution: Playwright automation with precise timing (NTP-synchronized)
|
|
||||||
- Soft-close detection: platform-specific rules engine
|
|
||||||
- Comparable sales: eBay completed listings via Marketplace Insights API + Browse API fallback
|
|
||||||
- Vision module: condition assessment from listing photos — moondream2 / Claude vision (paid tier stub in `app/trust/photo.py`)
|
|
||||||
- **Kasada/Cloudflare bypass**: headed Chromium via Xvfb with playwright-stealth; all scraping uses this path — headless and `requests`-based approaches are blocked by eBay and Mercari. Xvfb started with `-ac` (no X11 auth required in Docker), display range `:200+` to avoid host socket conflicts.
|
|
||||||
|
|
|
||||||
|
|
@ -1,60 +1,58 @@
|
||||||
"""Pre-warmed Chromium browser pool for the eBay scraper.
|
"""Thread-local Playwright browser manager for the eBay scraper.
|
||||||
|
|
||||||
Eliminates cold-start latency (5-10s per call) by keeping a small pool of
|
Each uvicorn worker thread that calls fetch_html() gets its own Playwright
|
||||||
long-lived Playwright browser instances with fresh contexts ready to serve.
|
instance, browser, and context — created lazily on first use. This avoids
|
||||||
|
the "cannot switch to a different thread" error that arises when Playwright
|
||||||
|
sync API instances are shared across threads (they bind their greenlet event
|
||||||
|
loop to the creating thread).
|
||||||
|
|
||||||
Key design:
|
Key design:
|
||||||
- Pool slots: ``(xvfb_proc, pw_instance, browser, context, display_num, last_used_ts)``
|
- Thread-local: _thread_local.slot holds the _PooledBrowser for the current
|
||||||
One headed Chromium browser per slot — keeps the Kasada fingerprint clean.
|
thread. No slot is ever handed to another thread.
|
||||||
- Display numbering: :200-:399 (avoids host :0 and low-numbered kernel socket conflicts).
|
- Lazy creation: slots are created on first fetch_html() call per thread, not
|
||||||
- Thread safety: ``queue.Queue`` with blocking get (timeout=3s before fresh fallback).
|
at startup. start() is a lightweight lifecycle marker only.
|
||||||
- Replenishment: after each use, the dirty context is closed and a new context is
|
- Registry: _slot_registry (keyed by thread-id) lets stop() close every active
|
||||||
opened on the *same* browser, then returned to the queue. Browser launch overhead
|
slot across all threads without walking thread-local storage.
|
||||||
is only paid at startup and during idle-cleanup replenishment.
|
- Replenishment: after each use the dirty context is closed and a fresh one
|
||||||
- Idle cleanup: daemon thread closes slots idle for >5 minutes to avoid memory leaks
|
opened on the same browser. Browser launch overhead is paid at most once
|
||||||
when the service is quiet.
|
per worker thread lifetime.
|
||||||
- Graceful degradation: if Playwright / Xvfb is unavailable (host-side test env),
|
- Graceful degradation: if Playwright / Xvfb is unavailable, fetch_html falls
|
||||||
``fetch_html`` falls back to launching a fresh browser per call — same behavior
|
back to _fetch_fresh (identical behavior to before this module existed).
|
||||||
as before this module existed.
|
|
||||||
|
|
||||||
Pool size is controlled via ``BROWSER_POOL_SIZE`` env var (default: 2).
|
Pool size is read from BROWSER_POOL_SIZE env var (default: 2) but is now a
|
||||||
|
soft limit — used only for documentation; actual concurrency is bounded by
|
||||||
|
uvicorn's thread count.
|
||||||
"""
|
"""
|
||||||
from __future__ import annotations
|
from __future__ import annotations
|
||||||
|
|
||||||
import itertools
|
import itertools
|
||||||
import logging
|
import logging
|
||||||
import os
|
import os
|
||||||
import queue
|
|
||||||
import subprocess
|
import subprocess
|
||||||
import threading
|
import threading
|
||||||
import time
|
import time
|
||||||
from concurrent.futures import ThreadPoolExecutor, as_completed
|
|
||||||
from dataclasses import dataclass, field
|
from dataclasses import dataclass, field
|
||||||
from typing import Optional
|
from typing import Optional
|
||||||
|
|
||||||
log = logging.getLogger(__name__)
|
log = logging.getLogger(__name__)
|
||||||
|
|
||||||
# Display counter shared by pool warmup and _fetch_fresh fallback.
|
|
||||||
# Range :200-:399 avoids low-numbered displays that may be pre-occupied by
|
|
||||||
# the host X server or lingering kernel sockets from previous runs.
|
|
||||||
_pool_display_counter = itertools.cycle(range(200, 400))
|
_pool_display_counter = itertools.cycle(range(200, 400))
|
||||||
|
|
||||||
_IDLE_TIMEOUT_SECS = 300 # 5 minutes
|
|
||||||
_CLEANUP_INTERVAL_SECS = 60
|
|
||||||
_QUEUE_TIMEOUT_SECS = 3.0
|
|
||||||
|
|
||||||
_CHROMIUM_ARGS = ["--no-sandbox", "--disable-dev-shm-usage"]
|
_CHROMIUM_ARGS = ["--no-sandbox", "--disable-dev-shm-usage"]
|
||||||
_XVFB_ARGS = ["-screen", "0", "1280x800x24", "-ac"] # -ac: disable X auth (safe in isolated Docker)
|
_XVFB_ARGS = ["-screen", "0", "1280x800x24", "-ac"]
|
||||||
_USER_AGENT = (
|
_USER_AGENT = (
|
||||||
"Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 "
|
"Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 "
|
||||||
"(KHTML, like Gecko) Chrome/122.0.0.0 Safari/537.36"
|
"(KHTML, like Gecko) Chrome/122.0.0.0 Safari/537.36"
|
||||||
)
|
)
|
||||||
_VIEWPORT = {"width": 1280, "height": 800}
|
_VIEWPORT = {"width": 1280, "height": 800}
|
||||||
|
|
||||||
|
# Thread-local storage: each thread gets its own _PooledBrowser slot.
|
||||||
|
_thread_local = threading.local()
|
||||||
|
|
||||||
|
|
||||||
@dataclass
|
@dataclass
|
||||||
class _PooledBrowser:
|
class _PooledBrowser:
|
||||||
"""One slot in the browser pool."""
|
"""One browser slot, bound to a single thread."""
|
||||||
xvfb: subprocess.Popen
|
xvfb: subprocess.Popen
|
||||||
pw: object # playwright instance (sync_playwright().__enter__())
|
pw: object # playwright instance (sync_playwright().__enter__())
|
||||||
browser: object # playwright Browser
|
browser: object # playwright Browser
|
||||||
|
|
@ -63,13 +61,13 @@ class _PooledBrowser:
|
||||||
last_used_ts: float = field(default_factory=time.time)
|
last_used_ts: float = field(default_factory=time.time)
|
||||||
|
|
||||||
|
|
||||||
def _launch_slot() -> "_PooledBrowser":
|
def _launch_slot() -> _PooledBrowser:
|
||||||
"""Launch a new Xvfb display + headed Chromium browser + fresh context.
|
"""Launch a new Xvfb display + headed Chromium browser + fresh context.
|
||||||
|
|
||||||
Raises on failure — callers must catch and handle gracefully.
|
Must be called from the thread that will use the slot.
|
||||||
"""
|
"""
|
||||||
from playwright.sync_api import sync_playwright
|
from playwright.sync_api import sync_playwright
|
||||||
from playwright_stealth import Stealth # noqa: F401 — imported here to confirm availability
|
from playwright_stealth import Stealth # noqa: F401
|
||||||
|
|
||||||
display_num = next(_pool_display_counter)
|
display_num = next(_pool_display_counter)
|
||||||
display = f":{display_num}"
|
display = f":{display_num}"
|
||||||
|
|
@ -81,7 +79,6 @@ def _launch_slot() -> "_PooledBrowser":
|
||||||
stdout=subprocess.DEVNULL,
|
stdout=subprocess.DEVNULL,
|
||||||
stderr=subprocess.DEVNULL,
|
stderr=subprocess.DEVNULL,
|
||||||
)
|
)
|
||||||
# Small grace period for Xvfb to bind the display socket.
|
|
||||||
time.sleep(0.3)
|
time.sleep(0.3)
|
||||||
|
|
||||||
pw = sync_playwright().start()
|
pw = sync_playwright().start()
|
||||||
|
|
@ -112,7 +109,7 @@ def _launch_slot() -> "_PooledBrowser":
|
||||||
|
|
||||||
|
|
||||||
def _close_slot(slot: _PooledBrowser) -> None:
|
def _close_slot(slot: _PooledBrowser) -> None:
|
||||||
"""Cleanly close a pool slot: context → browser → Playwright → Xvfb."""
|
"""Cleanly close a slot: context → browser → Playwright → Xvfb."""
|
||||||
try:
|
try:
|
||||||
slot.ctx.close()
|
slot.ctx.close()
|
||||||
except Exception:
|
except Exception:
|
||||||
|
|
@ -133,11 +130,7 @@ def _close_slot(slot: _PooledBrowser) -> None:
|
||||||
|
|
||||||
|
|
||||||
def _replenish_slot(slot: _PooledBrowser) -> _PooledBrowser:
|
def _replenish_slot(slot: _PooledBrowser) -> _PooledBrowser:
|
||||||
"""Close the used context and open a fresh one on the same browser.
|
"""Close the used context and open a fresh one on the same browser."""
|
||||||
|
|
||||||
Returns a new _PooledBrowser sharing the same xvfb/pw/browser but with a
|
|
||||||
clean context — avoids paying browser launch overhead on every fetch.
|
|
||||||
"""
|
|
||||||
try:
|
try:
|
||||||
slot.ctx.close()
|
slot.ctx.close()
|
||||||
except Exception:
|
except Exception:
|
||||||
|
|
@ -158,26 +151,27 @@ def _replenish_slot(slot: _PooledBrowser) -> _PooledBrowser:
|
||||||
|
|
||||||
|
|
||||||
class BrowserPool:
|
class BrowserPool:
|
||||||
"""Thread-safe pool of pre-warmed Playwright browser contexts."""
|
"""Thread-local Playwright browser manager.
|
||||||
|
|
||||||
|
Each thread that calls fetch_html() owns its own browser instance.
|
||||||
|
No slots are shared between threads.
|
||||||
|
"""
|
||||||
|
|
||||||
def __init__(self, size: int = 2) -> None:
|
def __init__(self, size: int = 2) -> None:
|
||||||
self._size = size
|
self._size = size
|
||||||
self._q: queue.Queue[_PooledBrowser] = queue.Queue()
|
|
||||||
self._lock = threading.Lock()
|
self._lock = threading.Lock()
|
||||||
self._started = False
|
self._started = False
|
||||||
self._stopped = False
|
self._stopped = False
|
||||||
self._playwright_available: Optional[bool] = None # cached after first check
|
self._playwright_available: Optional[bool] = None
|
||||||
|
# Registry of all active slots keyed by thread id — used only by stop().
|
||||||
|
self._slot_registry: dict[int, _PooledBrowser] = {}
|
||||||
|
|
||||||
# ------------------------------------------------------------------
|
# ------------------------------------------------------------------
|
||||||
# Lifecycle
|
# Lifecycle
|
||||||
# ------------------------------------------------------------------
|
# ------------------------------------------------------------------
|
||||||
|
|
||||||
def start(self) -> None:
|
def start(self) -> None:
|
||||||
"""Pre-warm N browser slots in background threads.
|
"""Mark the pool as started. Slots are created lazily per thread."""
|
||||||
|
|
||||||
Non-blocking: returns immediately; slots appear in the queue as they
|
|
||||||
finish launching. Safe to call multiple times (no-op after first).
|
|
||||||
"""
|
|
||||||
with self._lock:
|
with self._lock:
|
||||||
if self._started:
|
if self._started:
|
||||||
return
|
return
|
||||||
|
|
@ -190,43 +184,19 @@ class BrowserPool:
|
||||||
)
|
)
|
||||||
return
|
return
|
||||||
|
|
||||||
def _warm_one(_: int) -> None:
|
log.info("BrowserPool: started (thread-local mode, size hint=%d)", self._size)
|
||||||
try:
|
|
||||||
slot = _launch_slot()
|
|
||||||
self._q.put(slot)
|
|
||||||
log.debug("BrowserPool: slot :%d ready", slot.display_num)
|
|
||||||
except Exception as exc:
|
|
||||||
log.warning("BrowserPool: pre-warm failed: %s", exc)
|
|
||||||
|
|
||||||
with ThreadPoolExecutor(max_workers=self._size) as ex:
|
|
||||||
futures = [ex.submit(_warm_one, i) for i in range(self._size)]
|
|
||||||
# Don't wait — executor exits after submitting, threads continue.
|
|
||||||
# Actually ThreadPoolExecutor.__exit__ waits for completion, which
|
|
||||||
# is fine: pre-warming completes in background relative to FastAPI
|
|
||||||
# startup because this whole method is called from a thread.
|
|
||||||
for f in as_completed(futures):
|
|
||||||
pass # propagate exceptions via logging, not raises
|
|
||||||
|
|
||||||
_idle_cleaner = threading.Thread(
|
|
||||||
target=self._idle_cleanup_loop, daemon=True, name="browser-pool-idle-cleaner"
|
|
||||||
)
|
|
||||||
_idle_cleaner.start()
|
|
||||||
log.info("BrowserPool: started with %d slots", self._q.qsize())
|
|
||||||
|
|
||||||
def stop(self) -> None:
|
def stop(self) -> None:
|
||||||
"""Drain and close all pool slots. Called at FastAPI shutdown."""
|
"""Close all active slots across all threads."""
|
||||||
with self._lock:
|
with self._lock:
|
||||||
self._stopped = True
|
self._stopped = True
|
||||||
|
registry_snapshot = dict(self._slot_registry)
|
||||||
|
|
||||||
closed = 0
|
closed = 0
|
||||||
while True:
|
for slot in registry_snapshot.values():
|
||||||
try:
|
|
||||||
slot = self._q.get_nowait()
|
|
||||||
_close_slot(slot)
|
_close_slot(slot)
|
||||||
closed += 1
|
closed += 1
|
||||||
except queue.Empty:
|
self._slot_registry.clear()
|
||||||
break
|
|
||||||
|
|
||||||
log.info("BrowserPool: stopped, closed %d slot(s)", closed)
|
log.info("BrowserPool: stopped, closed %d slot(s)", closed)
|
||||||
|
|
||||||
# ------------------------------------------------------------------
|
# ------------------------------------------------------------------
|
||||||
|
|
@ -242,28 +212,13 @@ class BrowserPool:
|
||||||
) -> str:
|
) -> str:
|
||||||
"""Navigate to *url* and return the rendered HTML.
|
"""Navigate to *url* and return the rendered HTML.
|
||||||
|
|
||||||
Borrows a browser context from the pool (blocks up to 3s), uses it to
|
Uses the calling thread's browser slot (creates one if needed).
|
||||||
fetch the page, then replenishes the slot with a fresh context.
|
Falls back to a fresh browser if Playwright is unavailable or the
|
||||||
|
slot fails.
|
||||||
Falls back to a fully fresh browser if the pool is empty after the
|
|
||||||
timeout or if Playwright is unavailable.
|
|
||||||
|
|
||||||
Args:
|
|
||||||
wait_for_selector: CSS/data-testid selector to wait for before capturing
|
|
||||||
HTML (e.g. ``"[data-testid='SearchResults']"``). When set, the fixed
|
|
||||||
*wait_for_timeout_ms* sleep is skipped — the page is captured as soon
|
|
||||||
as the selector appears (or after 15s timeout, whichever comes first).
|
|
||||||
wait_for_timeout_ms: static post-navigation sleep in ms when
|
|
||||||
*wait_for_selector* is None. Default 2000; set higher (e.g. 8000)
|
|
||||||
for sites with JS challenge pages (Cloudflare Turnstile).
|
|
||||||
"""
|
"""
|
||||||
time.sleep(delay)
|
time.sleep(delay)
|
||||||
|
|
||||||
slot: Optional[_PooledBrowser] = None
|
slot = self._get_or_create_thread_slot()
|
||||||
try:
|
|
||||||
slot = self._q.get(timeout=_QUEUE_TIMEOUT_SECS)
|
|
||||||
except queue.Empty:
|
|
||||||
log.debug("BrowserPool: pool empty after %.1fs — using fresh browser", _QUEUE_TIMEOUT_SECS)
|
|
||||||
|
|
||||||
if slot is not None:
|
if slot is not None:
|
||||||
try:
|
try:
|
||||||
|
|
@ -272,32 +227,65 @@ class BrowserPool:
|
||||||
wait_for_selector=wait_for_selector,
|
wait_for_selector=wait_for_selector,
|
||||||
wait_for_timeout_ms=wait_for_timeout_ms,
|
wait_for_timeout_ms=wait_for_timeout_ms,
|
||||||
)
|
)
|
||||||
# Replenish: close dirty context, open fresh one, return to queue.
|
|
||||||
try:
|
try:
|
||||||
fresh_slot = _replenish_slot(slot)
|
fresh_slot = _replenish_slot(slot)
|
||||||
self._q.put(fresh_slot)
|
self._register_slot(fresh_slot)
|
||||||
except Exception as exc:
|
except Exception as exc:
|
||||||
log.warning("BrowserPool: replenish failed, slot discarded: %s", exc)
|
log.warning("BrowserPool: replenish failed, slot discarded: %s", exc)
|
||||||
_close_slot(slot)
|
_close_slot(slot)
|
||||||
|
self._unregister_slot()
|
||||||
return html
|
return html
|
||||||
except Exception as exc:
|
except Exception as exc:
|
||||||
log.warning("BrowserPool: pooled fetch failed (%s) — closing slot", exc)
|
log.warning("BrowserPool: pooled fetch failed (%s) — closing slot", exc)
|
||||||
_close_slot(slot)
|
_close_slot(slot)
|
||||||
# Fall through to fresh browser below.
|
self._unregister_slot()
|
||||||
|
|
||||||
# Fallback: fresh browser (same code as old scraper._fetch_url).
|
|
||||||
return self._fetch_fresh(
|
return self._fetch_fresh(
|
||||||
url,
|
url,
|
||||||
wait_for_selector=wait_for_selector,
|
wait_for_selector=wait_for_selector,
|
||||||
wait_for_timeout_ms=wait_for_timeout_ms,
|
wait_for_timeout_ms=wait_for_timeout_ms,
|
||||||
)
|
)
|
||||||
|
|
||||||
|
# ------------------------------------------------------------------
|
||||||
|
# Thread-local slot management
|
||||||
|
# ------------------------------------------------------------------
|
||||||
|
|
||||||
|
def _get_or_create_thread_slot(self) -> Optional[_PooledBrowser]:
|
||||||
|
"""Return the calling thread's slot, creating it if absent."""
|
||||||
|
if not self._check_playwright():
|
||||||
|
return None
|
||||||
|
|
||||||
|
slot: Optional[_PooledBrowser] = getattr(_thread_local, "slot", None)
|
||||||
|
if slot is not None:
|
||||||
|
return slot
|
||||||
|
|
||||||
|
try:
|
||||||
|
slot = _launch_slot()
|
||||||
|
self._register_slot(slot)
|
||||||
|
log.debug("BrowserPool: launched slot :%d for thread %d",
|
||||||
|
slot.display_num, threading.get_ident())
|
||||||
|
return slot
|
||||||
|
except Exception as exc:
|
||||||
|
log.warning("BrowserPool: slot launch failed: %s", exc)
|
||||||
|
return None
|
||||||
|
|
||||||
|
def _register_slot(self, slot: _PooledBrowser) -> None:
|
||||||
|
"""Bind slot to the calling thread (both thread-local and registry)."""
|
||||||
|
_thread_local.slot = slot
|
||||||
|
with self._lock:
|
||||||
|
self._slot_registry[threading.get_ident()] = slot
|
||||||
|
|
||||||
|
def _unregister_slot(self) -> None:
|
||||||
|
"""Remove the calling thread's slot from thread-local and registry."""
|
||||||
|
_thread_local.slot = None
|
||||||
|
with self._lock:
|
||||||
|
self._slot_registry.pop(threading.get_ident(), None)
|
||||||
|
|
||||||
# ------------------------------------------------------------------
|
# ------------------------------------------------------------------
|
||||||
# Internal helpers
|
# Internal helpers
|
||||||
# ------------------------------------------------------------------
|
# ------------------------------------------------------------------
|
||||||
|
|
||||||
def _check_playwright(self) -> bool:
|
def _check_playwright(self) -> bool:
|
||||||
"""Return True if Playwright and Xvfb are importable/runnable."""
|
|
||||||
if self._playwright_available is not None:
|
if self._playwright_available is not None:
|
||||||
return self._playwright_available
|
return self._playwright_available
|
||||||
try:
|
try:
|
||||||
|
|
@ -315,7 +303,6 @@ class BrowserPool:
|
||||||
wait_for_selector: Optional[str] = None,
|
wait_for_selector: Optional[str] = None,
|
||||||
wait_for_timeout_ms: int = 2000,
|
wait_for_timeout_ms: int = 2000,
|
||||||
) -> str:
|
) -> str:
|
||||||
"""Open a new page on *slot.ctx*, navigate to *url*, return HTML."""
|
|
||||||
from playwright_stealth import Stealth
|
from playwright_stealth import Stealth
|
||||||
|
|
||||||
page = slot.ctx.new_page()
|
page = slot.ctx.new_page()
|
||||||
|
|
@ -326,7 +313,7 @@ class BrowserPool:
|
||||||
try:
|
try:
|
||||||
page.wait_for_selector(wait_for_selector, timeout=15_000)
|
page.wait_for_selector(wait_for_selector, timeout=15_000)
|
||||||
except Exception:
|
except Exception:
|
||||||
pass # selector didn't appear; return whatever loaded
|
pass
|
||||||
else:
|
else:
|
||||||
page.wait_for_timeout(wait_for_timeout_ms)
|
page.wait_for_timeout(wait_for_timeout_ms)
|
||||||
return page.content()
|
return page.content()
|
||||||
|
|
@ -342,7 +329,6 @@ class BrowserPool:
|
||||||
wait_for_selector: Optional[str] = None,
|
wait_for_selector: Optional[str] = None,
|
||||||
wait_for_timeout_ms: int = 2000,
|
wait_for_timeout_ms: int = 2000,
|
||||||
) -> str:
|
) -> str:
|
||||||
"""Launch a fully fresh browser, fetch *url*, close everything."""
|
|
||||||
import subprocess as _subprocess
|
import subprocess as _subprocess
|
||||||
|
|
||||||
try:
|
try:
|
||||||
|
|
@ -364,7 +350,7 @@ class BrowserPool:
|
||||||
stdout=_subprocess.DEVNULL,
|
stdout=_subprocess.DEVNULL,
|
||||||
stderr=_subprocess.DEVNULL,
|
stderr=_subprocess.DEVNULL,
|
||||||
)
|
)
|
||||||
time.sleep(0.3) # wait for Xvfb to bind the display socket before Chromium starts
|
time.sleep(0.3)
|
||||||
try:
|
try:
|
||||||
with sync_playwright() as pw:
|
with sync_playwright() as pw:
|
||||||
browser = pw.chromium.launch(
|
browser = pw.chromium.launch(
|
||||||
|
|
@ -383,7 +369,7 @@ class BrowserPool:
|
||||||
try:
|
try:
|
||||||
page.wait_for_selector(wait_for_selector, timeout=15_000)
|
page.wait_for_selector(wait_for_selector, timeout=15_000)
|
||||||
except Exception:
|
except Exception:
|
||||||
pass # selector didn't appear; return whatever loaded
|
pass
|
||||||
else:
|
else:
|
||||||
page.wait_for_timeout(wait_for_timeout_ms)
|
page.wait_for_timeout(wait_for_timeout_ms)
|
||||||
html = page.content()
|
html = page.content()
|
||||||
|
|
@ -394,32 +380,6 @@ class BrowserPool:
|
||||||
|
|
||||||
return html
|
return html
|
||||||
|
|
||||||
def _idle_cleanup_loop(self) -> None:
|
|
||||||
"""Daemon thread: drain slots idle for >5 minutes every 60 seconds."""
|
|
||||||
while not self._stopped:
|
|
||||||
time.sleep(_CLEANUP_INTERVAL_SECS)
|
|
||||||
if self._stopped:
|
|
||||||
break
|
|
||||||
now = time.time()
|
|
||||||
idle_cutoff = now - _IDLE_TIMEOUT_SECS
|
|
||||||
# Drain the entire queue, keep non-idle slots, close idle ones.
|
|
||||||
kept: list[_PooledBrowser] = []
|
|
||||||
closed = 0
|
|
||||||
while True:
|
|
||||||
try:
|
|
||||||
slot = self._q.get_nowait()
|
|
||||||
except queue.Empty:
|
|
||||||
break
|
|
||||||
if slot.last_used_ts < idle_cutoff:
|
|
||||||
_close_slot(slot)
|
|
||||||
closed += 1
|
|
||||||
else:
|
|
||||||
kept.append(slot)
|
|
||||||
for slot in kept:
|
|
||||||
self._q.put(slot)
|
|
||||||
if closed:
|
|
||||||
log.info("BrowserPool: idle cleanup closed %d slot(s)", closed)
|
|
||||||
|
|
||||||
|
|
||||||
# ---------------------------------------------------------------------------
|
# ---------------------------------------------------------------------------
|
||||||
# Module-level singleton
|
# Module-level singleton
|
||||||
|
|
@ -430,11 +390,7 @@ _pool_lock = threading.Lock()
|
||||||
|
|
||||||
|
|
||||||
def get_pool() -> BrowserPool:
|
def get_pool() -> BrowserPool:
|
||||||
"""Return the module-level BrowserPool singleton (creates it if needed).
|
"""Return the module-level BrowserPool singleton (creates it if needed)."""
|
||||||
|
|
||||||
Pool size is read from ``BROWSER_POOL_SIZE`` env var (default: 2).
|
|
||||||
Call ``get_pool().start()`` at FastAPI startup to pre-warm slots.
|
|
||||||
"""
|
|
||||||
global _pool
|
global _pool
|
||||||
if _pool is None:
|
if _pool is None:
|
||||||
with _pool_lock:
|
with _pool_lock:
|
||||||
|
|
|
||||||
|
|
@ -126,7 +126,12 @@ class Aggregator:
|
||||||
# Hard filters
|
# Hard filters
|
||||||
if seller and seller.account_age_days is not None and seller.account_age_days < HARD_FILTER_AGE_DAYS:
|
if seller and seller.account_age_days is not None and seller.account_age_days < HARD_FILTER_AGE_DAYS:
|
||||||
red_flags.append("new_account")
|
red_flags.append("new_account")
|
||||||
if seller and seller.feedback_ratio < HARD_FILTER_BAD_RATIO_THRESHOLD:
|
if seller and seller.feedback_ratio == 0.0 and seller.feedback_count > 0:
|
||||||
|
# 12-month ratio missing from page — returning seller or buyer-only account.
|
||||||
|
# Score will be partial (metadata._feedback_ratio returns None). Soft flag
|
||||||
|
# only: do NOT fire established_bad_actor on what is likely missing data.
|
||||||
|
red_flags.append("no_recent_seller_data")
|
||||||
|
elif seller and seller.feedback_ratio < HARD_FILTER_BAD_RATIO_THRESHOLD:
|
||||||
if HARD_FILTER_BAD_RATIO_MIN_COUNT < seller.feedback_count <= HARD_FILTER_BAD_RATIO_MAX_COUNT:
|
if HARD_FILTER_BAD_RATIO_MIN_COUNT < seller.feedback_count <= HARD_FILTER_BAD_RATIO_MAX_COUNT:
|
||||||
# Moderate-volume account with consistently bad ratio → hard flag.
|
# Moderate-volume account with consistently bad ratio → hard flag.
|
||||||
red_flags.append("established_bad_actor")
|
red_flags.append("established_bad_actor")
|
||||||
|
|
|
||||||
|
|
@ -44,7 +44,13 @@ class MetadataScorer:
|
||||||
if count < 200: return 15
|
if count < 200: return 15
|
||||||
return 20
|
return 20
|
||||||
|
|
||||||
def _feedback_ratio(self, ratio: float, count: int) -> int:
|
def _feedback_ratio(self, ratio: float, count: int) -> Optional[int]:
|
||||||
|
# ratio=0.0 with count>0 means the 12-month percentage wasn't on the page —
|
||||||
|
# eBay omits the ratio for returning/buyer-only sellers with no recent sales.
|
||||||
|
# Treat as missing rather than "literally 0% positive" (which eBay doesn't allow
|
||||||
|
# on active accounts — those get suspended long before reaching 0%).
|
||||||
|
if ratio == 0.0 and count > 0:
|
||||||
|
return None
|
||||||
if ratio < 0.80 and count > 20: return 0
|
if ratio < 0.80 and count > 20: return 0
|
||||||
if ratio < 0.90: return 5
|
if ratio < 0.90: return 5
|
||||||
if ratio < 0.95: return 10
|
if ratio < 0.95: return 10
|
||||||
|
|
|
||||||
|
|
@ -1,16 +1,15 @@
|
||||||
"""Tests for app.platforms.ebay.browser_pool.
|
"""Tests for app.platforms.ebay.browser_pool (thread-local design).
|
||||||
|
|
||||||
All tests run without real Chromium / Xvfb / Playwright.
|
All tests run without real Chromium / Xvfb / Playwright.
|
||||||
Playwright, Xvfb subprocess calls, and Stealth are mocked throughout.
|
Playwright, Xvfb subprocess calls, and Stealth are mocked throughout.
|
||||||
"""
|
"""
|
||||||
from __future__ import annotations
|
from __future__ import annotations
|
||||||
|
|
||||||
import queue
|
|
||||||
import subprocess
|
import subprocess
|
||||||
import threading
|
import threading
|
||||||
import time
|
import time
|
||||||
from typing import Any
|
from typing import Any
|
||||||
from unittest.mock import MagicMock, patch, call
|
from unittest.mock import MagicMock, patch
|
||||||
|
|
||||||
import pytest
|
import pytest
|
||||||
|
|
||||||
|
|
@ -19,40 +18,35 @@ import pytest
|
||||||
# ---------------------------------------------------------------------------
|
# ---------------------------------------------------------------------------
|
||||||
|
|
||||||
def _reset_pool_singleton():
|
def _reset_pool_singleton():
|
||||||
"""Force the module-level _pool singleton back to None."""
|
|
||||||
import app.platforms.ebay.browser_pool as _mod
|
import app.platforms.ebay.browser_pool as _mod
|
||||||
_mod._pool = None
|
_mod._pool = None
|
||||||
|
|
||||||
|
|
||||||
# ---------------------------------------------------------------------------
|
def _reset_thread_local():
|
||||||
# Fixtures
|
import app.platforms.ebay.browser_pool as _mod
|
||||||
# ---------------------------------------------------------------------------
|
_mod._thread_local.slot = None
|
||||||
|
|
||||||
|
|
||||||
@pytest.fixture(autouse=True)
|
@pytest.fixture(autouse=True)
|
||||||
def reset_singleton():
|
def reset_pool():
|
||||||
"""Reset the singleton before and after every test."""
|
|
||||||
_reset_pool_singleton()
|
_reset_pool_singleton()
|
||||||
|
_reset_thread_local()
|
||||||
yield
|
yield
|
||||||
_reset_pool_singleton()
|
_reset_pool_singleton()
|
||||||
|
_reset_thread_local()
|
||||||
|
|
||||||
|
|
||||||
def _make_fake_slot():
|
def _make_fake_slot():
|
||||||
"""Build a mock _PooledBrowser with all necessary attributes."""
|
|
||||||
from app.platforms.ebay.browser_pool import _PooledBrowser
|
from app.platforms.ebay.browser_pool import _PooledBrowser
|
||||||
|
|
||||||
xvfb = MagicMock(spec=subprocess.Popen)
|
xvfb = MagicMock(spec=subprocess.Popen)
|
||||||
pw = MagicMock()
|
pw = MagicMock()
|
||||||
browser = MagicMock()
|
browser = MagicMock()
|
||||||
ctx = MagicMock()
|
ctx = MagicMock()
|
||||||
slot = _PooledBrowser(
|
return _PooledBrowser(
|
||||||
xvfb=xvfb,
|
xvfb=xvfb, pw=pw, browser=browser, ctx=ctx,
|
||||||
pw=pw,
|
display_num=100, last_used_ts=time.time(),
|
||||||
browser=browser,
|
|
||||||
ctx=ctx,
|
|
||||||
display_num=100,
|
|
||||||
last_used_ts=time.time(),
|
|
||||||
)
|
)
|
||||||
return slot
|
|
||||||
|
|
||||||
|
|
||||||
# ---------------------------------------------------------------------------
|
# ---------------------------------------------------------------------------
|
||||||
|
|
@ -62,9 +56,7 @@ def _make_fake_slot():
|
||||||
class TestGetPoolSingleton:
|
class TestGetPoolSingleton:
|
||||||
def test_returns_same_instance(self):
|
def test_returns_same_instance(self):
|
||||||
from app.platforms.ebay.browser_pool import get_pool, BrowserPool
|
from app.platforms.ebay.browser_pool import get_pool, BrowserPool
|
||||||
p1 = get_pool()
|
assert get_pool() is get_pool()
|
||||||
p2 = get_pool()
|
|
||||||
assert p1 is p2
|
|
||||||
|
|
||||||
def test_returns_browser_pool_instance(self):
|
def test_returns_browser_pool_instance(self):
|
||||||
from app.platforms.ebay.browser_pool import get_pool, BrowserPool
|
from app.platforms.ebay.browser_pool import get_pool, BrowserPool
|
||||||
|
|
@ -72,14 +64,12 @@ class TestGetPoolSingleton:
|
||||||
|
|
||||||
def test_default_size_is_two(self):
|
def test_default_size_is_two(self):
|
||||||
from app.platforms.ebay.browser_pool import get_pool
|
from app.platforms.ebay.browser_pool import get_pool
|
||||||
pool = get_pool()
|
assert get_pool()._size == 2
|
||||||
assert pool._size == 2
|
|
||||||
|
|
||||||
def test_custom_size_from_env(self, monkeypatch):
|
def test_custom_size_from_env(self, monkeypatch):
|
||||||
monkeypatch.setenv("BROWSER_POOL_SIZE", "5")
|
monkeypatch.setenv("BROWSER_POOL_SIZE", "5")
|
||||||
from app.platforms.ebay.browser_pool import get_pool
|
from app.platforms.ebay.browser_pool import get_pool
|
||||||
pool = get_pool()
|
assert get_pool()._size == 5
|
||||||
assert pool._size == 5
|
|
||||||
|
|
||||||
|
|
||||||
# ---------------------------------------------------------------------------
|
# ---------------------------------------------------------------------------
|
||||||
|
|
@ -88,17 +78,15 @@ class TestGetPoolSingleton:
|
||||||
|
|
||||||
class TestLifecycle:
|
class TestLifecycle:
|
||||||
def test_start_is_noop_when_playwright_unavailable(self):
|
def test_start_is_noop_when_playwright_unavailable(self):
|
||||||
"""Pool should handle missing Playwright gracefully — no error raised."""
|
|
||||||
from app.platforms.ebay.browser_pool import BrowserPool
|
from app.platforms.ebay.browser_pool import BrowserPool
|
||||||
|
|
||||||
pool = BrowserPool(size=2)
|
pool = BrowserPool(size=2)
|
||||||
with patch.object(pool, "_check_playwright", return_value=False):
|
with patch.object(pool, "_check_playwright", return_value=False):
|
||||||
pool.start() # must not raise
|
pool.start()
|
||||||
# Pool queue is empty — no slots launched.
|
assert pool._started is True
|
||||||
assert pool._q.empty()
|
assert pool._slot_registry == {}
|
||||||
|
|
||||||
def test_start_only_runs_once(self):
|
def test_start_only_runs_once(self):
|
||||||
"""Calling start() twice must not double-warm."""
|
|
||||||
from app.platforms.ebay.browser_pool import BrowserPool
|
from app.platforms.ebay.browser_pool import BrowserPool
|
||||||
|
|
||||||
pool = BrowserPool(size=1)
|
pool = BrowserPool(size=1)
|
||||||
|
|
@ -107,47 +95,46 @@ class TestLifecycle:
|
||||||
pool.start()
|
pool.start()
|
||||||
assert pool._started is True
|
assert pool._started is True
|
||||||
|
|
||||||
def test_stop_drains_queue(self):
|
def test_stop_closes_all_registry_slots(self):
|
||||||
"""stop() should close every slot in the queue."""
|
|
||||||
from app.platforms.ebay.browser_pool import BrowserPool
|
from app.platforms.ebay.browser_pool import BrowserPool
|
||||||
|
|
||||||
pool = BrowserPool(size=2)
|
pool = BrowserPool(size=2)
|
||||||
slot1 = _make_fake_slot()
|
slot1 = _make_fake_slot()
|
||||||
slot2 = _make_fake_slot()
|
slot2 = _make_fake_slot()
|
||||||
pool._q.put(slot1)
|
pool._slot_registry[1001] = slot1
|
||||||
pool._q.put(slot2)
|
pool._slot_registry[1002] = slot2
|
||||||
|
|
||||||
with patch("app.platforms.ebay.browser_pool._close_slot") as mock_close:
|
with patch("app.platforms.ebay.browser_pool._close_slot") as mock_close:
|
||||||
pool.stop()
|
pool.stop()
|
||||||
|
|
||||||
assert mock_close.call_count == 2
|
assert mock_close.call_count == 2
|
||||||
assert pool._q.empty()
|
assert pool._slot_registry == {}
|
||||||
assert pool._stopped is True
|
assert pool._stopped is True
|
||||||
|
|
||||||
def test_stop_on_empty_pool_is_safe(self):
|
def test_stop_on_empty_registry_is_safe(self):
|
||||||
from app.platforms.ebay.browser_pool import BrowserPool
|
from app.platforms.ebay.browser_pool import BrowserPool
|
||||||
pool = BrowserPool(size=2)
|
BrowserPool(size=2).stop()
|
||||||
pool.stop() # must not raise
|
|
||||||
|
|
||||||
|
|
||||||
# ---------------------------------------------------------------------------
|
# ---------------------------------------------------------------------------
|
||||||
# fetch_html — pool hit path
|
# fetch_html — thread-local slot hit path
|
||||||
# ---------------------------------------------------------------------------
|
# ---------------------------------------------------------------------------
|
||||||
|
|
||||||
class TestFetchHtmlPoolHit:
|
class TestFetchHtmlSlotHit:
|
||||||
def test_uses_pooled_slot_and_replenishes(self):
|
def test_uses_existing_slot_and_replenishes(self):
|
||||||
"""fetch_html should borrow a slot, call _fetch_with_slot, replenish."""
|
|
||||||
from app.platforms.ebay.browser_pool import BrowserPool
|
from app.platforms.ebay.browser_pool import BrowserPool
|
||||||
|
import app.platforms.ebay.browser_pool as _mod
|
||||||
|
|
||||||
pool = BrowserPool(size=1)
|
pool = BrowserPool(size=1)
|
||||||
slot = _make_fake_slot()
|
slot = _make_fake_slot()
|
||||||
pool._q.put(slot)
|
_mod._thread_local.slot = slot
|
||||||
|
|
||||||
fresh_slot = _make_fake_slot()
|
fresh_slot = _make_fake_slot()
|
||||||
|
|
||||||
with (
|
with (
|
||||||
patch.object(pool, "_fetch_with_slot", return_value="<html>ok</html>") as mock_fetch,
|
patch.object(pool, "_fetch_with_slot", return_value="<html>ok</html>") as mock_fetch,
|
||||||
patch("app.platforms.ebay.browser_pool._replenish_slot", return_value=fresh_slot) as mock_replenish,
|
patch("app.platforms.ebay.browser_pool._replenish_slot", return_value=fresh_slot),
|
||||||
|
patch.object(pool, "_register_slot") as mock_register,
|
||||||
patch("time.sleep"),
|
patch("time.sleep"),
|
||||||
):
|
):
|
||||||
html = pool.fetch_html("https://www.ebay.com/sch/i.html?_nkw=test", delay=0)
|
html = pool.fetch_html("https://www.ebay.com/sch/i.html?_nkw=test", delay=0)
|
||||||
|
|
@ -157,21 +144,19 @@ class TestFetchHtmlPoolHit:
|
||||||
slot, "https://www.ebay.com/sch/i.html?_nkw=test",
|
slot, "https://www.ebay.com/sch/i.html?_nkw=test",
|
||||||
wait_for_selector=None, wait_for_timeout_ms=2000,
|
wait_for_selector=None, wait_for_timeout_ms=2000,
|
||||||
)
|
)
|
||||||
mock_replenish.assert_called_once_with(slot)
|
mock_register.assert_called_once_with(fresh_slot)
|
||||||
# Fresh slot returned to queue
|
|
||||||
assert pool._q.get_nowait() is fresh_slot
|
|
||||||
|
|
||||||
def test_delay_is_respected(self):
|
def test_delay_is_respected(self):
|
||||||
"""fetch_html must call time.sleep(delay)."""
|
|
||||||
from app.platforms.ebay.browser_pool import BrowserPool
|
from app.platforms.ebay.browser_pool import BrowserPool
|
||||||
|
import app.platforms.ebay.browser_pool as _mod
|
||||||
|
|
||||||
pool = BrowserPool(size=1)
|
pool = BrowserPool(size=1)
|
||||||
slot = _make_fake_slot()
|
_mod._thread_local.slot = _make_fake_slot()
|
||||||
pool._q.put(slot)
|
|
||||||
|
|
||||||
with (
|
with (
|
||||||
patch.object(pool, "_fetch_with_slot", return_value="<html/>"),
|
patch.object(pool, "_fetch_with_slot", return_value="<html/>"),
|
||||||
patch("app.platforms.ebay.browser_pool._replenish_slot", return_value=_make_fake_slot()),
|
patch("app.platforms.ebay.browser_pool._replenish_slot", return_value=_make_fake_slot()),
|
||||||
|
patch.object(pool, "_register_slot"),
|
||||||
patch("app.platforms.ebay.browser_pool.time") as mock_time,
|
patch("app.platforms.ebay.browser_pool.time") as mock_time,
|
||||||
):
|
):
|
||||||
pool.fetch_html("https://example.com", delay=1.5)
|
pool.fetch_html("https://example.com", delay=1.5)
|
||||||
|
|
@ -180,22 +165,19 @@ class TestFetchHtmlPoolHit:
|
||||||
|
|
||||||
|
|
||||||
# ---------------------------------------------------------------------------
|
# ---------------------------------------------------------------------------
|
||||||
# fetch_html — pool empty / fallback path
|
# fetch_html — no slot / fallback path
|
||||||
# ---------------------------------------------------------------------------
|
# ---------------------------------------------------------------------------
|
||||||
|
|
||||||
class TestFetchHtmlFallback:
|
class TestFetchHtmlFallback:
|
||||||
def test_falls_back_to_fresh_browser_when_pool_empty(self):
|
def test_falls_back_when_no_slot_and_playwright_unavailable(self):
|
||||||
"""When pool is empty after timeout, _fetch_fresh should be called."""
|
|
||||||
from app.platforms.ebay.browser_pool import BrowserPool
|
from app.platforms.ebay.browser_pool import BrowserPool
|
||||||
|
|
||||||
pool = BrowserPool(size=1)
|
pool = BrowserPool(size=1)
|
||||||
# Queue is empty — no slots available.
|
# No thread-local slot; playwright unavailable → _get_or_create returns None.
|
||||||
|
|
||||||
with (
|
with (
|
||||||
|
patch.object(pool, "_get_or_create_thread_slot", return_value=None),
|
||||||
patch.object(pool, "_fetch_fresh", return_value="<html>fresh</html>") as mock_fresh,
|
patch.object(pool, "_fetch_fresh", return_value="<html>fresh</html>") as mock_fresh,
|
||||||
patch("time.sleep"),
|
patch("time.sleep"),
|
||||||
# Make Queue.get raise Empty after a short wait.
|
|
||||||
patch.object(pool._q, "get", side_effect=queue.Empty),
|
|
||||||
):
|
):
|
||||||
html = pool.fetch_html("https://www.ebay.com/sch/i.html?_nkw=widget", delay=0)
|
html = pool.fetch_html("https://www.ebay.com/sch/i.html?_nkw=widget", delay=0)
|
||||||
|
|
||||||
|
|
@ -206,17 +188,18 @@ class TestFetchHtmlFallback:
|
||||||
)
|
)
|
||||||
|
|
||||||
def test_falls_back_when_pooled_fetch_raises(self):
|
def test_falls_back_when_pooled_fetch_raises(self):
|
||||||
"""If _fetch_with_slot raises, the slot is closed and _fetch_fresh is used."""
|
|
||||||
from app.platforms.ebay.browser_pool import BrowserPool
|
from app.platforms.ebay.browser_pool import BrowserPool
|
||||||
|
import app.platforms.ebay.browser_pool as _mod
|
||||||
|
|
||||||
pool = BrowserPool(size=1)
|
pool = BrowserPool(size=1)
|
||||||
slot = _make_fake_slot()
|
slot = _make_fake_slot()
|
||||||
pool._q.put(slot)
|
_mod._thread_local.slot = slot
|
||||||
|
|
||||||
with (
|
with (
|
||||||
patch.object(pool, "_fetch_with_slot", side_effect=RuntimeError("Chromium crashed")),
|
patch.object(pool, "_fetch_with_slot", side_effect=RuntimeError("Chromium crashed")),
|
||||||
patch.object(pool, "_fetch_fresh", return_value="<html>recovered</html>") as mock_fresh,
|
patch.object(pool, "_fetch_fresh", return_value="<html>recovered</html>") as mock_fresh,
|
||||||
patch("app.platforms.ebay.browser_pool._close_slot") as mock_close,
|
patch("app.platforms.ebay.browser_pool._close_slot") as mock_close,
|
||||||
|
patch.object(pool, "_unregister_slot"),
|
||||||
patch("time.sleep"),
|
patch("time.sleep"),
|
||||||
):
|
):
|
||||||
html = pool.fetch_html("https://www.ebay.com/", delay=0)
|
html = pool.fetch_html("https://www.ebay.com/", delay=0)
|
||||||
|
|
@ -226,19 +209,107 @@ class TestFetchHtmlFallback:
|
||||||
mock_fresh.assert_called_once()
|
mock_fresh.assert_called_once()
|
||||||
|
|
||||||
|
|
||||||
|
# ---------------------------------------------------------------------------
|
||||||
|
# Thread-local slot management
|
||||||
|
# ---------------------------------------------------------------------------
|
||||||
|
|
||||||
|
class TestThreadLocalSlotManagement:
|
||||||
|
def test_get_or_create_returns_existing_slot(self):
|
||||||
|
import app.platforms.ebay.browser_pool as _mod
|
||||||
|
from app.platforms.ebay.browser_pool import BrowserPool
|
||||||
|
|
||||||
|
pool = BrowserPool(size=1)
|
||||||
|
pool._playwright_available = True
|
||||||
|
existing = _make_fake_slot()
|
||||||
|
_mod._thread_local.slot = existing
|
||||||
|
|
||||||
|
result = pool._get_or_create_thread_slot()
|
||||||
|
assert result is existing
|
||||||
|
|
||||||
|
def test_get_or_create_launches_new_slot_when_absent(self):
|
||||||
|
import app.platforms.ebay.browser_pool as _mod
|
||||||
|
from app.platforms.ebay.browser_pool import BrowserPool
|
||||||
|
|
||||||
|
pool = BrowserPool(size=1)
|
||||||
|
pool._playwright_available = True
|
||||||
|
_mod._thread_local.slot = None
|
||||||
|
|
||||||
|
new_slot = _make_fake_slot()
|
||||||
|
with (
|
||||||
|
patch("app.platforms.ebay.browser_pool._launch_slot", return_value=new_slot),
|
||||||
|
patch.object(pool, "_register_slot") as mock_register,
|
||||||
|
):
|
||||||
|
result = pool._get_or_create_thread_slot()
|
||||||
|
|
||||||
|
assert result is new_slot
|
||||||
|
mock_register.assert_called_once_with(new_slot)
|
||||||
|
|
||||||
|
def test_get_or_create_returns_none_when_playwright_unavailable(self):
|
||||||
|
from app.platforms.ebay.browser_pool import BrowserPool
|
||||||
|
|
||||||
|
pool = BrowserPool(size=1)
|
||||||
|
pool._playwright_available = False
|
||||||
|
assert pool._get_or_create_thread_slot() is None
|
||||||
|
|
||||||
|
def test_register_slot_sets_thread_local_and_registry(self):
|
||||||
|
import app.platforms.ebay.browser_pool as _mod
|
||||||
|
from app.platforms.ebay.browser_pool import BrowserPool
|
||||||
|
|
||||||
|
pool = BrowserPool(size=1)
|
||||||
|
slot = _make_fake_slot()
|
||||||
|
pool._register_slot(slot)
|
||||||
|
|
||||||
|
assert _mod._thread_local.slot is slot
|
||||||
|
assert threading.get_ident() in pool._slot_registry
|
||||||
|
|
||||||
|
def test_unregister_slot_clears_thread_local_and_registry(self):
|
||||||
|
import app.platforms.ebay.browser_pool as _mod
|
||||||
|
from app.platforms.ebay.browser_pool import BrowserPool
|
||||||
|
|
||||||
|
pool = BrowserPool(size=1)
|
||||||
|
slot = _make_fake_slot()
|
||||||
|
pool._register_slot(slot)
|
||||||
|
pool._unregister_slot()
|
||||||
|
|
||||||
|
assert getattr(_mod._thread_local, "slot", None) is None
|
||||||
|
assert threading.get_ident() not in pool._slot_registry
|
||||||
|
|
||||||
|
def test_different_threads_get_independent_slots(self):
|
||||||
|
from app.platforms.ebay.browser_pool import BrowserPool
|
||||||
|
|
||||||
|
pool = BrowserPool(size=2)
|
||||||
|
pool._playwright_available = True
|
||||||
|
|
||||||
|
slots_seen: list = []
|
||||||
|
errors: list = []
|
||||||
|
|
||||||
|
def worker():
|
||||||
|
new_slot = _make_fake_slot()
|
||||||
|
with patch("app.platforms.ebay.browser_pool._launch_slot", return_value=new_slot):
|
||||||
|
s = pool._get_or_create_thread_slot()
|
||||||
|
slots_seen.append(s)
|
||||||
|
|
||||||
|
t1 = threading.Thread(target=worker)
|
||||||
|
t2 = threading.Thread(target=worker)
|
||||||
|
t1.start(); t2.start()
|
||||||
|
t1.join(); t2.join()
|
||||||
|
|
||||||
|
assert len(slots_seen) == 2
|
||||||
|
# Each thread got its own slot object (they may differ or coincidentally share
|
||||||
|
# the same mock; what matters is both threads succeeded without interference).
|
||||||
|
assert all(s is not None for s in slots_seen)
|
||||||
|
|
||||||
|
|
||||||
# ---------------------------------------------------------------------------
|
# ---------------------------------------------------------------------------
|
||||||
# ImportError graceful fallback
|
# ImportError graceful fallback
|
||||||
# ---------------------------------------------------------------------------
|
# ---------------------------------------------------------------------------
|
||||||
|
|
||||||
class TestImportErrorHandling:
|
class TestImportErrorHandling:
|
||||||
def test_check_playwright_returns_false_on_import_error(self):
|
def test_check_playwright_returns_false_on_import_error(self):
|
||||||
"""_check_playwright should cache False when playwright is not installed."""
|
|
||||||
from app.platforms.ebay.browser_pool import BrowserPool
|
from app.platforms.ebay.browser_pool import BrowserPool
|
||||||
|
|
||||||
pool = BrowserPool(size=2)
|
pool = BrowserPool(size=2)
|
||||||
|
|
||||||
with patch.dict("sys.modules", {"playwright": None, "playwright_stealth": None}):
|
with patch.dict("sys.modules", {"playwright": None, "playwright_stealth": None}):
|
||||||
# Force re-check by clearing the cached value.
|
|
||||||
pool._playwright_available = None
|
pool._playwright_available = None
|
||||||
result = pool._check_playwright()
|
result = pool._check_playwright()
|
||||||
|
|
||||||
|
|
@ -246,12 +317,11 @@ class TestImportErrorHandling:
|
||||||
assert pool._playwright_available is False
|
assert pool._playwright_available is False
|
||||||
|
|
||||||
def test_start_logs_warning_when_playwright_missing(self, caplog):
|
def test_start_logs_warning_when_playwright_missing(self, caplog):
|
||||||
"""start() should log a warning and not crash when Playwright is absent."""
|
|
||||||
import logging
|
import logging
|
||||||
from app.platforms.ebay.browser_pool import BrowserPool
|
from app.platforms.ebay.browser_pool import BrowserPool
|
||||||
|
|
||||||
pool = BrowserPool(size=1)
|
pool = BrowserPool(size=1)
|
||||||
pool._playwright_available = False # simulate missing
|
pool._playwright_available = False
|
||||||
|
|
||||||
with patch.object(pool, "_check_playwright", return_value=False):
|
with patch.object(pool, "_check_playwright", return_value=False):
|
||||||
with caplog.at_level(logging.WARNING, logger="app.platforms.ebay.browser_pool"):
|
with caplog.at_level(logging.WARNING, logger="app.platforms.ebay.browser_pool"):
|
||||||
|
|
@ -260,87 +330,14 @@ class TestImportErrorHandling:
|
||||||
assert any("not available" in r.message for r in caplog.records)
|
assert any("not available" in r.message for r in caplog.records)
|
||||||
|
|
||||||
def test_fetch_fresh_raises_runtime_error_when_playwright_missing(self):
|
def test_fetch_fresh_raises_runtime_error_when_playwright_missing(self):
|
||||||
"""_fetch_fresh must raise RuntimeError (not ImportError) when PW absent."""
|
|
||||||
from app.platforms.ebay.browser_pool import BrowserPool
|
from app.platforms.ebay.browser_pool import BrowserPool
|
||||||
|
|
||||||
pool = BrowserPool(size=1)
|
pool = BrowserPool(size=1)
|
||||||
|
|
||||||
with patch.dict("sys.modules", {"playwright": None, "playwright.sync_api": None}):
|
with patch.dict("sys.modules", {"playwright": None, "playwright.sync_api": None}):
|
||||||
with pytest.raises(RuntimeError, match="Playwright not installed"):
|
with pytest.raises(RuntimeError, match="Playwright not installed"):
|
||||||
pool._fetch_fresh("https://www.ebay.com/")
|
pool._fetch_fresh("https://www.ebay.com/")
|
||||||
|
|
||||||
|
|
||||||
# ---------------------------------------------------------------------------
|
|
||||||
# Idle cleanup
|
|
||||||
# ---------------------------------------------------------------------------
|
|
||||||
|
|
||||||
class TestIdleCleanup:
|
|
||||||
def test_idle_cleanup_closes_stale_slots(self):
|
|
||||||
"""_idle_cleanup_loop should close slots whose last_used_ts is too old."""
|
|
||||||
from app.platforms.ebay.browser_pool import BrowserPool, _IDLE_TIMEOUT_SECS
|
|
||||||
|
|
||||||
pool = BrowserPool(size=2)
|
|
||||||
|
|
||||||
stale_slot = _make_fake_slot()
|
|
||||||
stale_slot.last_used_ts = time.time() - (_IDLE_TIMEOUT_SECS + 60)
|
|
||||||
|
|
||||||
fresh_slot = _make_fake_slot()
|
|
||||||
fresh_slot.last_used_ts = time.time()
|
|
||||||
|
|
||||||
pool._q.put(stale_slot)
|
|
||||||
pool._q.put(fresh_slot)
|
|
||||||
|
|
||||||
closed_slots = []
|
|
||||||
|
|
||||||
def fake_close(s):
|
|
||||||
closed_slots.append(s)
|
|
||||||
|
|
||||||
with patch("app.platforms.ebay.browser_pool._close_slot", side_effect=fake_close):
|
|
||||||
# Run one cleanup tick directly (not the full loop).
|
|
||||||
now = time.time()
|
|
||||||
idle_cutoff = now - _IDLE_TIMEOUT_SECS
|
|
||||||
kept = []
|
|
||||||
while True:
|
|
||||||
try:
|
|
||||||
s = pool._q.get_nowait()
|
|
||||||
except queue.Empty:
|
|
||||||
break
|
|
||||||
if s.last_used_ts < idle_cutoff:
|
|
||||||
fake_close(s)
|
|
||||||
else:
|
|
||||||
kept.append(s)
|
|
||||||
for s in kept:
|
|
||||||
pool._q.put(s)
|
|
||||||
|
|
||||||
assert stale_slot in closed_slots
|
|
||||||
assert fresh_slot not in closed_slots
|
|
||||||
assert pool._q.qsize() == 1
|
|
||||||
|
|
||||||
def test_idle_cleanup_loop_stops_when_pool_stopped(self):
|
|
||||||
"""Cleanup daemon should exit when _stopped is True."""
|
|
||||||
from app.platforms.ebay.browser_pool import BrowserPool, _CLEANUP_INTERVAL_SECS
|
|
||||||
|
|
||||||
pool = BrowserPool(size=1)
|
|
||||||
pool._stopped = True
|
|
||||||
|
|
||||||
# The loop should return after one iteration of the while check.
|
|
||||||
# Use a very short sleep mock so the test doesn't actually wait 60s.
|
|
||||||
sleep_calls = []
|
|
||||||
|
|
||||||
def fake_sleep(secs):
|
|
||||||
sleep_calls.append(secs)
|
|
||||||
|
|
||||||
with patch("app.platforms.ebay.browser_pool.time") as mock_time:
|
|
||||||
mock_time.time.return_value = time.time()
|
|
||||||
mock_time.sleep.side_effect = fake_sleep
|
|
||||||
# Run in a thread with a short timeout to confirm it exits.
|
|
||||||
t = threading.Thread(target=pool._idle_cleanup_loop)
|
|
||||||
t.start()
|
|
||||||
t.join(timeout=2.0)
|
|
||||||
|
|
||||||
assert not t.is_alive(), "idle cleanup loop did not exit when _stopped=True"
|
|
||||||
|
|
||||||
|
|
||||||
# ---------------------------------------------------------------------------
|
# ---------------------------------------------------------------------------
|
||||||
# _replenish_slot helper
|
# _replenish_slot helper
|
||||||
# ---------------------------------------------------------------------------
|
# ---------------------------------------------------------------------------
|
||||||
|
|
@ -355,12 +352,8 @@ class TestReplenishSlot:
|
||||||
browser.new_context.return_value = new_ctx
|
browser.new_context.return_value = new_ctx
|
||||||
|
|
||||||
slot = _PooledBrowser(
|
slot = _PooledBrowser(
|
||||||
xvfb=MagicMock(),
|
xvfb=MagicMock(), pw=MagicMock(), browser=browser,
|
||||||
pw=MagicMock(),
|
ctx=old_ctx, display_num=101, last_used_ts=time.time() - 10,
|
||||||
browser=browser,
|
|
||||||
ctx=old_ctx,
|
|
||||||
display_num=101,
|
|
||||||
last_used_ts=time.time() - 10,
|
|
||||||
)
|
)
|
||||||
|
|
||||||
result = _replenish_slot(slot)
|
result = _replenish_slot(slot)
|
||||||
|
|
@ -370,7 +363,6 @@ class TestReplenishSlot:
|
||||||
assert result.ctx is new_ctx
|
assert result.ctx is new_ctx
|
||||||
assert result.browser is browser
|
assert result.browser is browser
|
||||||
assert result.xvfb is slot.xvfb
|
assert result.xvfb is slot.xvfb
|
||||||
# last_used_ts is refreshed
|
|
||||||
assert result.last_used_ts > slot.last_used_ts
|
assert result.last_used_ts > slot.last_used_ts
|
||||||
|
|
||||||
|
|
||||||
|
|
@ -391,7 +383,6 @@ class TestCloseSlot:
|
||||||
xvfb=xvfb, pw=pw, browser=browser, ctx=ctx,
|
xvfb=xvfb, pw=pw, browser=browser, ctx=ctx,
|
||||||
display_num=102, last_used_ts=time.time(),
|
display_num=102, last_used_ts=time.time(),
|
||||||
)
|
)
|
||||||
|
|
||||||
_close_slot(slot)
|
_close_slot(slot)
|
||||||
|
|
||||||
ctx.close.assert_called_once()
|
ctx.close.assert_called_once()
|
||||||
|
|
@ -401,7 +392,6 @@ class TestCloseSlot:
|
||||||
xvfb.wait.assert_called_once()
|
xvfb.wait.assert_called_once()
|
||||||
|
|
||||||
def test_close_slot_ignores_exceptions(self):
|
def test_close_slot_ignores_exceptions(self):
|
||||||
"""_close_slot must not raise even if components throw."""
|
|
||||||
from app.platforms.ebay.browser_pool import _close_slot, _PooledBrowser
|
from app.platforms.ebay.browser_pool import _close_slot, _PooledBrowser
|
||||||
|
|
||||||
xvfb = MagicMock(spec=subprocess.Popen)
|
xvfb = MagicMock(spec=subprocess.Popen)
|
||||||
|
|
@ -418,7 +408,6 @@ class TestCloseSlot:
|
||||||
xvfb=xvfb, pw=pw, browser=browser, ctx=ctx,
|
xvfb=xvfb, pw=pw, browser=browser, ctx=ctx,
|
||||||
display_num=103, last_used_ts=time.time(),
|
display_num=103, last_used_ts=time.time(),
|
||||||
)
|
)
|
||||||
|
|
||||||
_close_slot(slot) # must not raise
|
_close_slot(slot) # must not raise
|
||||||
|
|
||||||
|
|
||||||
|
|
@ -428,7 +417,6 @@ class TestCloseSlot:
|
||||||
|
|
||||||
class TestScraperUsesPool:
|
class TestScraperUsesPool:
|
||||||
def test_fetch_url_delegates_to_pool(self):
|
def test_fetch_url_delegates_to_pool(self):
|
||||||
"""ScrapedEbayAdapter._fetch_url must use the pool, not launch its own browser."""
|
|
||||||
from app.platforms.ebay.browser_pool import BrowserPool
|
from app.platforms.ebay.browser_pool import BrowserPool
|
||||||
from app.platforms.ebay.scraper import ScrapedEbayAdapter
|
from app.platforms.ebay.scraper import ScrapedEbayAdapter
|
||||||
from app.db.store import Store
|
from app.db.store import Store
|
||||||
|
|
@ -440,7 +428,6 @@ class TestScraperUsesPool:
|
||||||
fake_pool.fetch_html.return_value = "<html>pooled</html>"
|
fake_pool.fetch_html.return_value = "<html>pooled</html>"
|
||||||
|
|
||||||
with patch("app.platforms.ebay.browser_pool.get_pool", return_value=fake_pool):
|
with patch("app.platforms.ebay.browser_pool.get_pool", return_value=fake_pool):
|
||||||
# Clear the cache so fetch_url actually hits the pool.
|
|
||||||
import app.platforms.ebay.scraper as scraper_mod
|
import app.platforms.ebay.scraper as scraper_mod
|
||||||
scraper_mod._html_cache.clear()
|
scraper_mod._html_cache.clear()
|
||||||
html = adapter._fetch_url("https://www.ebay.com/sch/i.html?_nkw=test")
|
html = adapter._fetch_url("https://www.ebay.com/sch/i.html?_nkw=test")
|
||||||
|
|
@ -451,7 +438,6 @@ class TestScraperUsesPool:
|
||||||
)
|
)
|
||||||
|
|
||||||
def test_fetch_url_uses_cache_before_pool(self):
|
def test_fetch_url_uses_cache_before_pool(self):
|
||||||
"""_fetch_url should return cached HTML without hitting the pool."""
|
|
||||||
from app.platforms.ebay.scraper import ScrapedEbayAdapter, _html_cache, _HTML_CACHE_TTL
|
from app.platforms.ebay.scraper import ScrapedEbayAdapter, _html_cache, _HTML_CACHE_TTL
|
||||||
from app.db.store import Store
|
from app.db.store import Store
|
||||||
|
|
||||||
|
|
@ -467,6 +453,4 @@ class TestScraperUsesPool:
|
||||||
|
|
||||||
assert html == "<html>cached</html>"
|
assert html == "<html>cached</html>"
|
||||||
fake_pool.fetch_html.assert_not_called()
|
fake_pool.fetch_html.assert_not_called()
|
||||||
|
|
||||||
# Cleanup
|
|
||||||
_html_cache.pop(url, None)
|
_html_cache.pop(url, None)
|
||||||
|
|
|
||||||
|
|
@ -296,3 +296,37 @@ def test_non_retailer_does_not_suppress_duplicate_photo():
|
||||||
)
|
)
|
||||||
result = agg.aggregate(_ALL_20.copy(), photo_hash_duplicate=True, seller=seller)
|
result = agg.aggregate(_ALL_20.copy(), photo_hash_duplicate=True, seller=seller)
|
||||||
assert "duplicate_photo" in result.red_flags_json
|
assert "duplicate_photo" in result.red_flags_json
|
||||||
|
|
||||||
|
|
||||||
|
# ── #52: buyer-only / returning seller (ratio=0.0, count>0) ──────────────────
|
||||||
|
|
||||||
|
def test_zero_ratio_with_count_gives_no_recent_seller_data_flag():
|
||||||
|
"""Seller with 117 lifetime feedbacks (buyer-only) has ratio=0.0 parsed from page.
|
||||||
|
Must get no_recent_seller_data soft flag, NOT established_bad_actor."""
|
||||||
|
agg = Aggregator()
|
||||||
|
scores = {k: 10 for k in ["account_age", "feedback_count",
|
||||||
|
"feedback_ratio", "price_vs_market", "category_history"]}
|
||||||
|
buyer_only = Seller(
|
||||||
|
platform="ebay", platform_seller_id="u", username="jjcpryz",
|
||||||
|
account_age_days=1200, feedback_count=117, feedback_ratio=0.0,
|
||||||
|
category_history_json="{}",
|
||||||
|
)
|
||||||
|
result = agg.aggregate(scores, photo_hash_duplicate=False, seller=buyer_only)
|
||||||
|
assert "no_recent_seller_data" in result.red_flags_json
|
||||||
|
assert "established_bad_actor" not in result.red_flags_json
|
||||||
|
|
||||||
|
|
||||||
|
|
||||||
|
def test_established_bad_actor_still_fires_for_genuinely_bad_ratio():
|
||||||
|
"""ratio=0.75 (not zero) with moderate count → established_bad_actor still fires."""
|
||||||
|
agg = Aggregator()
|
||||||
|
scores = {k: 10 for k in ["account_age", "feedback_count",
|
||||||
|
"feedback_ratio", "price_vs_market", "category_history"]}
|
||||||
|
bad = Seller(
|
||||||
|
platform="ebay", platform_seller_id="u", username="u",
|
||||||
|
account_age_days=500, feedback_count=100, feedback_ratio=0.75,
|
||||||
|
category_history_json="{}",
|
||||||
|
)
|
||||||
|
result = agg.aggregate(scores, photo_hash_duplicate=False, seller=bad)
|
||||||
|
assert "established_bad_actor" in result.red_flags_json
|
||||||
|
assert "no_recent_seller_data" not in result.red_flags_json
|
||||||
|
|
|
||||||
|
|
@ -43,3 +43,26 @@ def test_no_market_data_returns_none():
|
||||||
scores = scorer.score(_seller(), market_median=None, listing_price=950.0)
|
scores = scorer.score(_seller(), market_median=None, listing_price=950.0)
|
||||||
# None signals "data unavailable" — aggregator will set score_is_partial=True
|
# None signals "data unavailable" — aggregator will set score_is_partial=True
|
||||||
assert scores["price_vs_market"] is None
|
assert scores["price_vs_market"] is None
|
||||||
|
|
||||||
|
|
||||||
|
def test_zero_ratio_with_nonzero_count_returns_none():
|
||||||
|
"""ratio=0.0 with count>0 means eBay didn't show a 12-month percentage.
|
||||||
|
Must return None (missing data) not 0 (catastrophically bad)."""
|
||||||
|
scorer = MetadataScorer()
|
||||||
|
scores = scorer.score(
|
||||||
|
_seller(feedback_ratio=0.0, feedback_count=117),
|
||||||
|
market_median=None, listing_price=500.0,
|
||||||
|
)
|
||||||
|
assert scores["feedback_ratio"] is None
|
||||||
|
|
||||||
|
|
||||||
|
def test_zero_ratio_with_zero_count_scores_low():
|
||||||
|
"""feedback_ratio=0.0 with count=0 is a real 'no data at all' case, not missing."""
|
||||||
|
scorer = MetadataScorer()
|
||||||
|
scores = scorer.score(
|
||||||
|
_seller(feedback_ratio=0.0, feedback_count=0),
|
||||||
|
market_median=None, listing_price=500.0,
|
||||||
|
)
|
||||||
|
# count=0 means zero_feedback; ratio=0 with count=0 is the standard no-history path
|
||||||
|
# (not the "missing 12-month window" path)
|
||||||
|
assert scores["feedback_ratio"] == 5 # ratio < 0.90 → 5
|
||||||
|
|
|
||||||
Loading…
Reference in a new issue