magpie/app/core/config.py
Alan Weinstock 80718e206c feat(#7,#10): signal crawler -- Reddit + Lemmy community monitoring
Implements the full signal detection pipeline:

Backend:
- app/services/lemmy/client.py: async Lemmy API v3 client, community@instance
  addressing, integer cursor dedup, normalised post dicts
- app/services/scraper.py: platform-agnostic scraper; Reddit (.json API,
  fullname cursor) + Lemmy (integer ID cursor); keyword/regex/all match modes,
  min_score gate, NormalizedPost shape, upsert dedup via UNIQUE post_id
- app/api/endpoints/signals.py: CRUD for signal_rules + signals queue;
  POST /signals/scrape manual trigger; scrape-state viewer
- migrations 010-012: signal_rules, signals, signal_scrape_state tables
- scheduler: interval job every 30 min (scraper_enabled=True in config)
- Fixed migration collision: 007_signal_rules.sql → 010, 008 → 011, 009 → 012

Frontend:
- SignalsView.vue: signal feed with status filter (new/saved/dismissed),
  keyword chips, score/comment counts, save/dismiss actions, rules editor panel
- api.ts: SignalRule, Signal types + signalRules/signals API methods
- Nav: Signals as default landing route (replaces /campaigns default)

Closes #7 (signal extraction), closes #10 (Lemmy JSON crawler)
2026-04-22 11:00:14 -07:00

33 lines
1.1 KiB
Python

from __future__ import annotations
from pathlib import Path
from pydantic_settings import BaseSettings, SettingsConfigDict
class Settings(BaseSettings):
model_config = SettingsConfigDict(env_file=".env", env_file_encoding="utf-8", extra="ignore")
# Server
api_host: str = "0.0.0.0"
api_port: int = 8532
debug: bool = False
# Database
db_path: str = str(Path.home() / ".local" / "share" / "magpie" / "magpie.db")
# Reddit session
reddit_session_file: str = str(Path.home() / ".local" / "share" / "magpie" / "session.json")
# Scheduler
scheduler_enabled: bool = True
# Signal scraper
scraper_enabled: bool = True
scraper_interval_mins: int = 30 # how often to poll (per full pass of all subs)
scraper_request_delay_secs: float = 2.0 # pause between sub requests to respect rate limits
scraper_fetch_limit: int = 25 # posts to fetch per sub per run (max 100)
scraper_user_agent: str = "Magpie/0.1 signal-monitor (by CircuitForge)"
def get_settings() -> Settings:
return Settings()