magpie/app/main.py
Alan Weinstock 80718e206c feat(#7,#10): signal crawler -- Reddit + Lemmy community monitoring
Implements the full signal detection pipeline:

Backend:
- app/services/lemmy/client.py: async Lemmy API v3 client, community@instance
  addressing, integer cursor dedup, normalised post dicts
- app/services/scraper.py: platform-agnostic scraper; Reddit (.json API,
  fullname cursor) + Lemmy (integer ID cursor); keyword/regex/all match modes,
  min_score gate, NormalizedPost shape, upsert dedup via UNIQUE post_id
- app/api/endpoints/signals.py: CRUD for signal_rules + signals queue;
  POST /signals/scrape manual trigger; scrape-state viewer
- migrations 010-012: signal_rules, signals, signal_scrape_state tables
- scheduler: interval job every 30 min (scraper_enabled=True in config)
- Fixed migration collision: 007_signal_rules.sql → 010, 008 → 011, 009 → 012
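
The match modes and min_score gate listed above can be sketched roughly as follows. This is a minimal illustration, not the code in app/services/scraper.py: the `Rule` dataclass, its field names, and the `matches` helper are hypothetical. It also assumes that the `all` mode means "accept every post that passes the score gate", which the commit message does not spell out.

```python
import re
from dataclasses import dataclass, field
from typing import List


@dataclass
class Rule:
    """Hypothetical rule shape; field names are assumptions, not the real schema."""
    match_mode: str = "keyword"  # one of "keyword", "regex", "all"
    keywords: List[str] = field(default_factory=list)
    pattern: str = ""
    min_score: int = 0


def matches(rule: Rule, title: str, body: str, score: int) -> bool:
    """Return True if a post passes the score gate and the rule's text filter."""
    # Score gate first: low-scoring posts are rejected in every mode.
    if score < rule.min_score:
        return False
    text = f"{title}\n{body}"
    if rule.match_mode == "all":
        # Assumed semantics: no text filter at all.
        return True
    if rule.match_mode == "keyword":
        # Case-insensitive substring match on any keyword.
        lowered = text.lower()
        return any(kw.lower() in lowered for kw in rule.keywords)
    if rule.match_mode == "regex":
        return re.search(rule.pattern, text, re.IGNORECASE) is not None
    raise ValueError(f"unknown match_mode: {rule.match_mode!r}")
```

Checking the score before the text keeps the cheap comparison ahead of the regex search; the real scraper may order these differently.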

Frontend:
- SignalsView.vue: signal feed with status filter (new/saved/dismissed),
  keyword chips, score/comment counts, save/dismiss actions, rules editor panel
- api.ts: SignalRule, Signal types + signalRules/signals API methods
- Nav: Signals as default landing route (replaces /campaigns default)

Closes #7 (signal extraction), closes #10 (Lemmy JSON crawler)
2026-04-22 11:00:14 -07:00
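
The "dedup via UNIQUE post_id" idea from the commit message can be illustrated with a small standalone sketch. It assumes a SQLite backend, which the source does not confirm, and a simplified, hypothetical `signals` table. It also uses `INSERT OR IGNORE` rather than a true upsert, so a re-scraped post is skipped rather than updated; the actual migrations and `Store` methods may differ.

```python
import sqlite3


def save_posts(conn: sqlite3.Connection, posts: list) -> int:
    """Insert posts, skipping any post_id already stored; return the count of new rows."""
    inserted = 0
    for post in posts:
        cur = conn.execute(
            "INSERT OR IGNORE INTO signals (post_id, title) VALUES (?, ?)",
            (post["post_id"], post["title"]),
        )
        # rowcount is 1 for a new row and 0 when the UNIQUE constraint ignored it.
        inserted += cur.rowcount
    conn.commit()
    return inserted


# Hypothetical, simplified schema for illustration only.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE signals ("
    "id INTEGER PRIMARY KEY, "
    "post_id TEXT NOT NULL UNIQUE, "
    "title TEXT NOT NULL)"
)
```

Because the constraint lives in the schema, overlapping scrapes (for example, a manual trigger racing the 30-minute job) cannot create duplicate rows even if the cursor state is stale.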

79 lines
2.2 KiB
Python

from __future__ import annotations

import logging
from contextlib import asynccontextmanager

from fastapi import FastAPI
from fastapi.middleware.cors import CORSMiddleware

from app.api.routes import register_routes
from app.core.config import get_settings
from app.db.store import Store
from app.services.scheduler import (
    start_scheduler, stop_scheduler, sync_all_campaigns,
    start_scraper_job,
)

logger = logging.getLogger(__name__)


@asynccontextmanager
async def lifespan(app: FastAPI):
    settings = get_settings()

    # Run DB migrations
    store = Store(settings.db_path)
    store.run_migrations()

    # Boot scheduler and register all active campaigns
    if settings.scheduler_enabled:
        sched = start_scheduler()
        app.state.scheduler = sched
        campaigns = store.list_campaigns(active_only=True)
        sync_all_campaigns(campaigns)
        logger.info("Magpie started — %d campaign(s) scheduled", len(campaigns))
    else:
        app.state.scheduler = None
        logger.info("Magpie started — scheduler disabled")

    # Start signal scraper job
    if settings.scraper_enabled:
        if not settings.scheduler_enabled:
            # Scraper needs the scheduler even if campaign scheduling is off
            start_scheduler()
        start_scraper_job(interval_mins=settings.scraper_interval_mins)
        logger.info("Signal scraper scheduled every %d min", settings.scraper_interval_mins)

    store.close()

    yield

    # Graceful shutdown
    stop_scheduler()


def create_app() -> FastAPI:
    settings = get_settings()

    app = FastAPI(
        title="Magpie",
        description="CircuitForge cross-product social media management",
        version="0.1.0",
        lifespan=lifespan,
    )

    app.add_middleware(
        CORSMiddleware,
        allow_origins=["http://localhost:8531", "http://0.0.0.0:8531"],
        allow_credentials=True,
        allow_methods=["*"],
        allow_headers=["*"],
    )

    register_routes(app)
    return app


app = create_app()


if __name__ == "__main__":
    import uvicorn

    settings = get_settings()
    uvicorn.run("app.main:app", host=settings.api_host, port=settings.api_port, reload=settings.debug)