feat: Lemmy JSON API crawler for signal extraction #10
Summary
Build a crawler that polls Lemmy communities via the public JSON API (/api/v3/) to detect signal threads worth replying to, and feeds them into the opportunities queue automatically.
Why
Lemmy instances expose a full public REST API with no bot detection. Unlike Reddit (Playwright scraping), Lemmy signal extraction can be lightweight, reliable, and fast -- pure HTTP, no browser required.
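To make the "pure HTTP, no browser" point concrete, here is a minimal sketch of a poll against the post/list endpoint using only the Python standard library. The function names, the User-Agent string, and the injectable `fetch` parameter are illustrative assumptions, not part of the planned module; the response shape (a top-level `posts` array whose items carry a `post` object) is how Lemmy v3 is understood to answer this call.

```python
import json
import urllib.parse
import urllib.request


def build_post_list_url(instance: str, community: str, limit: int = 50) -> str:
    """Return the post/list URL for one community on one instance."""
    query = urllib.parse.urlencode(
        {"community_name": community, "sort": "New", "limit": limit}
    )
    return f"https://{instance}/api/v3/post/list?{query}"


def http_get_json(url: str, timeout: float = 10.0) -> dict:
    """Plain HTTP GET that decodes a JSON body; no browser, no auth."""
    request = urllib.request.Request(
        url, headers={"User-Agent": "lemmy-crawler-sketch"}  # hypothetical UA
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)


def list_new_posts(instance, community, limit=50, fetch=http_get_json):
    """Fetch the newest posts; `fetch` is injectable so tests can skip the network."""
    payload = fetch(build_post_list_url(instance, community, limit))
    # Each item is a post view; keep only the nested post record.
    return [item["post"] for item in payload.get("posts", [])]
```

Passing a fake `fetch` keeps unit tests offline, while the default performs a single unauthenticated GET per poll.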
Scope
- app/services/lemmy/ module: client wrapping GET /api/v3/community, GET /api/v3/post/list, GET /api/v3/post
- […] technology@reddthat.com, selfhosted@lemmy.world)
- […] store.create_opportunity() on match
- […] thread_url
- […] poll_lemmy_community for on-demand scan
API reference
Lemmy API v3 is public and unauthenticated for read operations:
- GET /api/v3/post/list?community_name=<name>&sort=New&limit=50
- GET /api/v3/post?id=<id> (full post + comments)
Instances to support initially
Notes
- […] instance + community separately in community field as <community>@<instance> convention
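A small sketch of how the `<community>@<instance>` convention might be parsed and rendered, assuming the note means the two parts are tracked separately but serialized into one field. `CommunityRef` and `parse_community_ref` are hypothetical names introduced for illustration only.

```python
from typing import NamedTuple


class CommunityRef(NamedTuple):
    """Hypothetical value type holding the two parts separately."""

    community: str
    instance: str

    def __str__(self) -> str:
        # Serialize back to the <community>@<instance> convention.
        return f"{self.community}@{self.instance}"


def parse_community_ref(value: str) -> CommunityRef:
    """Split '<community>@<instance>' into its two parts, rejecting malformed input."""
    community, sep, instance = value.strip().partition("@")
    if not sep or not community or not instance or "@" in instance:
        raise ValueError(f"expected '<community>@<instance>', got {value!r}")
    # Host names are case-insensitive, so normalize the instance part.
    return CommunityRef(community, instance.lower())
```

For example, `parse_community_ref("selfhosted@lemmy.world")` yields a `CommunityRef` with `community="selfhosted"` and `instance="lemmy.world"`, and `str()` on it round-trips to the original string.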