Reddit Scraper
Overview
This skill fetches and displays the latest posts from any public subreddit. It uses Reddit's public JSON endpoints (appending .json to a subreddit URL) to retrieve data without authentication credentials, which keeps it lightweight for simple retrieval tasks.
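As a rough illustration of the .json convention described above, the sketch below builds a listing URL for a subreddit. The helper name build_listing_url is hypothetical and not taken from the skill's scripts. It assumes the time filter is passed as the t query parameter, which Reddit's listing endpoints apply to the top and controversial sorts.

```python
from urllib.parse import urlencode


def build_listing_url(subreddit: str, sort: str = "new",
                      time: str = "all", limit: int = 3) -> str:
    """Return a public JSON listing URL for a subreddit (hypothetical helper)."""
    base = f"https://www.reddit.com/r/{subreddit}/{sort}.json"
    params = {"limit": limit}
    # Assumption: the time filter only matters for 'top' and 'controversial'.
    if sort in ("top", "controversial"):
        params["t"] = time
    return f"{base}?{urlencode(params)}"


print(build_listing_url("Singularity", sort="top", time="week", limit=3))
# prints: https://www.reddit.com/r/Singularity/top.json?limit=3&t=week
```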
Capabilities
- Fetch Latest Posts: Retrieve the newest posts, 3 by default (title, author, score, URL, content preview).
- CLI Chat Interface: A simple interactive loop for querying several subreddits in a row.
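To make the listed fields concrete, here is a minimal sketch of pulling them out of a Reddit listing response. The function extract_posts and the 200-character preview length are assumptions for illustration; the keys title, author, score, url, and selftext are the ones found in Reddit listing JSON, but which of them the skill's scripts actually read is not confirmed here.

```python
from typing import Any


def extract_posts(listing: dict[str, Any], preview_chars: int = 200) -> list[dict[str, Any]]:
    """Flatten a Reddit listing into a list of small post dicts (hypothetical helper)."""
    posts = []
    for child in listing.get("data", {}).get("children", []):
        data = child.get("data", {})
        body = data.get("selftext") or ""  # link posts have an empty body
        posts.append({
            "title": data.get("title", ""),
            "author": data.get("author", ""),
            "score": data.get("score", 0),
            "url": data.get("url", ""),
            "preview": body[:preview_chars],
        })
    return posts


# Hand-written sample shaped like a listing response; not real Reddit data.
sample = {
    "data": {
        "children": [
            {"data": {"title": "Hello", "author": "alice", "score": 42,
                      "url": "https://example.com/post", "selftext": "First post body"}},
        ]
    }
}

for post in extract_posts(sample):
    print(post["title"], post["author"], post["score"], post["url"])
```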
Usage
1. Interactive Chat Mode
To start an interactive session where you can query subreddits one by one:
python3 .agent/skills/reddit-scraper/scripts/chat_interface.py
2. Direct Query (Single Subreddit)
To fetch posts from a specific subreddit directly (here, the top 3 posts of the past week from Singularity):
python3 .agent/skills/reddit-scraper/scripts/fetch_posts.py Singularity --sort top --time week --limit 3
Arguments:
- subreddit: Name of the subreddit.
- --sort: Sort order (new, hot, top, rising, controversial). Default: new.
- --time: Time filter (hour, day, week, month, year, all). Default: all.
- --limit: Number of results. Default: 3.
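The arguments above could be parsed roughly as follows. This is a sketch using the standard argparse module, not the actual contents of fetch_posts.py; the function name build_parser is hypothetical, and only the option names, choices, and defaults come from the list above.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Build a parser mirroring the documented arguments (illustrative only)."""
    parser = argparse.ArgumentParser(description="Fetch posts from a subreddit.")
    parser.add_argument("subreddit", help="Name of the subreddit")
    parser.add_argument("--sort", default="new",
                        choices=["new", "hot", "top", "rising", "controversial"])
    parser.add_argument("--time", default="all",
                        choices=["hour", "day", "week", "month", "year", "all"])
    parser.add_argument("--limit", type=int, default=3)
    return parser


# Parse the same arguments as the example command above.
args = build_parser().parse_args(
    ["Singularity", "--sort", "top", "--time", "week", "--limit", "3"]
)
print(args.subreddit, args.sort, args.time, args.limit)
# prints: Singularity top week 3
```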
3. Advanced Search (Filtered by Query)
To search for posts within a subreddit with specific filters (query, time, sort):
python3 .agent/skills/reddit-scraper/scripts/fetch_filtered_posts.py Singularity "women's fashion" --sort top --time week --limit 3
Arguments:
- subreddit: Name of the subreddit.
- query: Search term.
- --sort: Sort order (relevance, hot, top, new, comments). Default: top.
- --time: Time filter (hour, day, week, month, year, all). Default: week.
- --limit: Number of results. Default: 3.
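A filtered search like the one above maps naturally onto Reddit's subreddit search endpoint. The sketch below shows one way the URL might be assembled; build_search_url is a hypothetical name, and the use of search.json with restrict_sr=1 (to keep results inside the subreddit) is an assumption about how fetch_filtered_posts.py works, not something confirmed by its source.

```python
from urllib.parse import urlencode


def build_search_url(subreddit: str, query: str, sort: str = "top",
                     time: str = "week", limit: int = 3) -> str:
    """Return a subreddit-restricted search URL (hypothetical helper)."""
    params = {
        "q": query,
        "restrict_sr": 1,  # limit results to this subreddit
        "sort": sort,
        "t": time,
        "limit": limit,
    }
    # urlencode percent-escapes the query, so quotes and spaces are safe.
    return f"https://www.reddit.com/r/{subreddit}/search.json?{urlencode(params)}"


print(build_search_url("Singularity", "women's fashion"))
```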
Dependencies
This skill requires the requests library.
Check if it's installed:
pip list | grep requests
Install if missing:
pip install requests
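The same check can be done from Python, which may be handy on systems without grep. This is a small sketch using the standard importlib module; the helper name is_installed is hypothetical.

```python
import importlib.util


def is_installed(module_name: str) -> bool:
    """Return True if a top-level module can be located without importing it."""
    return importlib.util.find_spec(module_name) is not None


print("requests installed:", is_installed("requests"))
```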
Implementation Details
The scripts are located in scripts/:
- fetch_posts.py: Contains the get_latest_posts function and the CLI entry point.
- fetch_filtered_posts.py: Helper script for searching posts with filters (query, time, sort).
- chat_interface.py: Imports fetch_posts and runs the input loop.
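The input loop in chat_interface.py is not reproduced here, but a loop of this kind might look like the sketch below. The function chat_loop, the quit keywords, and the injected fetch, read_line, and write callables are all assumptions made so the example runs without network access or a terminal; in real use, fetch would be something like get_latest_posts and read_line would be the built-in input.

```python
from typing import Callable, List


def chat_loop(fetch: Callable[[str], List[str]],
              read_line: Callable[[str], str] = input,
              write: Callable[[str], None] = print) -> None:
    """Prompt for subreddit names until the user quits (illustrative only)."""
    while True:
        name = read_line("Subreddit (or 'quit'): ").strip()
        # Assumption: an empty line, 'quit', or 'exit' ends the session.
        if name.lower() in ("", "quit", "exit"):
            break
        for title in fetch(name):
            write(f"- {title}")


# Scripted demo: a fake fetcher and canned input stand in for Reddit and the user.
scripted = iter(["n8n", "quit"])
chat_loop(
    fetch=lambda name: [f"example post from r/{name}"],
    read_line=lambda prompt: next(scripted),
)
# prints: - example post from r/n8n
```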
Note: The scripts use a custom User-Agent to avoid immediate 429 (Too Many Requests) errors from Reddit, but heavy usage might still trigger rate limits.
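One common way to cope with rate limits is to retry with exponential backoff when a 429 comes back. The sketch below is an assumption about how that could be layered on, not a description of what the scripts do. It uses the standard library's urllib instead of requests so that it runs without extra dependencies, and the names USER_AGENT, http_get, and get_with_backoff, as well as the User-Agent string itself, are placeholders.

```python
import time
import urllib.error
import urllib.request
from typing import Callable, Tuple

# Placeholder value; the skill's real User-Agent string is not shown in this document.
USER_AGENT = "reddit-scraper-skill/0.1 (illustrative placeholder)"


def http_get(url: str) -> Tuple[int, str]:
    """GET a URL with a custom User-Agent and return (status_code, body)."""
    request = urllib.request.Request(url, headers={"User-Agent": USER_AGENT})
    try:
        with urllib.request.urlopen(request, timeout=10) as response:
            return response.status, response.read().decode("utf-8")
    except urllib.error.HTTPError as err:
        # urllib raises on 4xx/5xx; return the status so the caller can decide.
        return err.code, ""


def get_with_backoff(url: str,
                     send: Callable[[str], Tuple[int, str]] = http_get,
                     max_attempts: int = 4,
                     base_delay: float = 1.0,
                     sleep: Callable[[float], None] = time.sleep) -> Tuple[int, str]:
    """Retry on HTTP 429, doubling the wait between attempts."""
    status, body = 0, ""
    for attempt in range(max_attempts):
        status, body = send(url)
        if status != 429:
            break
        if attempt < max_attempts - 1:
            sleep(base_delay * 2 ** attempt)  # 1s, 2s, 4s, ...
    return status, body
```

A call such as get_with_backoff("https://www.reddit.com/r/n8n/new.json?limit=3") would then wait 1, 2, and 4 seconds between attempts before giving up and returning the last 429.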