---
name: enact/firecrawl
version: 1.2.0
description: Scrape, crawl, search, and extract structured data from websites using Firecrawl API - converts web pages to LLM-ready markdown
enact: "2.0"
from: python:3.12-slim
build:
- pip install requests
env:
FIRECRAWL_API_KEY:
description: Your Firecrawl API key from firecrawl.dev
secret: true
command: python /workspace/firecrawl.py ${action} ${url} ${formats} ${limit} ${only_main_content} ${prompt} ${schema}
timeout: 300s
license: MIT
tags:
- web-scraping
- crawling
- markdown
- llm
- ai
- data-extraction
- search
- structured-data
annotations:
readOnlyHint: true
openWorldHint: true
inputSchema:
type: object
properties:
action:
type: string
description: |
The action to perform:
- scrape: Extract content from a single URL
- crawl: Discover and scrape all subpages of a website
- map: Get all URLs from a website (fast discovery)
- search: Search the web and get scraped results
- extract: Extract structured data using AI
enum:
- scrape
- crawl
- map
- search
- extract
default: scrape
url:
type: string
description: The URL to process (for scrape, crawl, map, extract) or search query (for search action)
formats:
type: string
description: Comma-separated output formats (markdown, html, links, screenshot). Used by scrape and crawl actions.
default: markdown
limit:
type: integer
description: Maximum number of pages to crawl (crawl action) or search results to return (search action)
default: 10
only_main_content:
type: boolean
description: Extract only the main content, excluding headers, navs, footers (scrape action)
default: true
prompt:
type: string
description: |
Multi-purpose field:
- For map: Search query to filter URLs
- For extract: Natural language instruction for what to extract
default: ""
schema:
type: string
description: JSON schema string for structured extraction (extract action only). Define the shape of data you want to extract.
default: ""
required:
- url
outputSchema:
type: object
properties:
success:
type: boolean
description: Whether the operation succeeded
action:
type: string
description: The action that was performed
url:
type: string
description: The URL or query that was processed
data:
type: object
description: The scraped/crawled/extracted data including markdown, metadata, and structured content
error:
type: string
description: Error message if the operation failed
examples:
- input:
url: "https://example.com"
action: "scrape"
description: Scrape a single page and get markdown
- input:
url: "https://docs.example.com"
action: "crawl"
limit: 5
description: Crawl a documentation site (up to 5 pages)
- input:
url: "https://example.com"
action: "map"
description: Get all URLs from a website
- input:
url: "latest AI news"
action: "search"
limit: 5
description: Search the web and get scraped results
- input:
url: "https://news.ycombinator.com"
action: "extract"
prompt: "Extract the top 5 news headlines with their URLs and point counts"
description: Extract structured data from a page using AI
---
Firecrawl Web Scraping Tool
A web scraping tool that uses the Firecrawl API to convert websites into clean, LLM-ready markdown and to extract structured data.
Features
- Scrape: Extract content from a single URL as markdown, HTML, or with screenshots
- Crawl: Automatically discover and scrape all accessible subpages of a website
- Map: Get a list of all URLs from a website without scraping content (extremely fast)
- Search: Search the web and get full scraped content from results
- Extract: Use AI to extract structured data from pages with natural language prompts
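The inputSchema descriptions say which optional inputs each action reads (for example, formats applies to scrape and crawl, schema to extract only). The sketch below restates that mapping in Python and applies the documented defaults. The names `ACTION_PARAMS`, `DEFAULTS`, and `normalize_inputs` are hypothetical and are not taken from the real firecrawl.py, which is not shown here.

```python
# Optional inputs read by each action, as described in the inputSchema.
# This table is an illustration, not the tool's actual implementation.
ACTION_PARAMS = {
    "scrape": {"formats", "only_main_content"},
    "crawl": {"formats", "limit"},
    "map": {"prompt"},
    "search": {"limit"},
    "extract": {"prompt", "schema"},
}

# Defaults copied from the inputSchema.
DEFAULTS = {
    "formats": "markdown",
    "limit": 10,
    "only_main_content": True,
    "prompt": "",
    "schema": "",
}


def normalize_inputs(url, action="scrape", **options):
    """Validate inputs and keep only the options the chosen action uses."""
    if action not in ACTION_PARAMS:
        raise ValueError(f"unknown action: {action!r}")
    if not url:
        raise ValueError("url is required (a URL, or a query for search)")
    unknown = set(options) - set(DEFAULTS)
    if unknown:
        raise ValueError(f"unknown options: {sorted(unknown)}")

    merged = {**DEFAULTS, **options}
    relevant = {key: merged[key] for key in sorted(ACTION_PARAMS[action])}

    # formats is a comma-separated string in the schema; split it into a list.
    if "formats" in relevant:
        relevant["formats"] = [
            part.strip() for part in relevant["formats"].split(",") if part.strip()
        ]
    return {"action": action, "url": url, **relevant}
```

Calling `normalize_inputs("latest AI news", action="search", limit=5)` keeps only the limit, since the other options are not described as applying to search.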
Setup
- Get an API key from firecrawl.dev
- Set your API key as a secret:
enact env set FIRECRAWL_API_KEY <your-api-key> --secret --namespace enact
This stores your API key securely in your OS keyring (macOS Keychain, Windows Credential Manager, or Linux Secret Service).
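At run time the key reaches the tool through the `FIRECRAWL_API_KEY` environment variable declared in the manifest. A minimal sketch of how a script might read it and fail early when it is missing is shown below. The function name `auth_headers` is hypothetical, and the Bearer authorization scheme is an assumption about the Firecrawl API rather than something this document states.

```python
import os


def auth_headers(environ=os.environ):
    """Build request headers from FIRECRAWL_API_KEY, or fail with a clear error."""
    key = environ.get("FIRECRAWL_API_KEY", "").strip()
    if not key:
        raise RuntimeError(
            "FIRECRAWL_API_KEY is not set; store it with 'enact env set' first"
        )
    # Assumption: the API expects a Bearer token in the Authorization header.
    return {
        "Authorization": f"Bearer {key}",
        "Content-Type": "application/json",
    }
```

Passing the environment as a parameter keeps the function easy to test without touching the real process environment.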
Usage Examples
Scrape a single page
enact run enact/firecrawl --url "https://example.com" --action scrape
Crawl an entire documentation site
enact run enact/firecrawl --url "https://docs.example.com" --action crawl --limit 20
Map all URLs on a website
enact run enact/firecrawl --url "https://example.com" --action map
Search the web
enact run enact/firecrawl --url "latest AI developments 2024" --action search --limit 5
Extract structured data with AI
enact run enact/firecrawl --url "https://news.ycombinator.com" --action extract --prompt "Extract the top 10 news headlines with their URLs"
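When these commands are launched from a script instead of a shell, building the argument list programmatically avoids quoting mistakes in queries and prompts that contain spaces. The helper below is a hypothetical sketch that uses only the flags shown in the examples above (`--url`, `--action`, `--limit`, `--prompt`, `--schema`); it makes no claim about other flags the enact CLI may accept.

```python
import shlex


def build_enact_command(url, action="scrape", limit=None, prompt=None, schema=None):
    """Return the argv list for an 'enact run enact/firecrawl' invocation."""
    argv = ["enact", "run", "enact/firecrawl", "--url", url, "--action", action]
    # Append only the optional flags that were actually provided.
    for flag, value in (("--limit", limit), ("--prompt", prompt), ("--schema", schema)):
        if value is not None:
            argv += [flag, str(value)]
    return argv


argv = build_enact_command("latest AI developments 2024", action="search", limit=5)
# shlex.join renders the list as a shell-safe string for logging or copy-paste.
print(shlex.join(argv))
```

The list form can be passed directly to `subprocess.run(argv, capture_output=True, text=True)`, which bypasses the shell and so needs no manual quoting.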
Extract with a JSON schema
enact run enact/firecrawl \
--url "https://example.com/pricing" \
--action extract \
--prompt "Extract pricing information" \
--schema '{"type":"object","properties":{"plans":{"type":"array","items":{"type":"object","properties":{"name":{"type":"string"},"price":{"type":"string"}}}}}}'
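Hand-writing a nested JSON schema on one line is error-prone, because a single missing brace invalidates the whole string. One option is to build the schema as a Python dictionary and serialize it, as sketched below for the pricing schema above. The variable names are illustrative only.

```python
import json

# Shape of one pricing plan: a name and a price, both strings.
plan = {
    "type": "object",
    "properties": {
        "name": {"type": "string"},
        "price": {"type": "string"},
    },
}

# Top-level schema: an object holding an array of plans.
schema = {
    "type": "object",
    "properties": {
        "plans": {"type": "array", "items": plan},
    },
}

# Compact separators produce a single-line string with no spaces,
# suitable for passing as the --schema value.
schema_arg = json.dumps(schema, separators=(",", ":"))
print(schema_arg)
```

Because the string comes from `json.dumps`, it is guaranteed to be valid JSON, though whether it is a useful schema for a given page still depends on the page content.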
Output
The tool returns a JSON object with success, action, url, data, and (on failure) error fields. Depending on the action, data includes:
- markdown: Clean, LLM-ready content
- metadata: Title, description, language, source URL
- extract: Structured data (for extract action)
- links: Discovered URLs (for map action)
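A caller would typically check the success flag from the outputSchema before reading anything else. The sketch below parses a result and summarizes it. It assumes that markdown, metadata, and links sit directly under data and that metadata carries a title key. Those key locations are assumptions and may differ per action, and `summarize_result` is a hypothetical helper.

```python
import json


def summarize_result(raw):
    """Parse the tool's JSON output and return a short summary dictionary."""
    result = json.loads(raw)
    if not result.get("success"):
        # The outputSchema defines an error message for failed operations.
        raise RuntimeError(result.get("error") or "operation failed without a message")

    data = result.get("data") or {}
    metadata = data.get("metadata") or {}
    return {
        "action": result.get("action"),
        "title": metadata.get("title"),  # assumed key name
        "markdown_chars": len(data.get("markdown") or ""),
        "link_count": len(data.get("links") or []),
    }
```

Using `.get` with fallbacks means an action that omits a field, such as map returning no markdown, yields zero counts instead of raising a `KeyError`.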
API Features
Firecrawl handles the hard parts of web scraping:
- Anti-bot mechanisms
- Dynamic JavaScript content
- Proxies and rate limiting
- PDF and document parsing
- Screenshot capture