Fal Add Model
Create inference.sh apps that wrap fal.ai model endpoints.
IMPORTANT: Follow these steps IN ORDER. Do not skip steps.
Quick Start (Recommended)
Use the scaffold script to automate Steps 0-4:
cd /home/ok/inference/grid/api/falai
# For new apps (generates in current dir, move files after init)
./fal-scaffold.sh fal-ai/MODEL_NAME
# For existing apps (generates directly in app folder)
./fal-scaffold.sh fal-ai/MODEL_NAME app-folder
This will:
- Search for related endpoints (consolidation candidates)
- Fetch the OpenAPI schema
- Fetch pricing
- Generate MODEL.md and PRICING.md in the specified directory
Then continue from Step 5.
Checklist
- [ ] Step 0: Search for related endpoints (consolidation check)
- [ ] Step 1: Fetch model OpenAPI schema
- [ ] Step 2: Write MODEL.md with schema details
- [ ] Step 3: Fetch pricing data
- [ ] Step 4: Write PRICING.md with CEL expressions
- [ ] Step 5: Run `infsh app init`
- [ ] Step 6: Implement inference.py using MODEL.md as reference
- [ ] Step 7: Test with `infsh app test`
- [ ] Step 8: Update IMPLEMENTED_MODELS.md
- [ ] Step 9: Deploy with `infsh deploy`
Step 0: Search for Related Endpoints
STOP. Do this FIRST to check for consolidation opportunities.
Extract the model family name and search:
# Example: for "fal-ai/dia-tts", search "dia"
curl -s "https://api.fal.ai/v1/models?q=MODEL_FAMILY&limit=20"
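For scripting this step, a minimal Python sketch of the same search URL follows. The helper names `family_from_endpoint` and `search_url` are hypothetical, and the family extraction is only a heuristic (first hyphen-separated token after the owner prefix); check its guess by eye before searching.

```python
from urllib.parse import urlencode

FAL_MODELS_URL = "https://api.fal.ai/v1/models"


def family_from_endpoint(endpoint_id: str) -> str:
    """Heuristic guess at the family name; not an official fal.ai rule."""
    # "fal-ai/dia-tts" -> "dia-tts" -> "dia"
    name = endpoint_id.split("/", 1)[-1]
    return name.split("/")[0].split("-")[0]


def search_url(family: str, limit: int = 20) -> str:
    """Build the same URL as the curl command above."""
    return f"{FAL_MODELS_URL}?{urlencode({'q': family, 'limit': limit})}"


print(search_url(family_from_endpoint("fal-ai/dia-tts")))
# https://api.fal.ai/v1/models?q=dia&limit=20
```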
Look for related endpoints that should be consolidated:
- `model/text-to-video` + `model/image-to-video` → single app
- `model-tts` + `model-voice-clone` → single app with mode selection
- `model/pro` + `model/turbo` → single app with quality enum
Decision: If related endpoints exist, decide NOW whether to:
- Build a single consolidated app (preferred)
- Build separate apps (if schemas are very different)
Step 1: Fetch Model OpenAPI Schema
curl -s "https://api.fal.ai/v1/models?endpoint_id=fal-ai/MODEL_NAME&expand=openapi-3.0"
For multiple related endpoints (to merge):
curl -s "https://api.fal.ai/v1/models?endpoint_id=fal-ai/model/text-to-video&endpoint_id=fal-ai/model/image-to-video&expand=openapi-3.0"
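When merging several endpoints, the repeated `endpoint_id` parameter can be built programmatically. The sketch below is one way to do it with the standard library; `schema_url` is a hypothetical helper name, and the URL shape is taken from the curl commands above.

```python
from urllib.parse import urlencode


def schema_url(endpoint_ids, expand="openapi-3.0"):
    """Build the models URL with one endpoint_id parameter per endpoint."""
    params = [("endpoint_id", endpoint_id) for endpoint_id in endpoint_ids]
    params.append(("expand", expand))
    # safe="/" keeps the slashes in endpoint ids readable, as in the curl form
    return "https://api.fal.ai/v1/models?" + urlencode(params, safe="/")


url = schema_url([
    "fal-ai/model/text-to-video",
    "fal-ai/model/image-to-video",
])
print(url)
```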
Save the response. You need:
- `metadata.description` - Model description
- `metadata.category` - Category (text-to-image, image-to-video, etc.)
- `openapi.components.schemas.*Input` - Input schema
- `openapi.components.schemas.*Output` - Output schema
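The four items above can be pulled out of a saved response with a few dictionary lookups. This is a sketch under assumptions: `model` is taken to be a single model entry carrying the `metadata` and `openapi` keys listed above (the outer envelope of the response is not shown here, so unwrap it first), and the schema names in the sample are made up for illustration.

```python
def pick_schemas(model: dict) -> dict:
    """Collect the description, category, and *Input/*Output schemas."""
    schemas = model.get("openapi", {}).get("components", {}).get("schemas", {})

    def first_with_suffix(suffix: str):
        # Return the first schema whose name ends with the suffix, if any
        for name, schema in schemas.items():
            if name.endswith(suffix):
                return schema
        return None

    metadata = model.get("metadata", {})
    return {
        "description": metadata.get("description"),
        "category": metadata.get("category"),
        "input": first_with_suffix("Input"),
        "output": first_with_suffix("Output"),
    }


# Hypothetical sample entry, shaped after the field paths listed above
sample = {
    "metadata": {"description": "Example TTS model", "category": "text-to-speech"},
    "openapi": {"components": {"schemas": {
        "ExampleInput": {"required": ["text"], "properties": {"text": {"type": "string"}}},
        "ExampleOutput": {"properties": {"audio": {"type": "object"}}},
    }}},
}
picked = pick_schemas(sample)
```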
Step 2: Write MODEL.md
STOP. Do not proceed to implementation until MODEL.md exists.
Create MODEL.md in a scratchpad or the app directory with this structure:
# Model: fal-ai/model-name
## Endpoint
`fal-ai/model-name`
## Category
[from metadata.category]
## Description
[from metadata.description]
## Input Schema
### Required Fields
- `field_name` (type): Description [from schema]
### Optional Fields
- `field_name` (type, default: X): Description [from schema]
## Output Schema
- `output.field` (type): Description
## Notes
- [Any special handling needed]
- [Constraints or limitations]
For multiple endpoints being merged, create separate files:
- `MODEL_TEXT_TO_VIDEO.md`
- `MODEL_IMAGE_TO_VIDEO.md`
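The Required Fields and Optional Fields sections of MODEL.md can be drafted from the input schema rather than typed by hand. The sketch below assumes the schema is a standard JSON Schema object with `properties` and `required` keys; `field_lines` is a hypothetical helper, and its output still needs a manual review against the model's documentation.

```python
def field_lines(schema: dict) -> tuple[list[str], list[str]]:
    """Render MODEL.md bullet lines for required and optional fields."""
    required = set(schema.get("required", []))
    required_lines, optional_lines = [], []
    for name, prop in schema.get("properties", {}).items():
        field_type = prop.get("type", "any")
        description = prop.get("description", "")
        if name in required:
            required_lines.append(f"- `{name}` ({field_type}): {description}")
        else:
            default = prop.get("default")
            optional_lines.append(
                f"- `{name}` ({field_type}, default: {default}): {description}"
            )
    return required_lines, optional_lines


# Hypothetical schema for illustration only
example_schema = {
    "required": ["prompt"],
    "properties": {
        "prompt": {"type": "string", "description": "Text prompt"},
        "seed": {"type": "integer", "default": 0, "description": "Random seed"},
    },
}
required_lines, optional_lines = field_lines(example_schema)
```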
Step 3: Fetch Pricing
curl -s -H "Authorization: Key $FAL_KEY" \
  "https://api.fal.ai/v1/models/pricing?endpoint_id=fal-ai/MODEL_NAME"
Response format:
{
  "prices": [{
    "endpoint_id": "fal-ai/model-name",
    "unit_price": 0.025,
    "unit": "image",
    "currency": "USD"
  }]
}
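The `unit_price` in this response is in dollars, while PRICING.md (Step 4) records price variables in microcents, using the multiplier 100000000 given in its template. A minimal sketch of that conversion follows; the helper names are hypothetical, and `Decimal` is used only to avoid float rounding surprises.

```python
from decimal import Decimal

# Multiplier from the PRICING.md template: dollars * 100000000 = microcents
MICROCENTS_PER_DOLLAR = 100_000_000


def to_microcents(unit_price) -> int:
    """Convert a dollar price to integer microcents without float drift."""
    return int(Decimal(str(unit_price)) * MICROCENTS_PER_DOLLAR)


def price_variables(response: dict) -> dict:
    """Map each price entry to a per_[unit] variable, as PRICING.md names them."""
    return {
        f"per_{price['unit']}": to_microcents(price["unit_price"])
        for price in response["prices"]
    }


pricing_response = {
    "prices": [{
        "endpoint_id": "fal-ai/model-name",
        "unit_price": 0.025,
        "unit": "image",
        "currency": "USD",
    }]
}
print(price_variables(pricing_response))  # {'per_image': 2500000}
```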
Step 4: Write PRICING.md
STOP. Do not proceed until PRICING.md exists.
Use the fal-pricing skill for CEL expression help. Create PRICING.md:
# Pricing: app-name
## fal.ai Base Price
- Endpoint: `fal-ai/model-name`
- Price: $X.XX per [unit]
- Currency: USD
## Price Variables (microcents)
- `per_[unit]`: [value] (calculation: $X.XX * 100000000)
## CEL Expressions
### inference_expression
```cel
[expression from fal-pricing skill patterns]
```
### pricing_description
"$X.XX per [unit]"
## Calculation Notes
[How fal price maps to our pricing model]

Step 5: Initialize App
Now you can create the app structure:
```bash
cd /home/ok/inference/grid/api/falai
infsh app init my-app-name
cd my-app-name
cp ../fal_helper.py .
```
Move MODEL.md and PRICING.md into the app directory for reference.
Step 6: Implement
With MODEL.md and PRICING.md as your reference:
6a. Update inf.yml
namespace: falai
name: my-app-name
description: [from MODEL.md description]
category: [video|image|audio|other]
kernel: python-3.11
resources:
  gpu:
    count: 0
    vram: 0
    type: none
  ram: 4000000000
env: {}
secrets:
  - key: FAL_KEY
    description: fal.ai API key for model access
    optional: false
6b. Update requirements.txt
pydantic>=2.0.0
inferencesh
fal-client>=0.4.0
requests>=2.28.0
6c. Update `__init__.py`
from .inference import App, AppInput, AppOutput
__all__ = ["App", "AppInput", "AppOutput"]
6d. Write inference.py
Use references/template.md as the code template. Map fields from MODEL.md to your AppInput/AppOutput classes.
Key patterns:
- Use `Field(description="...")` from MODEL.md descriptions
- Use `Enum` classes for constrained values
- Use `Optional[X] = Field(default=None, ...)` for optional fields
- Use `.uri` for file URLs sent to fal.ai (not `.path`)
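One consequence of these patterns is that unset optional fields and `Enum` members have to be turned into plain values before they go into a request. The sketch below illustrates that idea with the standard library only: a dataclass stands in for the pydantic `AppInput`, and every name in it (`AspectRatio`, `SketchInput`, `to_arguments`) is hypothetical. It is not the template from references/template.md.

```python
from dataclasses import asdict, dataclass
from enum import Enum
from typing import Optional


class AspectRatio(str, Enum):  # hypothetical constrained field
    SQUARE = "1:1"
    WIDE = "16:9"


@dataclass
class SketchInput:  # stand-in for a pydantic AppInput
    prompt: str
    aspect_ratio: AspectRatio = AspectRatio.SQUARE
    seed: Optional[int] = None


def to_arguments(data: SketchInput) -> dict:
    """Drop unset optional fields and send Enum members as their values."""
    arguments = {}
    for key, value in asdict(data).items():
        if value is None:
            continue  # omit optional fields the caller did not set
        arguments[key] = value.value if isinstance(value, Enum) else value
    return arguments


arguments = to_arguments(SketchInput(prompt="a cat", aspect_ratio=AspectRatio.WIDE))
print(arguments)  # {'prompt': 'a cat', 'aspect_ratio': '16:9'}
```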
Step 7: Test
cd my-app-name
# Generate example input file
infsh app test --save-example
# Edit input.json with test values, then run:
export FAL_KEY=$(cat ../.fal.key)
infsh app test --input-file input.json
Step 8: Update Tracking
Add an entry to ../../IMPLEMENTED_MODELS.md:
| my-app-name | fal-ai/model-name | category |
Update the "Last Updated" date.
Step 9: Deploy
infsh deploy
Consolidation Patterns
When fal.ai has multiple related endpoints:
| Pattern | Our Approach |
|---|---|
| text-to-X + image-to-X | Single app, detect image input |
| Multiple languages | Single app with language enum |
| Quality variants (pro/turbo) | Single app with quality enum |
Consolidation code pattern:
def _get_model_id(self, input_data: AppInput) -> str:
    if input_data.image:
        return "fal-ai/model/image-to-video"
    return "fal-ai/model/text-to-video"
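The quality-variant row of the table can follow the same shape, with an enum selecting the endpoint instead of an input check. A minimal sketch, assuming the placeholder endpoint ids `fal-ai/model/pro` and `fal-ai/model/turbo`; the `Quality` enum and `endpoint_for` helper are hypothetical names.

```python
from enum import Enum


class Quality(str, Enum):  # hypothetical quality enum exposed on AppInput
    PRO = "pro"
    TURBO = "turbo"


# Placeholder endpoint ids; replace with the real ids found in Step 0
ENDPOINTS = {
    Quality.PRO: "fal-ai/model/pro",
    Quality.TURBO: "fal-ai/model/turbo",
}


def endpoint_for(quality) -> str:
    """Resolve a Quality member (or its string value) to an endpoint id."""
    return ENDPOINTS[Quality(quality)]


print(endpoint_for(Quality.PRO))  # fal-ai/model/pro
```

Keeping the mapping in one dictionary means adding a new variant touches only the enum and the table, not the request code.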
Resources
- Code template: `references/template.md`
- Pricing help: Use `fal-pricing` skill
- Model search: Use `fal-model-search` skill
- fal_helper.py: `../../fal_helper.py`