Modal Rules and Guidelines for LLMs
This file provides rules and guidelines for LLMs when implementing Modal code.
General
- •Modal is a serverless cloud platform for running Python code with minimal configuration
- •Designed for AI/ML workloads but supports general-purpose cloud compute
- •Serverless billing model - you only pay for resources used
Modal documentation
- •Extensive documentation is available at: modal.com/docs (and in markdown format at modal.com/llms-full.txt)
- •A large collection of examples is available at: modal.com/docs/examples (and github.com/modal-labs/modal-examples)
- •Reference documentation is available at: modal.com/docs/reference
Always refer to documentation and examples for up-to-date functionality and exact syntax.
Core Modal concepts
App
- •A group of functions, classes and sandboxes that are deployed together.
Function
- •
The basic unit of serverless execution on Modal.
- •
Each Function executes in its own container, and you can configure different Images for different Functions within the same App:
pythonimage = ( modal.Image.debian_slim(python_version="3.12") .pip_install("torch", "transformers") .apt_install("ffmpeg") .run_commands("mkdir -p /models") ) @app.function(image=image) def square(x: int) -> int: return x * x - •
You can configure individual hardware requirements (CPU, memory, GPUs, etc.) for each Function.
python@app.function( gpu="H100", memory=4096, cpu=2, ) def inference(): ...
Some examples specificly for GPUs:
python@app.function(gpu="A10G") # Single GPU, e.g. T4, A10G, A100, H100, or "any" @app.function(gpu="A100:2") # Multiple GPUs, e.g. 2x A100 GPUs @app.function(gpu=["H100", "A100", "any"]) # GPU with fallbacks
- •
Functions can be invoked in a number of ways. Some of the most common are:
- •
foo.remote()- Run the Function in a separate container in the cloud. This is by far the most common. - •
foo.local()- Run the Function in the same context as the caller. Note: This does not necessarily mean locally on your machine. - •
foo.map()- Parallel map over a set of inputs. - •
foo.spawn()- Calls the function with the given arguments, without waiting for the results. Terminating the App will also terminate spawned functions.
- •
- •
Web endpoint: You can turn any Function into an HTTP web endpoint served by adding a decorator:
python@app.function() @modal.fastapi_endpoint() def fastapi_endpoint(): return {"status": "ok"} @app.function() @modal.asgi_app() def asgi_app(): app = FastAPI() ... return app - •
You can run Functions on a schedule using e.g.
@app.function(schedule=modal.Period(minutes=5))or@app.function(schedule=modal.Cron("0 9 * * *")).
Classes (a.k.a. Cls)
- •
For stateful operations with startup/shutdown lifecycle hooks. Example:
python@app.cls(gpu="A100") class ModelServer: @modal.enter() def load_model(self): # Runs once when container starts self.model = load_model() @modal.method() def predict(self, text: str) -> str: return self.model.generate(text) @modal.exit() def cleanup(self): # Runs when container stops cleanup()
Other important concepts
- •Image: Represents a container image that Functions can run in.
- •Sandbox: Allows defining containers at runtime and securely running arbitrary code inside them.
- •Volume: Provide a high-performance distributed file system for your Modal applications.
- •Secret: Enables securely providing credentials and other sensitive information to your Modal Functions.
- •Dict: Distributed key/value store, managed by Modal.
- •Queue: Distributed, FIFO queue, managed by Modal.
Differences from standard Python development
- •Modal always executes code in the cloud, even while you are developing. You can use Environments for separating development and production deployments.
- •Dependencies: It's common and encouraged to have different dependency requirements for different Functions within the same App. Consider defining dependencies in Image definitions (see Image docs) that are attached to Functions, rather than in global
requirements.txt/pyproject.tomlfiles, and puttingimportstatements inside the Functiondef. Any code in the global scope needs to be executable in all environments where that App source will be used (locally, and any of the Images the App uses).
Modal coding style
- •Modal Apps, Volumes, and Secrets should be named using kebab-case.
- •Always use
import modal, and qualified names likemodal.App(),modal.Image.debian_slim(). - •Modal evolves quickly, and prints helpful deprecation warnings when you
modal runan App that uses deprecated features. When writing new code, never use deprecated features.
Common commands
Running modal --help gives you a list of all available commands. All commands also support --help for more details.
Running your Modal app during development
- •
modal run path/to/your/app.py- Run your app on Modal. - •
modal run -m module.path.to.app- Run your app on Modal, using the Python module path. - •
modal serve modal_server.py- Run web endpoint(s) associated with a Modal app, and hot-reload code on changes. Will print a URL to the web endpoint(s). Note: you need to useCtrl+Cto interruptmodal serve.
Deploying your Modal app
- •
modal deploy path/to/your/app.py- Deploy your app (Functions, web endpoints, etc.) to Modal. - •
modal deploy -m module.path.to.app- Deploy your app to Modal, using the Python module path.
Logs:
- •
modal app logs <app_name>- Stream logs for a deployed app. Note: you need to useCtrl+Cto interrupt the stream.
Resource management
- •There are CLI commands for interacting with resources like
modal app list,modal volume list, and similarly forsecret,dict,queue, etc. - •These also support other command than
list- use e.g.modal app --helpfor more.
Testing and debugging
- •When using
app.deploy(), you can wrap it in awith modal.enable_output():block to get more output.