Brain Feature Implementation
Description
This skill automates the process of converting a WorldQuant Brain idea document (Markdown) into actionable Alpha expressions. It handles dataset downloading and code generation for each distinct idea pattern.
Scope of Work
- •This skill operates exclusively by manipulating local CSV files using the provided Python scripts.
- •Do NOT use any WorldQuant Brain MCP tools (e.g., brain-api).
- •Do NOT write custom Python scripts (e.g., python -c ... or new .py files) to check data or generate expressions. You MUST use the scripts/implement_idea.py tool.
- •Do not attempt to submit alphas or run simulations on the platform. Focus only on generating the expression files locally.
Instructions
- •Analyze the Idea Document
  - •Read the provided markdown file.
  - •Extract the following metadata:
    - •Dataset ID (e.g., analyst15)
    - •Region (e.g., GLB)
    - •Delay (e.g., 1 or 0)
  - •If any metadata is missing, ask the user to clarify.
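To make the "extract, then ask if missing" logic concrete, here is a minimal sketch in Python. It is for illustration only and is not something to run as part of the skill. The label patterns (`Dataset ID:`, `Region:`, `Delay:`) and the names `PATTERNS` and `extract_metadata` are assumptions; real idea documents may phrase these fields differently.

```python
import re

# Hypothetical label patterns; an actual idea document may use other wording.
PATTERNS = {
    "dataset_id": re.compile(r"dataset\s*id\s*[:=]\s*`?([A-Za-z0-9_]+)`?", re.IGNORECASE),
    "region": re.compile(r"region\s*[:=]\s*`?([A-Za-z]+)`?", re.IGNORECASE),
    "delay": re.compile(r"delay\s*[:=]\s*`?([01])`?", re.IGNORECASE),
}


def extract_metadata(markdown_text):
    """Return (found, missing): found maps field name to value, missing lists absent fields."""
    found, missing = {}, []
    for name, pattern in PATTERNS.items():
        match = pattern.search(markdown_text)
        if match:
            found[name] = match.group(1)
        else:
            missing.append(name)
    return found, missing


sample = """# Idea
Dataset ID: analyst15
Region: GLB
"""
found, missing = extract_metadata(sample)
# 'delay' is absent from the sample, so it ends up in `missing`;
# that is the case where the user should be asked to clarify.
```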
- •Download Dataset
  - •Execute the fetch script using the extracted parameters.
  - •Locate Scripts:
    - •Check your current working directory (ls -R or Get-ChildItem -Recurse).
    - •Find the path to fetch_dataset.py. It is likely in brain-feature-implementation/scripts or scripts.
  - •Run Command:
    - •Change directory to the folder containing the script before running it.
    - •Command (bash):
      cd <PATH_TO_SCRIPTS_FOLDER> && python fetch_dataset.py --datasetid <ID> --region <REGION> --delay <DELAY>
  - •Wait for the download to complete. The script will create a folder in ../data/.
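As a small illustration of how the extracted metadata maps onto the fetch command, the sketch below assembles the argument list and prints it as a shell-safe string. The flags are the ones shown in the command above. The helper name `fetch_command` is hypothetical, and the snippet only builds the command line; it does not run anything.

```python
import shlex


def fetch_command(dataset_id, region, delay):
    """Build the fetch_dataset.py argument list from the extracted metadata."""
    return [
        "python", "fetch_dataset.py",
        "--datasetid", str(dataset_id),
        "--region", str(region),
        "--delay", str(delay),
    ]


# shlex.join quotes each argument so the string is safe to paste into a shell.
command_line = shlex.join(fetch_command("analyst15", "GLB", 1))
print(command_line)
```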
- •Plan Implementation
  - •Scan the markdown file for Feature Definitions or Formulas.
  - •Look for patterns like "Definition: <formula>" or code blocks describing math.
  - •Use the manage_todo_list tool to create a plan with one entry for each unique idea/formula found.
    - •Title: The Idea Name or ID (e.g., "3.1.1 Estimate Stability Score").
    - •Description: The specific template formula (e.g., template: "{st_dev} / abs({mean})").
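The planning step boils down to "one entry per unique formula, titled by the idea it came from". The sketch below illustrates that under some assumptions: ideas appear under markdown headings, each formula sits on a line starting with `Definition:`, and the formula is already written in template form. The function `plan_entries` and the dictionary keys are hypothetical and only mirror the Title/Description fields described above. It is an explanatory aid, not a script to run within the skill.

```python
import re

HEADING = re.compile(r"^#+\s*(.+?)\s*$")           # e.g. "### 3.1.1 Estimate Stability Score"
DEFINITION = re.compile(r"Definition:\s*(.+?)\s*$")  # e.g. "Definition: {st_dev} / abs({mean})"


def plan_entries(markdown_text):
    """Return one plan entry per unique formula, in document order."""
    entries, seen, title = [], set(), None
    for line in markdown_text.splitlines():
        heading = HEADING.match(line)
        if heading:
            title = heading.group(1)  # remember the most recent idea name
            continue
        definition = DEFINITION.search(line)
        if definition:
            formula = definition.group(1)
            if formula not in seen:  # skip formulas that were already planned
                seen.add(formula)
                entries.append({"title": title, "description": f'template: "{formula}"'})
    return entries


sample = """### 3.1.1 Estimate Stability Score
Definition: {st_dev} / abs({mean})
### 3.1.2 Same Formula Again
Definition: {st_dev} / abs({mean})
"""
entries = plan_entries(sample)
```

The repeated formula under the second heading produces no extra entry, matching the "one entry for each unique idea/formula" rule.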
- •Execute Implementation
  - •For each item in the Todo List:
    - •Construct the Template:
      - •Use Python format string syntax {variable}.
      - •The {variable} must match the suffix of the fields in the dataset (e.g., mean, st_dev, gro).
      - •CRITICAL: Do NOT include the full prefix or horizon in the template. The script auto-detects these.
      - •Correct Example: For anl15_gr_12_m_gro / anl15_gr_12_m_pe, use template: {gro} / {pe}.
      - •Incorrect Example: {anl15_gr_12_m_gro} / {pe} (Includes prefix).
      - •Incorrect Example: ${gro} / ${pe} (Shell syntax).
    - •Determine Dataset Folder: {ID}_{REGION}_delay{DELAY} (e.g., analyst10_GLB_delay1).
    - •Run Script:
      - •Navigate to the folder containing implement_idea.py (as identified in step 2).
      - •Command (bash):
        cd <PATH_TO_SCRIPTS_FOLDER> && python implement_idea.py --template "<TEMPLATE_STRING>" --dataset "<DATASET_FOLDER_NAME>"
      - •Note: The script ONLY accepts --template and --dataset. Do not pass any other arguments like --filters or --groupby.
      - •Strict Rule: Do NOT use python -c or create temporary scripts to verify or process results. Trust the output of implement_idea.py.
    - •Verify the output (number of expressions generated).
    - •Mark the Todo item as completed.
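To show why the template should carry only field suffixes, here is a minimal sketch of suffix-based expansion. It is not the code of implement_idea.py, whose internals are not shown here, and it is not meant to be run during the skill. The helpers `template_variables`, `expand_template`, and `dataset_folder` are hypothetical, and the field list is invented to follow the `anl15_gr_12_m_gro` naming from the example above.

```python
import string


def template_variables(template):
    """Names inside {...} placeholders, in order of first appearance."""
    names = []
    for _, name, _, _ in string.Formatter().parse(template):
        if name and name not in names:
            names.append(name)
    return names


def expand_template(template, field_ids):
    """Fill the template once for every prefix that has a field for each variable."""
    variables = template_variables(template)
    groups = {}  # prefix -> {variable: full field id}
    for field in field_ids:
        for var in variables:
            tail = "_" + var
            if field.endswith(tail):
                groups.setdefault(field[: -len(tail)], {})[var] = field
    return [
        template.format(**mapping)
        for _, mapping in sorted(groups.items())
        if len(mapping) == len(variables)  # skip prefixes missing a variable
    ]


def dataset_folder(dataset_id, region, delay):
    """Folder name following the {ID}_{REGION}_delay{DELAY} pattern."""
    return f"{dataset_id}_{region}_delay{delay}"


fields = [
    "anl15_gr_12_m_gro", "anl15_gr_12_m_pe",
    "anl15_gr_18_m_gro", "anl15_gr_18_m_pe",
    "anl15_gr_24_m_gro",  # no matching _pe field, so this prefix is skipped
]
good = expand_template("{gro} / {pe}", fields)
bad = expand_template("{anl15_gr_12_m_gro} / {pe}", fields)
folder = dataset_folder("analyst10", "GLB", 1)
```

In this sketch the suffix-only template yields one expression per complete prefix, while the template that embeds a full field name matches nothing, because no field ends with that whole name as a suffix.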
- •Finalize Output
  - •After all Todo items are completed, merge all generated expressions into a single file.
  - •Run Merge Script:
    - •Navigate to the folder containing scripts.
    - •Command (bash):
      cd <PATH_TO_SCRIPTS_FOLDER> && python merge_expression_list.py --dataset "<DATASET_FOLDER_NAME>"
    - •This will create final_expressions.json in the dataset directory.
  - •Report the total number of unique expressions and the path to the final file to the user.
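The merge step can be pictured as an order-preserving de-duplication across the per-idea outputs. The sketch below is an assumption about the general shape of that operation, not the contents of merge_expression_list.py. In particular, the per-idea file names (`idea_1.json`, `idea_2.json`) and the "JSON list of strings" layout are hypothetical. It is for explanation only.

```python
import json
import tempfile
from pathlib import Path


def merge_expressions(dataset_dir, output_name="final_expressions.json"):
    """Merge every JSON list in dataset_dir into one de-duplicated list."""
    dataset_dir = Path(dataset_dir)
    output_path = dataset_dir / output_name
    merged = {}  # dict keys keep insertion order and drop duplicates
    for path in sorted(dataset_dir.glob("*.json")):
        if path == output_path:
            continue  # do not re-read a previous merge result
        for expression in json.loads(path.read_text(encoding="utf-8")):
            merged.setdefault(expression, None)
    unique = list(merged)
    output_path.write_text(json.dumps(unique, indent=2), encoding="utf-8")
    return output_path, len(unique)


# Demonstration in a throwaway directory with two hypothetical per-idea files.
with tempfile.TemporaryDirectory() as tmp:
    folder = Path(tmp) / "analyst15_GLB_delay1"
    folder.mkdir()
    (folder / "idea_1.json").write_text(json.dumps(["a / b", "c / d"]), encoding="utf-8")
    (folder / "idea_2.json").write_text(json.dumps(["c / d", "e / f"]), encoding="utf-8")
    final_path, total = merge_expressions(folder)
    final_name = final_path.name
    merged_list = json.loads(final_path.read_text(encoding="utf-8"))
```

The returned count and path correspond to the two items reported to the user at the end of this step.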
Script Dependencies
This skill relies on the following scripts in its scripts/ directory:
- •fetch_dataset.py: Downloads data from Brain API.
- •implement_idea.py: Generates alpha expressions from templates.
- •merge_expression_list.py: Merges the generated expressions into final_expressions.json.
- •ace_lib.py & helpful_functions.py: Support libraries.