ModuleScoreCalculator Process Configuration
Purpose: Calculate module/pathway/gene signature scores per cell using Seurat's AddModuleScore or CellCycleScoring functions.
When to Use
- To score cells for specific gene programs (exhaustion, cytotoxicity, proliferation)
- For pathway activity analysis using curated gene sets
- To quantify functional states (activation, differentiation, memory)
- To add diffusion map components for trajectory analysis
- For cell cycle scoring to identify S and G2M phase cells
Configuration Structure
Process Enablement
[ModuleScoreCalculator]
cache = true
Input Specification
[ModuleScoreCalculator.in]
# Input: Seurat object from SeuratClustering
srtobj = ["SeuratClustering"]
Environment Variables
[ModuleScoreCalculator.envs]
# Default parameters inherited by all modules
defaults = { nbin = 24, ctrl = 100, seed = 8525, agg = "mean" }
# Module definitions (key = module name, value = gene set parameters)
modules = {}
# Post-scoring metadata transformations
post_mutaters = {}
External References
Seurat AddModuleScore Parameters
| Parameter | Type | Default | Description |
|---|---|---|---|
| features | string/list | Required | Gene names or cc.genes/cc.genes.updated.2019 for cell cycle |
| nbin | int | 24 | Number of bins for aggregate expression levels of all analyzed features |
| ctrl | int | 100 | Number of control features selected from same bin per analyzed feature |
| k | boolean | false | Use feature clusters from DoKMeans instead of random selection |
| assay | string | NULL | The assay to use (defaults to active assay) |
| seed | int | 8525 | Random seed for reproducibility |
| search | boolean | false | Search for symbol synonyms if features don't match |
| keep | boolean | false | Keep individual feature scores (non-cell cycle only) |
| agg | string | "mean" | Aggregation function: mean, median, sum, max, min, var, sd |
Reference: https://satijalab.org/seurat/reference/addmodulescore
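To make the roles of nbin, ctrl, and seed concrete, the following is a minimal pure-Python sketch of the idea behind module scoring: genes are ranked by average expression and split into nbin bins, up to ctrl control genes are drawn from the bin of each analyzed feature, and the score per cell is the mean of the features minus the mean of the controls. This is a simplified illustration under stated assumptions, not Seurat's implementation; the function name module_score and the dict-of-lists input are hypothetical, and Seurat's binning and sampling differ in detail.

```python
import random
from statistics import mean


def module_score(expr, features, nbin=24, ctrl=100, seed=8525):
    """Sketch of a module score.

    expr: dict mapping gene name -> list of per-cell expression values
          (hypothetical input layout chosen for this example).
    Returns one score per cell.
    """
    rng = random.Random(seed)  # fixed seed gives reproducible control sets
    n_cells = len(next(iter(expr.values())))

    # Rank genes by average expression and assign each to one of nbin bins.
    ranked = sorted(expr, key=lambda g: mean(expr[g]))
    bin_of = {g: i * nbin // len(ranked) for i, g in enumerate(ranked)}

    # Features absent from the matrix are silently dropped in this sketch.
    present = [g for g in features if g in expr]
    if not present:
        raise ValueError("none of the requested features are in the matrix")

    # For each feature, draw up to ctrl control genes from the same bin.
    controls = set()
    for g in present:
        pool = [c for c in ranked if bin_of[c] == bin_of[g]]
        controls.update(rng.sample(pool, min(ctrl, len(pool))))
    controls = sorted(controls)

    # Score = mean feature expression minus mean control expression.
    return [
        mean(expr[g][i] for g in present) - mean(expr[g][i] for g in controls)
        for i in range(n_cells)
    ]
```

Because the controls are expression-matched, a score near zero means the gene set is expressed about as much as comparable background genes, and scores can be negative.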
CellCycleScoring Parameters
When using features = "cc.genes" or "cc.genes.updated.2019", adds:
- S.Score - S phase score per cell
- G2M.Score - G2M phase score per cell
- Phase - Cell cycle phase assignment (G1, S, G2M)
Reference: https://satijalab.org/seurat/reference/cellcyclescoring
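The Phase column is derived from the two scores. The sketch below shows the assignment rule as it is commonly described for CellCycleScoring: a cell with both scores below zero is called G1, otherwise the phase with the higher score wins, and an exact tie is left undecided. The function name assign_phase is hypothetical and the rule is stated here as an assumption about Seurat's behavior, so check the linked reference before relying on it.

```python
def assign_phase(s_score, g2m_score):
    """Assign a cell cycle phase from S and G2M scores (illustrative sketch)."""
    # Neither program is enriched over background: treat as non-cycling.
    if s_score < 0 and g2m_score < 0:
        return "G1"
    # Exact ties cannot be resolved to a single phase.
    if s_score == g2m_score:
        return "Undecided"
    return "S" if s_score > g2m_score else "G2M"
```

This also explains why a dataset of mostly resting cells can end up almost entirely G1: any cell whose two scores are both slightly negative falls into the first branch.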
Diffusion Map Parameters
{"DC": {"features": 2, "kind": "diffmap"}}
Adds first N diffusion components as metadata columns (DC_1, DC_2, ...).
Reference: https://www.rdocumentation.org/packages/destiny/versions/2.0.4/topics/DiffusionMap
Configuration Examples
Minimal Configuration
[ModuleScoreCalculator]
[ModuleScoreCalculator.in]
srtobj = ["SeuratClustering"]
Cell Cycle Scoring
[ModuleScoreCalculator.envs.modules]
CellCycle = { features = "cc.genes.updated.2019" }
Output columns: S.Score, G2M.Score, Phase
Exhaustion Score (T Cells)
[ModuleScoreCalculator.envs.modules.Exhaustion]
features = "HAVCR2,ENTPD1,LAYN,LAG3,TIGIT,PDCD1,TOX"
Cytotoxicity Score (CD8+ T Cells, NK Cells)
[ModuleScoreCalculator.envs.modules.Cytotoxicity]
features = "GZMB,PRF1,NKG7,GNLY,CTSW"
Proliferation Score
[ModuleScoreCalculator.envs.modules.Proliferation]
features = "MKI67,STMN1,TUBB,PCNA,TOP2A"
Activation Score
[ModuleScoreCalculator.envs.modules.Activation]
# IL2RA is the gene symbol for CD25; "CD25" itself will not match a feature name
features = "IFNG,TNF,CD69,IL2RA"
Multiple Gene Sets (Functional States)
[ModuleScoreCalculator.envs.modules]
[ModuleScoreCalculator.envs.modules.CellCycle]
features = "cc.genes.updated.2019"
[ModuleScoreCalculator.envs.modules.Exhaustion]
features = "HAVCR2,ENTPD1,LAYN,LAG3,TIGIT,PDCD1"
[ModuleScoreCalculator.envs.modules.Activation]
features = "IFNG,TNF,CD69,IL2RA"
[ModuleScoreCalculator.envs.modules.Proliferation]
features = "MKI67,STMN1,TUBB,PCNA"
Diffusion Map Components
[ModuleScoreCalculator.envs.modules]
DC = { features = 2, kind = "diffmap" }
Use with: envs.dimplots in SeuratClusterStats with reduction = "DC"
Post-Metadata Transformation
[ModuleScoreCalculator.envs.post_mutaters]
# Calculate combined exhaustion-activation ratio
Exh_Act_Ratio = "Exhaustion1 / Activation1"
# Classify high vs low exhaustion
Exhaustion_Level = "ifelse(Exhaustion1 > median(Exhaustion1, na.rm = TRUE), 'High', 'Low')"
Common Patterns
Pattern 1: T Cell Functional States
[ModuleScoreCalculator.envs.modules]
# Inline tables are kept on one line: TOML 1.0 does not allow newlines inside { }
# Exhaustion markers (checkpoint genes)
Exhaustion = { features = "HAVCR2,ENTPD1,LAYN,LAG3,TIGIT,PDCD1,TOX,CTLA4" }
# Activation markers (IL2RA is the gene symbol for CD25)
Activation = { features = "IFNG,TNF,CD69,IL2RA" }
# Memory markers
Memory = { features = "IL7R,CCR7,SELL,S100A4" }
# Terminal differentiation
Terminal_Diff = { features = "TIGIT,PDCD1,CD274,CD244,CD160" }
Pattern 2: NK Cell Functional States
[ModuleScoreCalculator.envs.modules]
# Inline tables are kept on one line: TOML 1.0 does not allow newlines inside { }
# Cytotoxicity
Cytotoxicity = { features = "GZMB,PRF1,NKG7,GNLY,CTSW" }
# Activation
NK_Activation = { features = "NCAM1,KLRD1,FCGR3A" }
# Exhaustion
NK_Exhaustion = { features = "HAVCR2,LAG3,PDCD1,TIGIT" }
Pattern 3: Cell Cycle with Custom Parameters
[ModuleScoreCalculator.envs.defaults]
nbin = 24
ctrl = 100
seed = 8525
[ModuleScoreCalculator.envs.modules]
CellCycle = { features = "cc.genes.updated.2019" }
Pattern 4: Metabolic Pathway Scores
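[ModuleScoreCalculator.envs.modules]
# Inline tables are kept on one line: TOML 1.0 does not allow newlines inside { }
# Glycolysis (Warburg effect)
Glycolysis = { features = "HK2,PKM,LDHA,PFKL,ENO1" }
# Oxidative phosphorylation
# Human mitochondrial genes normally carry the MT- prefix (MT-ND1, MT-CO1);
# ATP5F1A was formerly ATP5A1, so use whichever symbol your reference annotation contains
OXPHOS = { features = "MT-ND1,MT-ND2,MT-ND3,MT-CO1,MT-CO2,ATP5F1A" }
# Fatty acid oxidation
FAO = { features = "CPT1A,ACOX1,HADHA" }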
Pattern 5: B Cell Functionality
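[ModuleScoreCalculator.envs.modules]
# Inline tables are kept on one line: TOML 1.0 does not allow newlines inside { }
# Plasma cell differentiation
Plasma = { features = "MZB1,SSR4,SDC1,XBP1,PRDM1" }
# Germinal center
Germinal_Center = { features = "BCL6,AICDA,MEF2B" }
# Naive vs memory
Naive = { features = "IL7R,CCR7,IGHD" }
# "IGG1" is not a gene symbol and was dropped; IGHG1 encodes the IgG1 heavy chain
Memory = { features = "CD27,IGHG1" }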
Gene Set Resources
MSigDB (Molecular Signatures Database)
- URL: https://www.gsea-msigdb.org/
- Hallmark Collection: 50 curated gene sets for biological processes
  - HALLMARK_INTERFERON_GAMMA_RESPONSE
  - HALLMARK_TNFA_SIGNALING_VIA_NFKB
  - HALLMARK_INFLAMMATORY_RESPONSE
  - HALLMARK_HYPOXIA
  - HALLMARK_APOPTOSIS
- Immunologic Signatures (C7): Gene sets from immunology studies
- Download: Available in GMT format; join each set's genes into a comma-separated string before using it as features
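A GMT file stores one gene set per line as tab-separated fields: the set name, a description, then the gene symbols. The sketch below converts such text into the comma-separated features strings this process expects and renders them as a modules table. The function names gmt_to_modules and to_toml are hypothetical helpers written for this example, not part of immunopipe, and the gene names in any test data are placeholders.

```python
def gmt_to_modules(gmt_text, wanted=None):
    """Parse GMT text into {set_name: "GENE1,GENE2,..."}.

    wanted: optional collection of set names to keep; None keeps all sets.
    """
    modules = {}
    for line in gmt_text.splitlines():
        fields = line.split("\t")
        if len(fields) < 3:
            continue  # skip blank or malformed lines
        name, _description, *genes = fields
        if wanted is not None and name not in wanted:
            continue
        # Drop empty fields and duplicate symbols while preserving order.
        unique = dict.fromkeys(g for g in genes if g)
        modules[name] = ",".join(unique)
    return modules


def to_toml(modules):
    """Render the parsed sets as single-line inline tables."""
    lines = ["[ModuleScoreCalculator.envs.modules]"]
    for name, features in modules.items():
        lines.append(f'{name} = {{ features = "{features}" }}')
    return "\n".join(lines)
```

Set names such as HALLMARK_HYPOXIA contain only letters, digits, and underscores, so they are valid bare TOML keys; a name with other characters would need quoting.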
CellMarker Database
- URL: http://bioinfo.life.hust.edu.cn/CellMarker/
- Cell type-specific markers for human and mouse
Literature-Derived Signatures
T Cell Exhaustion Markers:
- Primary: HAVCR2 (TIM-3), PDCD1 (PD-1), LAG3, TIGIT, CTLA4
- Transcription factors: TOX, NR4A1, EOMES
T Cell Activation Markers:
- Cytokines: IFNG, TNF, IL2
- Surface: CD69, IL2RA (CD25), CD38
Cytotoxicity Markers:
- Granzymes: GZMB, GZMA, GZMH
- Perforin: PRF1
- Other cytotoxic granule genes: NKG7, GNLY, CTSW
Proliferation Markers:
- Ki-67: MKI67
- Microtubule dynamics: STMN1, TUBB
- DNA replication: PCNA, TOP2A
Cell Cycle Genes (Seurat built-in):
- cc.genes - Original Tirosh et al. 2016 gene set
- cc.genes.updated.2019 - Updated with 2019 gene symbols
Dependencies
Upstream Processes
- Required: SeuratClustering - Provides the Seurat object
- Optional: TOrBCellSelection - If working with T/B cell subsets
Downstream Processes
- SeuratClusterStats - Visualize module scores across clusters
- CellCellCommunication - Correlate scores with cell interactions
- ScFGSEA - Validate module activity with enrichment analysis
Validation Rules
Gene Set Format Validation
- Comma-separated strings: "GENE1,GENE2,GENE3" ✓
- Cell cycle keywords: "cc.genes" or "cc.genes.updated.2019" ✓
- Diffusion map: {"features": N, "kind": "diffmap"} ✓
Gene Name Matching
- Human genes: Uppercase (MKI67, IFNG) ✓
- Mouse genes: Title case (Mki67, Ifng) ✓
- Search mode: Set search = true to automatically find synonyms
- Keep mode: Set keep = true to retain the individual per-feature scores (non-cell cycle modules only)
Parameter Constraints
- nbin: Typically 10-50 (default 24)
- ctrl: Typically 10-500 (default 100)
- Minimum genes: ≥5 genes recommended for robust scoring
- Maximum genes: No hard limit, but performance may degrade >1000 genes
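The format rules and parameter ranges above can be checked before a pipeline run. The following is a minimal sketch of such a pre-flight check; check_module is a hypothetical helper (immunopipe does not provide it), and the thresholds are the typical ranges listed in this section rather than hard limits enforced by Seurat.

```python
CELL_CYCLE_KEYS = ("cc.genes", "cc.genes.updated.2019")


def check_module(name, spec, defaults=None):
    """Return a list of warning strings for one module definition."""
    # Module values override the shared defaults, as in envs.defaults.
    merged = {"nbin": 24, "ctrl": 100, **(defaults or {}), **spec}
    features = merged.get("features")
    warnings = []

    # Diffusion map modules take a component count instead of gene names.
    if merged.get("kind") == "diffmap":
        ok = isinstance(features, int) and not isinstance(features, bool)
        if not ok or features < 1:
            warnings.append(f"{name}: diffmap 'features' should be a positive integer")
        return warnings

    if not isinstance(features, str) or not features.strip():
        return [f"{name}: 'features' must be a non-empty string"]

    # Cell cycle keywords are not gene lists, so skip the gene-count check.
    if features not in CELL_CYCLE_KEYS:
        genes = [g.strip() for g in features.split(",") if g.strip()]
        if len(genes) < 5:
            warnings.append(f"{name}: only {len(genes)} genes; 5 or more recommended")
        if len(set(genes)) != len(genes):
            warnings.append(f"{name}: duplicate gene symbols in 'features'")

    if not 10 <= merged["nbin"] <= 50:
        warnings.append(f"{name}: nbin={merged['nbin']} is outside the typical 10-50 range")
    if not 10 <= merged["ctrl"] <= 500:
        warnings.append(f"{name}: ctrl={merged['ctrl']} is outside the typical 10-500 range")
    return warnings
```

Warnings rather than errors are returned because small gene sets and unusual bin counts are discouraged, not invalid.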
Troubleshooting
Issue: Genes Not Found in Object
Symptom: Warning "XX% of features not found in object"
Solutions:
- Check gene name format (uppercase for human, title case for mouse)
- Enable search = true to find symbol synonyms
- Verify gene symbols match your Seurat object's row names
- Combine search = true with keep = true: the retained per-feature scores show which genes were actually scored
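A quick way to find the cause of this warning is to compare the features string against the object's row names outside the pipeline. The sketch below reports exact matches, misses, and misses that differ only in letter case (the usual sign of a human/mouse mix-up). check_genes is a hypothetical helper, and rownames is assumed to be a list of feature names exported from the Seurat object; it does not reproduce the synonym lookup that search = true performs.

```python
def check_genes(features, rownames):
    """Split a comma-separated gene list into found and missing symbols.

    Returns (found, missing, suggestions), where suggestions maps a missing
    symbol to a row name that matches when letter case is ignored.
    """
    genes = [g.strip() for g in features.split(",") if g.strip()]
    available = set(rownames)

    # Index row names by their uppercase form for case-insensitive lookup.
    by_upper = {}
    for row in rownames:
        by_upper.setdefault(row.upper(), row)

    found = [g for g in genes if g in available]
    missing = [g for g in genes if g not in available]
    suggestions = {g: by_upper[g.upper()] for g in missing if g.upper() in by_upper}
    return found, missing, suggestions
```

A symbol that is missing with no suggestion, such as a protein name used in place of a gene symbol, has to be corrected by hand.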
Issue: Too Few Genes in Set
Symptom: Module score is NA or unreliable
Solutions:
- Ensure ≥5 genes in gene set for robust scoring
- Add alternative markers to expand gene set
- Check if genes are expressed in your dataset
- Use keep = true to retain per-feature scores and see which genes contributed
Issue: Cell Cycle Score All G1
Symptom: Most cells classified as G1 phase
Solutions:
- Check if cells are truly non-proliferating (e.g., memory T cells)
- Verify data quality (low UMI counts may obscure cell cycle)
- Consider using cc.genes instead of cc.genes.updated.2019
- Check S.Score and G2M.Score values directly
Issue: Module Scores All Similar
Symptom: No variation in scores across cells
Solutions:
- Genes may be uniformly expressed or not detected
- Try adjusting nbin and ctrl parameters
- Verify assay selection (assay = "RNA" vs "SCT")
- Check if cells express the expected markers
Issue: Diffusion Map Components Not Added
Symptom: DC_1, DC_2 columns missing
Solutions:
- Ensure the R packages destiny and SingleCellExperiment are both installed
- Use correct format: {"DC": {"features": 2, "kind": "diffmap"}}
Best Practices
Gene Set Selection
- Use literature-validated signatures when possible
- Combine complementary markers (e.g., exhaustion: HAVCR2 + PDCD1 + LAG3)
- Consider species-specific marker expression patterns
- Test gene sets on a subset before full pipeline run
Parameter Tuning
- nbin = 24: Default works well for most datasets
- ctrl = 100: Increase if many genes have similar expression levels
- seed = 8525: Keep fixed for reproducibility across runs
- agg = "mean": Use median for outlier-resistant aggregation
Visualization
- Use SeuratClusterStats.envs.dimplots to visualize scores
- Add to SeuratClusterStats.envs.violins for distribution plots
- Correlate scores with clusters or annotations
- Consider post_mutaters for custom score transformations
Performance
- Module scoring is computationally cheap (<5 min for typical datasets)
- Larger gene sets (>1000 genes) may take longer
- Diffusion map computation scales with cell number (O(n²))
Integration Example
# Complete workflow with multiple scores
[ModuleScoreCalculator.envs.defaults]
nbin = 24
ctrl = 100
seed = 8525
[ModuleScoreCalculator.envs.modules]
# Cell cycle
CellCycle = { features = "cc.genes.updated.2019" }
# T cell function
Exhaustion = { features = "HAVCR2,ENTPD1,LAYN,LAG3,TIGIT,PDCD1,TOX,CTLA4" }
Activation = { features = "IFNG,TNF,CD69,IL2RA" }
Memory = { features = "IL7R,CCR7,SELL,S100A4" }
# Cytotoxicity
Cytotoxicity = { features = "GZMB,PRF1,NKG7,GNLY" }
# Metabolism
Glycolysis = { features = "HK2,PKM,LDHA,PFKL,ENO1" }
[ModuleScoreCalculator.envs.post_mutaters]
# Classify T cell states
Tcell_State = """
case_when(
Exhaustion1 > median(Exhaustion1, na.rm = TRUE) ~ 'Exhausted',
Activation1 > median(Activation1, na.rm = TRUE) ~ 'Activated',
Memory1 > median(Memory1, na.rm = TRUE) ~ 'Memory',
TRUE ~ 'Naive'
)
"""
# Combined functional score
Functionality = "(Activation1 + Cytotoxicity1) / (Exhaustion1 + 1)"
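The Tcell_State mutater relies on case_when returning the label of the first condition that holds, so the order of the conditions decides ties: a cell above the median for both exhaustion and activation is labeled Exhausted. The Python sketch below mirrors that first-match logic to make the behavior explicit. The function name classify_t_cell_states is hypothetical, plain lists stand in for the metadata columns, and missing values are not handled.

```python
from statistics import median


def classify_t_cell_states(exhaustion, activation, memory):
    """Label each cell using first-match-wins median thresholds."""
    ex_med = median(exhaustion)
    act_med = median(activation)
    mem_med = median(memory)

    states = []
    for ex, act, mem in zip(exhaustion, activation, memory):
        # Conditions are tested in order, like the branches of case_when.
        if ex > ex_med:
            states.append("Exhausted")
        elif act > act_med:
            states.append("Activated")
        elif mem > mem_med:
            states.append("Memory")
        else:
            states.append("Naive")  # fallback, like TRUE ~ 'Naive'
    return states
```

Since each threshold is a median, roughly half of the cells satisfy the first condition regardless of biology, so reorder the conditions or use absolute cutoffs if exhaustion should not take priority.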
Notes
- Process is optional: Only runs when [ModuleScoreCalculator] section exists in config
- Multiple modules: Define any number of modules in modules dictionary
- Column naming: Scores stored as ModuleName1, ModuleName2, etc.
- Cell cycle special case: Uses CellCycleScoring() which adds S.Score, G2M.Score, Phase
- Diffusion map: Special module type for trajectory analysis
- Post-processing: Use post_mutaters for custom metadata calculations
- Visualization: Scores available in SeuratClusterStats for plotting
Quick Reference
Gene Set Formats:
# Comma-separated
features = "GENE1,GENE2,GENE3"
# Cell cycle (built-in)
features = "cc.genes.updated.2019"
# Diffusion map (special)
features = 2
kind = "diffmap"
Common Gene Sets:
Exhaustion = "HAVCR2,PDCD1,LAG3,TIGIT,CTLA4,TOX"
Cytotoxicity = "GZMB,PRF1,NKG7,GNLY"
Proliferation = "MKI67,STMN1,PCNA,TOP2A"
Activation = "IFNG,TNF,CD69,IL2RA"
Memory = "IL7R,CCR7,SELL"
Process Location: /immunopipe/processes.py (line 455)
Documentation: /docs/processes/ModuleScoreCalculator.md
Function: Seurat::AddModuleScore(), Seurat::CellCycleScoring()