SeuratClusterStats Process Configuration
Purpose
Generates comprehensive cluster statistics and visualizations for Seurat objects, including dimension reduction plots, gene expression visualizations, cluster quality metrics, and clustree diagrams. This process is essential for exploring and validating clustering results.
When to Use
- After: SeuratClustering or SeuratSubClustering processes
- Use cases:
  - Cluster quality assessment and validation
  - Visualizing cluster characteristics across dimensions
  - Comparing marker gene expression between clusters
  - Assessing cluster stability via clustree plots
  - Exploring metadata relationships with clusters
- Always enabled in immunopipe TCR and non-TCR workflows (order = -1, runs early)
Configuration Structure
Process Enablement
[SeuratClusterStats]
cache = true
Input Specification
[SeuratClusterStats.in]
srtobj = ["SeuratClustering"]
Note: srtobj accepts the output name from SeuratClustering or SeuratSubClustering.
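To summarize subclustering results instead, the same input key can point at the other upstream process. A minimal sketch, assuming SeuratSubClustering is enabled in the pipeline:

```toml
[SeuratClusterStats.in]
# Read the Seurat object produced by SeuratSubClustering instead of SeuratClustering
srtobj = ["SeuratSubClustering"]
```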
Environment Variables
Global Settings
[SeuratClusterStats.envs]
# Mutate metadata before plotting
mutaters = {}
# Cache feature plots (time-consuming)
cache = "/tmp"
Clustree Plots
Visualize clustering resolution relationships.
[SeuratClusterStats.envs.clustrees_defaults]
prefix = true # Auto-detect clustering columns
devpars = {res = 100, width = 800, height = 500}
more_formats = []
save_code = false
Clustree cases:
[SeuratClusterStats.envs.clustrees."Custom Clustree"]
prefix = "seurat_clusters"
devpars = {height = 600}
Cluster Statistics (stats)
Cell count and fraction plots across clusters.
[SeuratClusterStats.envs.stats_defaults]
subset = ""
devpars = {res = 100, height = 600, width = 800}
descr = ""
more_formats = []
save_code = false
save_data = false
Plot types for stats (via scplotter::CellStatPlot):
- bar - Bar chart
- circos - Circos plot (chord diagram)
- pie - Single pie chart
- ring/donut - Ring/donut chart
- trend - Trend plot
- area - Area plot
- sankey/alluvial - Sankey/alluvial diagram
- heatmap - Heatmap
- radar - Radar plot
- spider - Spider plot
- violin - Violin plot
- box - Box plot
Default cases:
[SeuratClusterStats.envs.stats]
"Number of cells in each cluster (Bar Chart)" = {plot_type = "bar", x_text_angle = 90}
"Number of cells in each cluster by Sample (Bar Chart)" = {plot_type = "bar", group_by = "Sample", x_text_angle = 90}
Custom stat example:
[SeuratClusterStats.envs.stats."Cells by Diagnosis"]
plot_type = "bar"
group_by = "Diagnosis"
frac = "group" # Options: "none", "group", "ident", "cluster", "all"
x_text_angle = 90
swap = true
position = "stack"
Gene Count Visualization (ngenes)
Number of genes detected per cell.
[SeuratClusterStats.envs.ngenes_defaults]
more_formats = []
subset = ""
devpars = {res = 100, height = 800, width = 1000}
Default case:
[SeuratClusterStats.envs.ngenes]
"Number of genes detected in each cluster" = {}
Feature Visualization (features)
Gene expression and metadata column plots.
[SeuratClusterStats.envs.features_defaults]
# Feature specification (multiple formats)
features = ["CD3D", "CD4", "CD8A"] # OR
# features = "file://path/to/genes.txt" # OR
# features = 10 # Top N variant features
# Cluster ordering
order_by = "desc(mean(Expression, na.rm = TRUE))" # OR
# order_by = ["c1", "c2", "c3"] # Literal order
subset = ""
devpars = {res = 100}
descr = ""
more_formats = []
save_code = false
save_data = false
Feature plot types (via scplotter::FeatureStatPlot):
- violin - Violin plot
- box - Box plot
- bar - Bar plot
- ridge - Ridge plot
- dim - Dimension reduction plot
- cor - Correlation plot
- heatmap - Heatmap
- dot - Dot plot (heatmap shortcut)
Common feature parameters:
- plot_type - Type of visualization
- ident - Identity column (e.g., "seurat_clusters", "Diagnosis")
- group_by - Group cells by metadata column
- split_by - Split into multiple plots
- facet_by - Facet plots by metadata
- add_box - Add box plot overlay (violin/ridge)
- add_point - Add jittered points
- add_bg - Add background reference
- stack - Stack multiple features
- flip - Flip plot orientation
- comparisons - Add statistical comparisons
Dimension Reduction Plots (dimplots)
UMAP/tSNE/PCA visualizations.
[SeuratClusterStats.envs.dimplots_defaults]
# group_by and split_by are unset (null) by default.
# TOML has no null literal, so leave these keys out to keep them unset.
subset = ""
devpars = {res = 100}
reduction = "dim" # Options: "dim", "auto", "umap", "tsne", "pca"
Reduction options:
- dim - Auto-detect: UMAP → tSNE → PCA (uses sub_umap for subclusters)
- auto - Same as dim
- umap - Force UMAP
- tsne - Force tSNE
- pca - Force PCA
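As an illustration of the reduction options, the sketch below keeps the default at dim and forces a specific reduction for individual cases. The case names are made up for this example, the tsne and pca reductions are assumed to exist in the Seurat object, and case-level keys are assumed to override the defaults.

```toml
[SeuratClusterStats.envs.dimplots_defaults]
reduction = "dim"   # auto-detect unless a case says otherwise

[SeuratClusterStats.envs.dimplots."tSNE view"]   # hypothetical case name
reduction = "tsne"

[SeuratClusterStats.envs.dimplots."PCA view"]    # hypothetical case name
reduction = "pca"
```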
Common dimplot parameters:
- label - Add cluster labels
- label_size - Label font size
- label_repel - Repel overlapping labels
- add_mark - Add cluster boundaries (options: hull, ellipse, rect, circle)
- mark_alpha - Mark transparency
- mark_linetype - Mark line type
- hex - Use hexagonal binning
- hex_bins - Number of hex bins
- stat_by - Add statistics by metadata
- stat_plot_type - pie, ring, bar, line
- stat_plot_size - Size of stat plot
- facet_by - Facet by metadata
- highlight - Highlight specific cells
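A short sketch of how the label-related parameters might be combined in one case. The case name and the label_size value are illustrative assumptions, not documented defaults.

```toml
[SeuratClusterStats.envs.dimplots."Labeled clusters"]   # hypothetical case name
label = true         # draw cluster labels on the plot
label_repel = true   # push overlapping labels apart
label_size = 4       # illustrative value; adjust to the figure size
```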
Default cases:
[SeuratClusterStats.envs.dimplots]
"Dimensional reduction plot" = {label = true}
"VDJ Presence" = {group_by = "VDJ_Presence"} # Only if TCR data present
External References
Plotthis Plot Types
Dimension Reduction:
- DimPlot: UMAP/tSNE/PCA visualization
  - dims - Dimensions to plot (default: 1:2)
  - pt_size - Point size
  - alpha - Point transparency
  - label - Add cluster labels
  - highlight - Highlight cells
  - add_density - Add density layer
  - hex - Hexagonal binning
Statistical Plots:
- ViolinPlot: Distribution with density
  - add_box - Add box overlay
  - add_point - Add points
  - add_trend - Add trend line
  - flip - Horizontal orientation
- BoxPlot: Box and whisker plots
  - add_jitter - Add jittered points
  - add_violin - Add violin overlay
- BarPlot: Bar charts
  - position - "stack", "dodge", "fill"
  - x_text_angle - X-axis text rotation
  - swap - Swap x and fill aesthetics
- RidgePlot: Ridge (joy) plots
  - flip - Horizontal orientation
Heatmaps:
- Heatmap: Gene expression heatmaps
  - cell_type - "tile", "dot", "violin", "boxplot", "bar", "pie"
  - cluster_rows - Cluster rows
  - cluster_columns - Cluster columns
  - rows_split_by - Split rows by metadata
  - columns_split_by - Split columns by metadata
  - flip - Transpose heatmap
  - palette - Color palette (e.g., "viridis", "YlOrRd", "Spectral")
  - column_annotation - Add column annotations (list of column names)
  - column_annotation_type - Annotation types (simple, violin, pie, ring, bar)
  - dot_size - Function for dot size (e.g., function(x) sum(x > 0) / length(x))
  - dot_size_name - Legend name for dot size
  - add_reticle - Add grid lines
  - add_bg - Add background
Advanced Visualizations:
- CircosPlot: Chord/circos diagram
- SankeyPlot: Sankey/alluvial diagram
  - links_alpha - Link transparency
  - group_by - Node columns (list for multiple nodes)
Device Parameters
Common to all plot types:
# res: resolution in DPI; width and height: size in pixels
devpars = {res = 100, width = 800, height = 600}
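Device parameters can be set once in a *_defaults table and adjusted per case. The sketch below assumes that a case-level devpars only needs the keys it changes, as the Custom Clustree case earlier suggests; the case name is hypothetical.

```toml
[SeuratClusterStats.envs.stats_defaults]
devpars = {res = 100, width = 800, height = 600}

[SeuratClusterStats.envs.stats."Cells by Sample (wide)"]   # hypothetical case name
plot_type = "bar"
group_by = "Sample"
x_text_angle = 90
devpars = {width = 1200}   # assumed to override only the width
```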
Configuration Examples
Minimal Configuration
[SeuratClusterStats]
cache = true

[SeuratClusterStats.in]
srtobj = ["SeuratClustering"]
Standard QC Plots
[SeuratClusterStats.envs.stats."Number of cells per cluster"]
plot_type = "bar"
x_text_angle = 90

[SeuratClusterStats.envs.stats."Cells by Sample"]
plot_type = "bar"
group_by = "Sample"
x_text_angle = 90
Gene Expression Visualization
[SeuratClusterStats.envs.features_defaults]
features = ["CD3D", "CD4", "CD8A", "MS4A1", "CD14", "LYZ", "FCGR3A", "NCAM1", "KLRD1"]

[SeuratClusterStats.envs.features."T cell markers (violin)"]
plot_type = "violin"
ident = "seurat_clusters"
add_box = true

[SeuratClusterStats.envs.features."T cell markers (ridge)"]
plot_type = "ridge"
ident = "seurat_clusters"
flip = true

[SeuratClusterStats.envs.features."Marker on UMAP"]
plot_type = "dim"
feature = "CD4"
highlight = "seurat_clusters == 'c1'"
Heatmap with Annotations
[SeuratClusterStats.envs.features."Marker heatmap"]
plot_type = "heatmap"
ident = "Diagnosis"
columns_split_by = "seurat_clusters"
name = "Expression"
cell_type = "dot"
dot_size = "nanmean"
dot_size_name = "Percent Expressed"
column_annotation = ["percent.mt", "VDJ_Presence"]
# Quote keys that contain a dot so TOML does not read them as nested tables
column_annotation_type = {"percent.mt" = "violin", VDJ_Presence = "pie"}
devpars = {width = 1400, height = 900}

# Feature groups, written as a sub-table because TOML 1.0 inline tables cannot span lines
[SeuratClusterStats.envs.features."Marker heatmap".features]
"T cell markers" = ["CD3D", "CD4", "CD8A"]
"B cell markers" = ["MS4A1"]
"Monocyte markers" = ["CD14", "LYZ", "FCGR3A"]
"NK cell markers" = ["NCAM1", "KLRD1"]
Advanced Dimplot
[SeuratClusterStats.envs.dimplots."UMAP with labels"]
label = true

[SeuratClusterStats.envs.dimplots."UMAP with marks"]
add_mark = true
mark_linetype = 2

[SeuratClusterStats.envs.dimplots."UMAP by Diagnosis"]
facet_by = "Diagnosis"
highlight = true
theme = "theme_blank"

[SeuratClusterStats.envs.dimplots."UMAP with hex bins"]
hex = true
hex_bins = 50

[SeuratClusterStats.envs.dimplots."UMAP with stat"]
stat_by = "Diagnosis"
stat_plot_type = "ring"
stat_plot_size = 0.15
Common Patterns
Pattern 1: Basic UMAP Visualization
[SeuratClusterStats.envs.dimplots."Basic UMAP"]
label = true
reduction = "umap"
Pattern 2: QC Metrics per Cluster
[SeuratClusterStats.envs.ngenes."Genes per cluster"]
plot_type = "violin"
add_box = true
add_point = true

[SeuratClusterStats.envs.stats."QC stats"]
plot_type = "bar"
group_by = "percent.mt_bin"
x_text_angle = 90
Pattern 3: Custom Feature Plots
# From file
[SeuratClusterStats.envs.features_defaults]
features = "file://path/to/custom_markers.txt"

[SeuratClusterStats.envs.features."Custom markers"]
plot_type = "violin"
ident = "seurat_clusters"
comparisons = true
sig_label = "p.signif"
Pattern 4: Cluster Comparison Sankey
[SeuratClusterStats.envs.stats."Cluster flow by condition"]
plot_type = "sankey"
group_by = ["seurat_clusters", "Diagnosis"]
links_alpha = 0.6
devpars = {width = 800}
Pattern 5: Subclustering Visualization
[SeuratClusterStats.envs.dimplots."Subcluster UMAP"]
group_by = "sub_clusters"
reduction = "umap" # Uses sub_umap_<ident> automatically
label = true
Dependencies
- Upstream: SeuratClustering, SeuratSubClustering (via CombinedInput)
- Downstream: None (terminal visualization process)
- Data: Seurat object with cluster assignments and optional subclustering
Validation Rules
- Feature names: Must match gene symbols or metadata columns in the Seurat object
- Reduction names: Must exist in the Seurat object (umap, tsne, pca, or sub_umap_<ident>)
- Plot types: Must be valid plotthis plot types
- Metadata columns: Must exist in the @meta.data slot
- Device parameters: Positive integers required for width/height/res
Troubleshooting
Plot generation errors
- "Feature not found": Check that gene symbols use the expected case (human: UPPERCASE, mouse: TitleCase)
- "Reduction not found": Verify that the reduction name is listed by Reductions(srtobj)
- Empty plots: Check whether the subset expression filters out all cells
- Slow rendering: Use cache = true for feature plots, reduce hex_bins, or downsample
Visual quality issues
- Overcrowded labels: Use label_repel = true or reduce the number of clusters
- Poor color contrast: Set a custom palette parameter
- Incorrect orientation: Use flip = true to transpose the plot
- Missing annotations: Verify that the column_annotation columns exist in the metadata
Missing subcluster UMAP
- If subclustering exists but sub_umap_<ident> is not found, the process uses the standard UMAP
- To force subcluster visualization: Run RunUMAP() at the subcluster level or specify reduction = "umap"
Large dataset performance
- Enable hex = true for dimplots with >10,000 cells
- Use the downsample parameter in feature plots
- Set cache = true to avoid re-rendering expensive plots
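The first and last tips above could be combined roughly as follows. This is a sketch only: the case name is hypothetical, and hex_bins reuses the value from the hex-bin example rather than a recommended setting. The downsample parameter is left out because its accepted values are not described here.

```toml
[SeuratClusterStats]
cache = true   # reuse results instead of re-rendering expensive plots

[SeuratClusterStats.envs.dimplots."UMAP (hex, large dataset)"]   # hypothetical case name
hex = true      # bin cells into hexagons instead of drawing every point
hex_bins = 50   # fewer bins render faster at the cost of detail
```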
Output Structure
<srtobj_stem>.cluster_stats/
├── clustrees/   # Clustree plots (png + pdf)
├── stats/       # Cell count/statistics plots
├── ngenes/      # Gene count plots
├── features/    # Gene expression visualizations
└── dimplots/    # Dimension reduction plots
Each subdirectory contains plots for each configured case in the process environment.