Seurat-Local Skill
Comprehensive assistance with Seurat v5 for single-cell RNA-seq analysis, generated from official documentation.
When to Use This Skill
This skill should be triggered when:
Core Seurat Analysis:
- Working with single-cell RNA-seq data analysis in R
- Setting up Seurat objects and performing quality control
- Normalizing data and finding variable features
- Running dimensional reduction (PCA, UMAP, t-SNE)
- Performing clustering and cell type identification
Data Integration:
- Integrating multiple single-cell datasets
- Using FindIntegrationAnchors and IntegrateData workflows
- Performing batch correction across experiments
- Working with both CCA and RPCA integration methods
- Transferring cell type annotations between datasets
Advanced Analysis:
- Differential expression testing and marker gene identification
- Spatial transcriptomics analysis with 10x Visium data
- Trajectory inference and lineage analysis
- Multi-modal data integration (e.g., scRNA-seq + scATAC-seq)
- Pseudobulk analysis and aggregated expression calculations
Seurat v5 Specific Features:
- Working with the new layered assay structure
- Using SCTransform v2 for normalization
- Performing integration in low-dimensional space
- Using the streamlined IntegrateLayers workflow
- Working with on-disk matrices for large datasets
Common Use Cases:
- "How do I integrate two scRNA-seq datasets?"
- "Help me find marker genes for my clusters"
- "How do I normalize my single-cell data?"
- "What's the difference between CCA and RPCA integration?"
- "How do I analyze spatial transcriptomics data?"
Quick Reference
Essential Seurat Workflow
Basic Setup and Normalization
library(Seurat)
library(SeuratData)
# Load example dataset
InstallData("pbmc3k")
pbmc <- LoadData("pbmc3k")
# Basic preprocessing
pbmc <- NormalizeData(pbmc, verbose = FALSE)
pbmc <- FindVariableFeatures(pbmc, selection.method = "vst", nfeatures = 2000)
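Quality control is listed among this skill's use cases but does not appear in the workflow here; it normally runs on raw counts before NormalizeData. The following is a minimal sketch on hypothetical toy counts (the gene names, cell names, and thresholds are illustrative assumptions, not recommended values). The same PercentageFeatureSet and subset calls apply to a real object such as pbmc.

```r
library(Seurat)

# Hypothetical toy counts: 95 ordinary genes plus 5 mitochondrial ("MT-") genes
set.seed(1)
genes <- c(paste0("GENE", 1:95), paste0("MT-", 1:5))
counts <- matrix(
  rpois(100 * 30, lambda = 3),
  nrow = 100,
  dimnames = list(genes, paste0("cell", 1:30))
)
obj <- CreateSeuratObject(counts = counts)

# Percentage of each cell's counts that come from mitochondrial genes
obj[["percent.mt"]] <- PercentageFeatureSet(obj, pattern = "^MT-")

# Inspect the QC metrics stored in the cell-level metadata
head(obj[[c("nFeature_RNA", "nCount_RNA", "percent.mt")]])

# Keep cells passing the (placeholder) thresholds; tune these per dataset
obj <- subset(obj, subset = nFeature_RNA > 50 & percent.mt < 20)
```

Thresholds are usually chosen after inspecting the metric distributions, for example with VlnPlot(obj, features = c("nFeature_RNA", "nCount_RNA", "percent.mt")).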
Dimensional Reduction and Clustering
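# Scale the data first (RunPCA requires scaled data)
pbmc <- ScaleData(pbmc, verbose = FALSE)
# Run PCA, then build the neighbor graph and cluster on the first 10 PCs
pbmc <- RunPCA(pbmc, verbose = FALSE)
pbmc <- FindNeighbors(pbmc, dims = 1:10)
pbmc <- FindClusters(pbmc, resolution = 0.5)
# Run UMAP for visualization
pbmc <- RunUMAP(pbmc, dims = 1:10)
DimPlot(pbmc, reduction = "umap")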
Differential Expression
# Find markers for cluster 1
cluster1.markers <- FindMarkers(pbmc, ident.1 = 1, min.pct = 0.25)
head(cluster1.markers)
# Find all markers
pbmc.markers <- FindAllMarkers(pbmc, only.pos = TRUE, min.pct = 0.25)
Data Integration Workflows
Standard Integration (CCA-based)
# Load the IFN-stimulated PBMC dataset (assumed here to be ifnb from SeuratData)
InstallData("ifnb")
ifnb <- LoadData("ifnb")
# Split dataset by condition
ifnb.list <- SplitObject(ifnb, split.by = "stim")
# Normalize and find variable features
ifnb.list <- lapply(ifnb.list, function(x) {
  x <- NormalizeData(x)
  x <- FindVariableFeatures(x, selection.method = "vst", nfeatures = 2000)
})
# Find integration anchors
immune.anchors <- FindIntegrationAnchors(object.list = ifnb.list, dims = 1:20)
# Integrate data
immune.combined <- IntegrateData(anchorset = immune.anchors, dims = 1:20)
Fast Integration (RPCA-based)
# Preprocess with PCA individually
features <- SelectIntegrationFeatures(object.list = ifnb.list)
ifnb.list <- lapply(ifnb.list, function(x) {
  x <- ScaleData(x, features = features, verbose = FALSE)
  x <- RunPCA(x, features = features, verbose = FALSE)
})
# Find anchors using RPCA (faster, more conservative)
immune.anchors <- FindIntegrationAnchors(
  object.list = ifnb.list,
  anchor.features = features,
  reduction = "rpca"
)
# Integrate data using the RPCA anchors
immune.combined <- IntegrateData(anchorset = immune.anchors)
SCTransform Integration
# Normalize with SCTransform
ifnb.list <- lapply(ifnb.list, SCTransform, method = "glmGamPoi")
features <- SelectIntegrationFeatures(object.list = ifnb.list, nfeatures = 3000)
# Prepare for integration
ifnb.list <- PrepSCTIntegration(object.list = ifnb.list, anchor.features = features)
ifnb.list <- lapply(ifnb.list, RunPCA, features = features)
# Integrate with SCTransform
immune.anchors <- FindIntegrationAnchors(
  object.list = ifnb.list,
  normalization.method = "SCT",
  anchor.features = features,
  reduction = "rpca"
)
immune.combined.sct <- IntegrateData(
  anchorset = immune.anchors,
  normalization.method = "SCT"
)
Spatial Transcriptomics
Load and Process 10x Visium Data
library(Seurat)
library(SeuratData)
library(ggplot2)
# Load spatial dataset
InstallData("stxBrain")
brain <- LoadData("stxBrain", type = "anterior1")
# Normalize with SCTransform (recommended for spatial data)
brain <- SCTransform(brain, assay = "Spatial", verbose = FALSE)
# Run dimensional reduction and clustering
brain <- RunPCA(brain, assay = "SCT", verbose = FALSE)
brain <- FindNeighbors(brain, reduction = "pca", dims = 1:30)
brain <- FindClusters(brain, verbose = FALSE)
brain <- RunUMAP(brain, reduction = "pca", dims = 1:30)
Spatial Visualization
# Visualize clusters on tissue
SpatialDimPlot(brain, label = TRUE, label.size = 3)
# Visualize gene expression
SpatialFeaturePlot(brain, features = c("Hpca", "Ttr"))
# Adjust visualization parameters
SpatialFeaturePlot(brain, features = "Ttr", alpha = c(0.1, 1))
Cell Type Label Transfer
# Load single-cell reference
allen_reference <- readRDS("allen_cortex.rds")
allen_reference <- SCTransform(allen_reference, ncells = 3000, verbose = FALSE)
# `cortex` is assumed to be an SCTransform-normalized spatial query object,
# for example a subset of `brain`
# Find transfer anchors
anchors <- FindTransferAnchors(
  reference = allen_reference,
  query = cortex,
  normalization.method = "SCT"
)
# Transfer cell type labels
predictions.assay <- TransferData(
  anchorset = anchors,
  refdata = allen_reference$subclass,
  prediction.assay = TRUE
)
# Store the prediction scores as a new assay on the query
cortex[["predictions"]] <- predictions.assay
Advanced Features
Module Scoring
# Define gene sets
cd_features <- list(c('CD79B', 'CD79A', 'CD19', 'CD180', 'CD200',
                      'CD3D', 'CD2', 'CD3E', 'CD7', 'CD8A'))
# Calculate module scores
# (scores are stored in the metadata column CD_Features1)
pbmc <- AddModuleScore(
  object = pbmc,
  features = cd_features,
  name = 'CD_Features'
)
Project UMAP Coordinates
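# Project new data onto existing UMAP
# (the reference UMAP must have been computed with RunUMAP(..., return.model = TRUE))
# For Seurat objects, reduction.model is the name of the reference UMAP reduction;
# the query object is returned with the projection stored as "ref.umap"
query_umap <- ProjectUMAP(
  query = new_data,
  reference = reference_data,
  query.reduction = "pca",
  reference.reduction = "pca",
  reduction.model = "umap"
)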
Integrate Layers (Seurat v5)
# Integrate multiple layers in a single object
# (layers must already be split by batch, and PCA already run)
integrated_object <- IntegrateLayers(
  object = seurat_object,
  method = RPCAIntegration,
  orig.reduction = "pca",
  new.reduction = "integrated.rpca",
  assay = "RNA",
  features = variable_features
)
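The IntegrateLayers call above assumes an object whose assay has already been split into per-batch layers and run through PCA. The sketch below shows one way those steps might fit together end to end. It assumes the ifnb dataset from SeuratData (downloaded on first use) with its "stim" metadata column; the dims and resolution values are illustrative, not tuned.

```r
library(Seurat)
library(SeuratData)

# Assumption: the ifnb dataset from SeuratData, which has a "stim" column
InstallData("ifnb")
ifnb <- LoadData("ifnb")

# Split the RNA assay into one set of layers per condition
ifnb[["RNA"]] <- split(ifnb[["RNA"]], f = ifnb$stim)

# Standard preprocessing on the split object
ifnb <- NormalizeData(ifnb, verbose = FALSE)
ifnb <- FindVariableFeatures(ifnb, verbose = FALSE)
ifnb <- ScaleData(ifnb, verbose = FALSE)
ifnb <- RunPCA(ifnb, verbose = FALSE)

# Integrate the layers; the result is stored as a new reduction
ifnb <- IntegrateLayers(
  object = ifnb,
  method = RPCAIntegration,
  orig.reduction = "pca",
  new.reduction = "integrated.rpca",
  verbose = FALSE
)

# Rejoin the layers before downstream steps such as differential expression
ifnb[["RNA"]] <- JoinLayers(ifnb[["RNA"]])

# Cluster and embed using the integrated reduction instead of "pca"
ifnb <- FindNeighbors(ifnb, reduction = "integrated.rpca", dims = 1:30)
ifnb <- FindClusters(ifnb, resolution = 0.5, verbose = FALSE)
ifnb <- RunUMAP(ifnb, reduction = "integrated.rpca", dims = 1:30, verbose = FALSE)
```

Swapping method = RPCAIntegration for CCAIntegration (with a matching new.reduction name) selects CCA-based anchors with otherwise identical code.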
Key Concepts
Core Seurat Objects
- Seurat Object: Main data structure containing assays, metadata, and reductions
- Assay: Contains different data layers (counts, data, scale.data)
- Layers: In Seurat v5, data is stored in layers instead of slots
- DimReduc: Dimensional reduction objects (PCA, UMAP, etc.)
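To make these object components concrete, here is a small sketch that builds a toy object and reads back its layers, assay, and metadata. It assumes Seurat v5 is installed; the toy matrix, gene names, and cell names are hypothetical.

```r
library(Seurat)  # assumes Seurat v5 (layer-based assays)

# Hypothetical toy data: 50 genes x 20 cells of Poisson counts
set.seed(42)
counts <- matrix(
  rpois(50 * 20, lambda = 2),
  nrow = 50,
  dimnames = list(paste0("gene", 1:50), paste0("cell", 1:20))
)

obj <- CreateSeuratObject(counts = counts, project = "toy")
obj <- NormalizeData(obj, verbose = FALSE)

# Layer names in the default assay: raw counts plus normalized data
Layers(obj)

# Pull individual layers as matrices (features x cells)
raw  <- LayerData(obj, assay = "RNA", layer = "counts")
norm <- LayerData(obj, assay = "RNA", layer = "data")

# The assay object itself, and the cell-level metadata table
class(obj[["RNA"]])
head(obj[[]])

# Names of stored DimReduc objects (none until RunPCA / RunUMAP are run)
Reductions(obj)
```

LayerData() is the v5 accessor for what older code read with GetAssayData(object, slot = ...).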
Data Integration Methods
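- CCA (Canonical Correlation Analysis): Traditional method, suited to datasets where cell types are conserved but expression differs strongly between batches; can over-correct when many cells do not overlap across datasets
- RPCA (Reciprocal PCA): Faster, more conservative, less over-correction
- SCTransform: Regularized negative binomial normalization. It is a normalization method rather than an integration method, used in place of NormalizeData, FindVariableFeatures, and ScaleData, and combined with CCA or RPCA anchors via normalization.method = "SCT"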
Spatial Analysis
- Spots: Spatial measurement locations (55 µm diameter for 10x Visium)
- Images: Tissue histology images stored in the object
- Coordinate Systems: Mapping between spots and image coordinates
Reference Files
This skill includes comprehensive documentation in references/:
announcements.md
- Seurat v5 release notes and changes
- Backwards compatibility information
- New feature descriptions and migration guide
api.md
- Complete function reference with parameters
- Detailed examples for each function
- Performance notes and best practices
- Functions covered:
  - ProjectUMAP(): Query dataset projection
  - IntegrateLayers(): Multi-layer integration
  - FastRowScale(): Efficient matrix operations
  - AddModuleScore(): Gene set scoring
  - CellSelector(): Interactive cell selection
  - TransferData(): Cross-dataset data transfer
other.md
- Code of conduct and community guidelines
- Advanced tutorials and case studies
- Integration workflows for complex scenarios
- Comprehensive PBMC stimulation analysis example
tutorials.md
- Step-by-step guides for common workflows
- RPCA vs CCA integration comparison
- Spatial transcriptomics analysis tutorials
- SCTransform normalization workflows
- Performance optimization tips
Working with This Skill
For Beginners
- Start with the basic workflow: Load data → Normalize → Find variable features → Scale → PCA → Cluster → UMAP
- Use the tutorials.md reference for fundamental concepts
- Follow the basic examples in the Quick Reference section
- Practice with the pbmc3k dataset, which can be installed through SeuratData with InstallData("pbmc3k")
For Intermediate Users
- Explore integration methods when working with multiple datasets
- Use spatial analysis features for spatial transcriptomics data
- Leverage the API reference for advanced function parameters
- Study the PBMC stimulation tutorial for comparative analysis
For Advanced Users
- Use the tutorials reference for complex workflows
- Optimize performance with RPCA integration and SCTransform
- Implement custom analysis pipelines using the function reference
- Contribute to the community following the code of conduct
Navigation Tips
- Search for specific functions in the api.md reference
- Find complete workflows in tutorials.md
- Check announcements.md for the latest Seurat v5 features
- Use other.md for specialized use cases and community guidelines
Resources
references/
Organized documentation extracted from official sources containing:
- Detailed function explanations with all parameters
- Real code examples with proper syntax highlighting
- Performance notes and best practices
- Links to original documentation for further reading
scripts/
Add helper scripts here for common automation tasks such as:
- Batch processing of multiple datasets
- Custom visualization functions
- Quality control automation
- Report generation
assets/
Store templates, boilerplate, and example projects:
- Example Seurat objects for testing
- Custom gene sets for module scoring
- Reference datasets for integration testing
- Visualization themes and templates
Notes
- Seurat v5 is now the default on CRAN with backwards compatibility
- New assay structure uses layers instead of slots for better data organization
- SCTransform v2 includes regularization improvements and glmGamPoi support
- Integration workflows are now streamlined and more memory-efficient
- Spatial analysis is fully integrated with enhanced visualization options
Updating
To refresh this skill with updated documentation:
- Re-run the documentation scraper with the same configuration
- The skill will be rebuilt with the latest Seurat documentation
- All examples and references will be updated automatically
- Quick reference patterns will be refreshed with new best practices