
BAM-filtration

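Performs data cleaning and removal operations. This skill takes a raw BAM file and creates a new, "clean" BAM file by actively removing artifacts: mitochondrial reads, blacklisted regions, PCR duplicates, and unmapped reads. Use it to "clean," "filter," or "remove bad reads" from a dataset. This is a prerequisite step before peak calling. Do not use this skill if you only want to view statistics without modifying the file.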

SKILL.md
--- frontmatter
name: BAM-filtration
description: Performs data cleaning and removal operations. This skill takes a raw BAM and creates a new, "clean" BAM file by actively removing artifacts: mitochondrial reads, blacklisted regions, PCR duplicates, and unmapped reads. Use this skill to "clean," "filter," or "remove bad reads" from a dataset. This is a prerequisite step before peak calling. Do NOT use this skill if you only want to view statistics without modifying the file.

BAM Filtration for ChIP-seq / ATAC-seq

Overview

Main steps include:

  • Check whether a blacklist file is available in the current directory. If none is found, always ask the user whether blacklist filtering is wanted; if it is, ask the user for the path to the blacklist file.
  • Initialize the project directory and create the required directory.
  • Refer to the Inputs & Outputs section to check the inputs and build the output directory structure. All output files should be located under ${proj_dir}, the directory returned in Step 0.
  • Discover input BAMs in the current directory (or those matching a target token), and only select BAMs that are already coordinate-sorted and contain read group (RG) information.
  • Perform the filtration task with tools.
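The BAM selection rule above (coordinate-sorted, with read group information) can be checked from the SAM header alone. The following is a minimal sketch, not the skill's actual implementation: `is_eligible_header` and `select_bams` are hypothetical helper names, and the header text is assumed to have been obtained separately (for example with `samtools view -H`). It only verifies that the header declares `SO:coordinate` and at least one `@RG` line; it does not confirm that individual reads carry an RG tag.

```python
from typing import Dict, List, Optional


def is_eligible_header(header_text: str) -> bool:
    """True if the SAM header declares coordinate sorting and a read group."""
    sorted_by_coordinate = False
    has_read_group = False
    for line in header_text.splitlines():
        fields = line.split("\t")
        if fields[0] == "@HD" and "SO:coordinate" in fields[1:]:
            sorted_by_coordinate = True
        elif fields[0] == "@RG" and any(f.startswith("ID:") for f in fields[1:]):
            has_read_group = True
    return sorted_by_coordinate and has_read_group


def select_bams(headers: Dict[str, str], token: Optional[str] = None) -> List[str]:
    """Pick eligible BAMs from a mapping of file name -> header text.

    `token` (hypothetical) restricts the selection to sample names containing it.
    """
    selected = []
    for name, text in headers.items():
        if not name.endswith(".bam"):
            continue  # skip indexes and other files
        sample = name[: -len(".bam")]
        if token is not None and token not in sample:
            continue
        if is_eligible_header(text):
            selected.append(name)
    return sorted(selected)
```

BAMs that fail the check are skipped rather than re-sorted, which matches the "only select" wording of the step.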

When to use this skill

  • Use this skill to "clean," "filter," or "remove bad reads" from a dataset
  • This is a prerequisite step before peak calling.
  • Do NOT use this skill if you only want to view statistics without modifying the file.

Inputs & Outputs

Inputs

bash
${sample}.bam # BAMs that are already coordinate-sorted and contain read group (RG) information

Outputs

bash
all_bam_filtration/
  filtered_bam/
    ${sample}.filtered.bam
    ${sample}.filtered.bam.bai
  temp/
    ... # intermediate files
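
As an illustration of the layout above, the per-sample paths can be derived from the project directory. This is a sketch under the assumption that the project directory is already known; `output_paths` is a hypothetical helper, not part of the skill's tools.

```python
from pathlib import PurePosixPath
from typing import Dict


def output_paths(proj_dir: str, sample: str) -> Dict[str, str]:
    """Build the expected output locations for one sample under proj_dir."""
    root = PurePosixPath(proj_dir)
    filtered_bam = root / "filtered_bam" / f"{sample}.filtered.bam"
    return {
        "output_bam": str(filtered_bam),
        "output_index": str(filtered_bam) + ".bai",  # index sits next to the BAM
        "temp_dir": str(root / "temp"),              # intermediate files
    }
```

For example, `output_paths("/data/all_bam_filtration", "rep1")["output_bam"]` gives `/data/all_bam_filtration/filtered_bam/rep1.filtered.bam`.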

Decision Tree

Step 0: Initialize Project

Call:

  • mcp__project-init-tools__project_init

with:

  • sample: all
  • task: bam_filtration

The tool will:

  • Create the ${sample}_bam_filtration directory (all_bam_filtration when sample is all).
  • Return the full path of the ${sample}_bam_filtration directory, which is used as ${proj_dir} in the following steps.

Step 1: Filter BAM files

Call:

  • mcp__qc-tools__bam_artifacts

with:

  • bam_file: BAMs that are already coordinate-sorted and contain read group (RG) information
  • output_bam: ${proj_dir}/filtered_bam/${sample}.filtered.bam
  • temp_dir: ${proj_dir}/temp/
  • blacklist_bed: Path of the blacklist file
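
The internals of mcp__qc-tools__bam_artifacts are not described here, so the following is only a conceptual sketch of the four removal criteria named in the description (unmapped reads, PCR duplicates, mitochondrial reads, blacklisted regions). It assumes that duplicates have already been flagged in the BAM (SAM flag 0x400), that read coordinates are given as 0-based half-open intervals like BED, and that the mitochondrial contig uses one of the names in `MITO_CONTIGS`. All names in the block (`parse_bed`, `keep_read`, `MITO_CONTIGS`) are hypothetical.

```python
from typing import Dict, List, Tuple

FLAG_UNMAPPED = 0x4     # SAM flag: segment unmapped
FLAG_DUPLICATE = 0x400  # SAM flag: PCR or optical duplicate
MITO_CONTIGS = {"chrM", "MT", "chrMT", "M"}  # assumed mitochondrial contig names

Blacklist = Dict[str, List[Tuple[int, int]]]


def parse_bed(text: str) -> Blacklist:
    """Parse BED text into {chrom: [(start, end), ...]} (0-based, half-open)."""
    regions: Blacklist = {}
    for line in text.splitlines():
        if not line.strip() or line.startswith(("#", "track", "browser")):
            continue
        chrom, start, end = line.split()[:3]
        regions.setdefault(chrom, []).append((int(start), int(end)))
    return regions


def keep_read(flag: int, chrom: str, start: int, end: int, blacklist: Blacklist) -> bool:
    """Return True if a read passes all four filters."""
    if flag & FLAG_UNMAPPED or flag & FLAG_DUPLICATE:
        return False
    if chrom in MITO_CONTIGS:
        return False
    # Drop the read if it overlaps any blacklisted interval on its contig.
    for b_start, b_end in blacklist.get(chrom, ()):
        if start < b_end and b_start < end:
            return False
    return True
```

An empty blacklist (for instance when the user declines blacklist filtering) leaves only the flag and contig checks active.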