VCF manipulation with GATK

Original date: 2017-09-07
Source: https://github.com/wlz0726/wlz0726.github.io


GATK VCF operations

Selecting variants

# Select SNPs only
gatk SelectVariants \
    -V input.vcf \
    --select-type-to-include SNP \
    -O snps.vcf

# Select indels only
gatk SelectVariants \
    -V input.vcf \
    --select-type-to-include INDEL \
    -O indels.vcf

# Select a specific sample by name
gatk SelectVariants \
    -V input.vcf \
    -sn sample1 \
    -O sample1.vcf
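
After a selection it can be useful to check how many records ended up in each output. The sketch below is one way to do that without GATK. `count_records` is a hypothetical helper introduced here for illustration, not a GATK command, and it assumes the VCF is plain text or gzip/bgzip-compressed with a `.gz` suffix.

```shell
# count_records: hypothetical helper (not part of GATK).
# Prints the number of data lines, i.e. lines that do not start with '#'.
# Assumes a plain-text VCF, or a gzip/bgzip-compressed one ending in .gz.
count_records() {
    case "$1" in
        *.gz) gzip -dc "$1" | awk '!/^#/ { n++ } END { print n + 0 }' ;;
        *)    awk '!/^#/ { n++ } END { print n + 0 }' "$1" ;;
    esac
}

# Example usage, with the file names from the commands above:
#   count_records input.vcf
#   count_records snps.vcf
#   count_records indels.vcf
```

The SNP and indel counts will not necessarily add up to the input count, because SelectVariants also distinguishes other variant types such as MIXED, MNP and SYMBOLIC.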

Merging VCFs

gatk MergeVcfs \
    -I input1.vcf \
    -I input2.vcf \
    -O merged.vcf
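
With more than a handful of inputs, typing one `-I` per file gets tedious. The following is a minimal sketch of a wrapper that builds the `-I` arguments from a list of files. `merge_vcfs` and the `GATK_BIN` variable are hypothetical names introduced for this example; only `MergeVcfs`, `-I` and `-O` come from the command above.

```shell
# merge_vcfs: hypothetical wrapper around "gatk MergeVcfs".
# Usage: merge_vcfs OUTPUT INPUT1 [INPUT2 ...]
# GATK_BIN is an assumed variable naming the gatk executable (default: gatk).
merge_vcfs() {
    output="$1"
    shift
    # Turn the positional parameters "a.vcf b.vcf" into "-I a.vcf -I b.vcf":
    # each pass appends one "-I file" pair and drops the oldest parameter.
    for vcf in "$@"; do
        set -- "$@" -I "$vcf"
        shift
    done
    "${GATK_BIN:-gatk}" MergeVcfs "$@" -O "$output"
}

# Example usage:
#   merge_vcfs merged.vcf input1.vcf input2.vcf input3.vcf
```

Setting `GATK_BIN=echo` before the call prints the command instead of running it, which is a cheap way to inspect what would be executed.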

Splitting a VCF

# Split by chromosome
# Contig names must match those in the reference (e.g. "1" vs "chr1")
for chr in {1..22} X Y; do
    gatk SelectVariants \
        -V input.vcf \
        -R reference.fasta \
        -L "$chr" \
        -O "chr${chr}.vcf"
done
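
The loop above hardcodes human chromosome names. As a sketch of an alternative, the contig names can be read from the reference's `.fai` index, whose first tab-separated column is the sequence name. This assumes the index exists next to the FASTA as `reference.fasta.fai` (for example created with `samtools faidx`). `list_contigs`, `split_by_contig` and `GATK_BIN` are hypothetical names used only in this example.

```shell
# list_contigs: hypothetical helper that prints the contig names
# (first column) of a FASTA .fai index.
list_contigs() {
    cut -f1 "$1"
}

# split_by_contig: hypothetical helper that runs SelectVariants once per contig.
# Usage: split_by_contig INPUT_VCF REFERENCE_FASTA
# Assumes REFERENCE_FASTA.fai exists; GATK_BIN is an assumed variable
# naming the gatk executable (default: gatk).
split_by_contig() {
    vcf="$1"
    ref="$2"
    list_contigs "${ref}.fai" | while IFS= read -r contig; do
        # stdin is redirected so the command cannot consume the contig list
        "${GATK_BIN:-gatk}" SelectVariants \
            -V "$vcf" -R "$ref" -L "$contig" -O "${contig}.vcf" < /dev/null
    done
}

# Example usage:
#   split_by_contig input.vcf reference.fasta
```

Output files are named after the contig as it appears in the index, so a reference with many unplaced scaffolds will produce one file per scaffold; filtering the output of `list_contigs` first may be preferable in that case.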

This document was automatically archived from a GitHub blog.