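I've been racking my brain trying to do something that is probably simple for someone with more Perl experience than I have.
I have the following code: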
use strict;
use warnings;
my @lines = do {
    open my $in_fh, '<', 'input.txt' or die qq{Unable to open "input.txt" for input: $!};
    <$in_fh>;
};
chomp @lines;
my $re = join '|', @lines;
my @files = grep /^(?:$re)/, glob '*.bam';
$_ = "INPUT=$_" for @files;
foreach my $file (@files) {
    foreach my $line (@lines) {
        if ($file =~ m/$line/) {
            my $command = "picard MergeSamFiles $file OUTPUT=$line" . "-tmp-herc2.bam MERGE_SEQUENCE_DICTIONARIES=TRUE CREATE_INDEX=TRUE";
            system($command);
            my $command2 = "picard MarkDuplicates $line OUTPUT=$line-herc2.bam METRICS_FILE=tmp REMOVE_DUPLICATES=TRUE CREATE_INDEX=TRUE";
            system($command2);
            unlink "$line-tmp-herc2.bam";
            unlink "$line-tmp-herc2.bai";
            unlink "tmp";
        }
    }
}
input.txt holds the sample names, which I use to check whether each sample has files in the directory. In this example I'm using only two samples:
HG00096
HG00117
With the code above, I get commands like these:
picard MergeSamFiles INPUT=HG00096.mapped.ILLUMINA.bwa.GBR.exome.20120522.bam_herc2_data.bam OUTPUT=HG00096-tmp-herc2.bam MERGE_SEQUENCE_DICTIONARIES=TRUE CREATE_INDEX=TRUE
picard MarkDuplicates HG00096 OUTPUT=HG00096-herc2.bam METRICS_FILE=tmp REMOVE_DUPLICATES=TRUE CREATE_INDEX=TRUE
picard MergeSamFiles INPUT=HG00096.mapped.ILLUMINA.bwa.GBR.low_coverage.20101123.bam_herc2_phase1.bam OUTPUT=HG00096-tmp-herc2.bam MERGE_SEQUENCE_DICTIONARIES=TRUE CREATE_INDEX=TRUE
picard MarkDuplicates HG00096 OUTPUT=HG00096-herc2.bam METRICS_FILE=tmp REMOVE_DUPLICATES=TRUE CREATE_INDEX=TRUE
picard MergeSamFiles INPUT=HG00096.mapped.ILLUMINA.bwa.GBR.low_coverage.20120522.bam_herc2_data.bam OUTPUT=HG00096-tmp-herc2.bam MERGE_SEQUENCE_DICTIONARIES=TRUE CREATE_INDEX=TRUE
picard MarkDuplicates HG00096 OUTPUT=HG00096-herc2.bam METRICS_FILE=tmp REMOVE_DUPLICATES=TRUE CREATE_INDEX=TRUE
picard MergeSamFiles INPUT=HG00096.mapped.illumina.mosaik.GBR.exome.20110411.bam_herc2_phase1.bam OUTPUT=HG00096-tmp-herc2.bam MERGE_SEQUENCE_DICTIONARIES=TRUE CREATE_INDEX=TRUE
picard MarkDuplicates HG00096 OUTPUT=HG00096-herc2.bam METRICS_FILE=tmp REMOVE_DUPLICATES=TRUE CREATE_INDEX=TRUE
picard MergeSamFiles INPUT=HG00117.mapped.ILLUMINA.bwa.GBR.exome.20120522.bam_herc2_data.bam OUTPUT=HG00117-tmp-herc2.bam MERGE_SEQUENCE_DICTIONARIES=TRUE CREATE_INDEX=TRUE
picard MarkDuplicates HG00117 OUTPUT=HG00117-herc2.bam METRICS_FILE=tmp REMOVE_DUPLICATES=TRUE CREATE_INDEX=TRUE
picard MergeSamFiles INPUT=HG00117.mapped.ILLUMINA.bwa.GBR.low_coverage.20101123.bam_herc2_phase1.bam OUTPUT=HG00117-tmp-herc2.bam MERGE_SEQUENCE_DICTIONARIES=TRUE CREATE_INDEX=TRUE
picard MarkDuplicates HG00117 OUTPUT=HG00117-herc2.bam METRICS_FILE=tmp REMOVE_DUPLICATES=TRUE CREATE_INDEX=TRUE
picard MergeSamFiles INPUT=HG00117.mapped.ILLUMINA.bwa.GBR.low_coverage.20120522.bam_herc2_data.bam OUTPUT=HG00117-tmp-herc2.bam MERGE_SEQUENCE_DICTIONARIES=TRUE CREATE_INDEX=TRUE
picard MarkDuplicates HG00117 OUTPUT=HG00117-herc2.bam METRICS_FILE=tmp REMOVE_DUPLICATES=TRUE CREATE_INDEX=TRUE
picard MergeSamFiles INPUT=HG00117.mapped.illumina.mosaik.GBR.exome.20110411.bam_herc2_phase1.bam OUTPUT=HG00117-tmp-herc2.bam MERGE_SEQUENCE_DICTIONARIES=TRUE CREATE_INDEX=TRUE
picard MarkDuplicates HG00117 OUTPUT=HG00117-herc2.bam METRICS_FILE=tmp REMOVE_DUPLICATES=TRUE CREATE_INDEX=TRUE
What I actually want is this:
picard MergeSamFiles INPUT=HG00096.mapped.ILLUMINA.bwa.GBR.exome.20120522.bam_herc2_data.bam INPUT=HG00096.mapped.ILLUMINA.bwa.GBR.low_coverage.20101123.bam_herc2_phase1.bam INPUT=HG00096.mapped.ILLUMINA.bwa.GBR.low_coverage.20120522.bam_herc2_data.bam INPUT=HG00096.mapped.illumina.mosaik.GBR.exome.20110411.bam_herc2_phase1.bam OUTPUT=HG00096-tmp-herc2.bam MERGE_SEQUENCE_DICTIONARIES=TRUE CREATE_INDEX=TRUE
picard MarkDuplicates HG00096-tmp-herc2.bam OUTPUT=HG00096-herc2.bam METRICS_FILE=tmp REMOVE_DUPLICATES=TRUE CREATE_INDEX=TRUE
picard MergeSamFiles INPUT=HG00117.mapped.ILLUMINA.bwa.GBR.exome.20120522.bam_herc2_data.bam INPUT=HG00117.mapped.ILLUMINA.bwa.GBR.low_coverage.20101123.bam_herc2_phase1.bam INPUT=HG00117.mapped.ILLUMINA.bwa.GBR.low_coverage.20120522.bam_herc2_data.bam INPUT=HG00117.mapped.illumina.mosaik.GBR.exome.20110411.bam_herc2_phase1.bam OUTPUT=HG00117-tmp-herc2.bam MERGE_SEQUENCE_DICTIONARIES=TRUE CREATE_INDEX=TRUE
picard MarkDuplicates HG00117-tmp-herc2.bam OUTPUT=HG00117-herc2.bam METRICS_FILE=tmp REMOVE_DUPLICATES=TRUE CREATE_INDEX=TRUE
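So all the INPUT entries for a sample should go together, so that the system call for command merges the files and produces the OUTPUT that command2 then uses as its source.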
I know I'm messing up the foreach loops, but I can't figure out how to iterate over this correctly, and I'm stuck.
I hope you can help me with this.
Answer 0 (score: 0)
In the first command, add a suffix to the OUTPUT file:
my $command = "picard MergeSamFiles $file OUTPUT=$line" . "-tmp-herc2.bam MERGE_SEQUENCE_DICTIONARIES=TRUE CREATE_INDEX=TRUE";
# here ___^_______________^
Do the same for the second command:
my $command2 = "picard MarkDuplicates ${line}-tmp-herc2.bam OUTPUT=$line-herc2.bam METRICS_FILE=tmp REMOVE_DUPLICATES=TRUE CREATE_INDEX=TRUE";
# here ___^____________^
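The suffix fix alone still runs one merge per file. To get a single MergeSamFiles call per sample, as in the desired output, the INPUT entries can be collected per sample before the command is built. Below is a minimal sketch of that idea, not a tested pipeline. build_commands is a hypothetical helper name, and the pattern assumes every file name starts with the sample ID followed by a dot, as in the listing in the question.

```perl
use strict;
use warnings;

# Hypothetical helper: returns one MergeSamFiles / MarkDuplicates
# command pair per sample, with all of that sample's files merged at once.
sub build_commands {
    my ($samples, $files) = @_;
    my @commands;
    for my $sample (@$samples) {
        # Assumption: file names look like "<sample>.<rest>.bam".
        # \Q...\E quotes any regex metacharacters in the sample name.
        my @inputs = map { "INPUT=$_" } grep { /^\Q$sample\E\./ } @$files;
        next unless @inputs;    # no BAM files found for this sample

        my $tmp = "$sample-tmp-herc2.bam";
        push @commands,
            "picard MergeSamFiles @inputs OUTPUT=$tmp MERGE_SEQUENCE_DICTIONARIES=TRUE CREATE_INDEX=TRUE",
            "picard MarkDuplicates $tmp OUTPUT=$sample-herc2.bam METRICS_FILE=tmp REMOVE_DUPLICATES=TRUE CREATE_INDEX=TRUE";
    }
    return @commands;
}
```

Call it as build_commands(\@lines, [glob '*.bam']) on the plain file names, before the "INPUT=" prefix is added, and pass each returned string to system. Checking that system returned 0 before unlinking the temporary files is probably worthwhile. The MarkDuplicates argument is kept in the form shown in the question. Picard tools generally take their input as INPUT=file, so that argument may need the same prefix.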