I have a command that I need to run against many combinations of files. The command looks like this:
myscript.pl -output_directory /path/output_"$TARGET_SAMPLE"_vs_"$NORMAL_SAMPLE" -target_sample /path/$TARGET_SAMPLE.bam -normal_sample /path/$NORMAL_SAMPLE.bam
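I want to run this command for many sets of samples without changing the paths by hand each time. Right now I set the sample names manually before each run: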
export TARGET_SAMPLE="sample_1"
export NORMAL_SAMPLE="sample_2"
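How can I run this command so that TARGET_SAMPLE and NORMAL_SAMPLE are always paired correctly? For each NORMAL_SAMPLE, I need to run the script twice, once for each of two different TARGET_SAMPLE files. I think an array might work, but I don't know how to feed it into a for loop correctly.
Here are some examples of the pairings I need: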
export TARGET_SAMPLE="sample_1"
export NORMAL_SAMPLE="sample_2"
export TARGET_SAMPLE="sample_3"
export NORMAL_SAMPLE="sample_2"
export TARGET_SAMPLE="sample_4"
export NORMAL_SAMPLE="sample_5"
export TARGET_SAMPLE="sample_6"
export NORMAL_SAMPLE="sample_5"
So, for this list of combinations, the first pair should result in the following command being submitted in the shell:
myscript.pl -output_directory /path/output_sample_1_vs_sample_2 -target_sample /path/sample_1.bam -normal_sample /path/sample_2.bam
And the second:
myscript.pl -output_directory /path/output_sample_3_vs_sample_2 -target_sample /path/sample_3.bam -normal_sample /path/sample_2.bam
Thanks for your help.
Answer 0 (score: 3)
Method 1 uses a while loop to read multiple values from a here-document:
export TARGET_SAMPLE NORMAL_SAMPLE
# special characters in the values (e.g. spaces) will cause problems
# read -r keeps backslashes literal; ANYTHING_ELSE soaks up any extra fields
while read -r TARGET_SAMPLE NORMAL_SAMPLE ANYTHING_ELSE; do
    # insert sanity checks here
    myscript.pl -output_directory "/path/output_${TARGET_SAMPLE}_vs_${NORMAL_SAMPLE}" -target_sample "/path/$TARGET_SAMPLE.bam" -normal_sample "/path/$NORMAL_SAMPLE.bam"
done <<'EOD'
sample_1 sample_2
sample_3 sample_2
sample_4 sample_5
sample_6 sample_5
EOD
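The "insert sanity checks here" placeholder is left open in the answer. As one possible way to fill it, here is a minimal sketch. `check_pair` is a hypothetical helper, not part of the original answer, and the rules it enforces (non-empty names, target different from normal, readable `.bam` files in a given directory) are assumptions about what would count as a valid pair.

```shell
#!/usr/bin/env bash
# check_pair is a hypothetical helper (not from the original answer).
# Usage: check_pair TARGET NORMAL BAM_DIR
# Returns 0 if the pair looks usable, 1 otherwise.
check_pair() {
    local target=$1 normal=$2 bam_dir=$3 s

    # Both names must be present (catches blank or one-word input lines).
    [ -n "$target" ] && [ -n "$normal" ] || { echo "missing sample name" >&2; return 1; }

    # Assumption: comparing a sample against itself is a data-entry mistake.
    [ "$target" != "$normal" ] || { echo "target equals normal: $target" >&2; return 1; }

    # Both input files must exist and be readable.
    for s in "$target" "$normal"; do
        [ -r "$bam_dir/$s.bam" ] || { echo "cannot read $bam_dir/$s.bam" >&2; return 1; }
    done
    return 0
}
```

Inside the loop it could be called as `check_pair "$TARGET_SAMPLE" "$NORMAL_SAMPLE" /path || continue`, so that a bad line is reported and skipped rather than passed to myscript.pl.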
Method 1b is the same as Method 1, but reads the data from an external file:
# special characters in the values (e.g. spaces) will cause problems
cat >mydata <<'EOD'
sample_1 sample_2
sample_3 sample_2
sample_4 sample_5
sample_6 sample_5
EOD
export TARGET_SAMPLE NORMAL_SAMPLE
# $ANYTHING_ELSE is normally empty; it soaks up extra fields if a line contains more than two words
while read -r TARGET_SAMPLE NORMAL_SAMPLE ANYTHING_ELSE; do
    # insert sanity checks here
    myscript.pl -output_directory "/path/output_${TARGET_SAMPLE}_vs_${NORMAL_SAMPLE}" -target_sample "/path/$TARGET_SAMPLE.bam" -normal_sample "/path/$NORMAL_SAMPLE.bam"
done <mydata
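One detail matters when a loop like this reads from a file. If the loop is fed through a pipe (for example `cat mydata | while read ...`) rather than a redirection (`done <mydata`), bash with default settings runs the loop body in a subshell, so variables assigned inside it are gone once the loop finishes. The sketch below illustrates the difference with a counter. The temporary file and the variable names `piped` and `redirected` are made up for the demonstration.

```shell
#!/usr/bin/env bash
# Demonstration only: the file and counter names are hypothetical.
data=$(mktemp)
printf '%s\n' 'sample_1 sample_2' 'sample_3 sample_2' >"$data"

# Loop fed by a pipe: in bash (default settings) the loop runs in a subshell,
# so the increments do not reach the parent shell's variable.
piped=0
cat "$data" | while read -r t n _; do piped=$((piped + 1)); done

# Loop fed by redirection: runs in the current shell, so the counter survives.
redirected=0
while read -r t n _; do redirected=$((redirected + 1)); done <"$data"

echo "piped=$piped redirected=$redirected"
rm -f "$data"
```

This makes no difference if the loop only launches myscript.pl, but it matters as soon as the loop needs to report something afterwards, such as a count of failed pairs.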
Method 2 wraps the command in a shell function:
export TARGET_SAMPLE NORMAL_SAMPLE
# wrapper TARGET NORMAL: set the exported variables, then run the script for that pair
wrapper(){
    TARGET_SAMPLE=$1
    NORMAL_SAMPLE=$2
    # insert sanity checks here
    myscript.pl -output_directory "/path/output_${TARGET_SAMPLE}_vs_${NORMAL_SAMPLE}" -target_sample "/path/$TARGET_SAMPLE.bam" -normal_sample "/path/$NORMAL_SAMPLE.bam"
}
wrapper "sample_1" "sample_2"
wrapper "sample_3" "sample_2"
wrapper "sample_4" "sample_5"
wrapper "sample_6" "sample_5"
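Method 3 uses a for loop over multiple arrays:
Bash has indexed array variables, so a for loop is possible, but keeping the arrays in sync is error-prone, so I don't recommend it.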
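For completeness, here is a rough sketch of what Method 3 might look like with two parallel indexed arrays paired by position. It is an illustration, not a recommendation, and the sync problem mentioned above still applies. The array names `targets` and `normals` and the helper `build_cmd` are hypothetical. `build_cmd` only prints the command (a dry run) so the sketch can be tried without myscript.pl installed.

```shell
#!/usr/bin/env bash
# Sketch only: two parallel arrays, paired by index (names are hypothetical).
targets=(sample_1 sample_3 sample_4 sample_6)
normals=(sample_2 sample_2 sample_5 sample_5)

# Guard against the two arrays drifting out of sync.
if [ "${#targets[@]}" -ne "${#normals[@]}" ]; then
    echo "targets and normals differ in length" >&2
    exit 1
fi

# build_cmd TARGET NORMAL: print the command line for one pair (dry run).
build_cmd() {
    printf 'myscript.pl -output_directory /path/output_%s_vs_%s -target_sample /path/%s.bam -normal_sample /path/%s.bam\n' \
        "$1" "$2" "$1" "$2"
}

# "${!targets[@]}" expands to the list of indices of the array.
for i in "${!targets[@]}"; do
    build_cmd "${targets[$i]}" "${normals[$i]}"
done
```

The length check catches a missing entry, but not two entries swapped within one array, which is the reason the here-document in Method 1 (one pair per line) is easier to keep correct.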