How to use a for loop with multiple variables in Linux

Time: 2019-01-28 14:08:42

Tags: arrays linux bash for-loop multiple-files

I have a command that I need to run against multiple combinations of files. The command looks like this:

myscript.pl -output_directory /path/output_"$TARGET_SAMPLE"_vs_"$NORMAL_SAMPLE" -target_sample /path/$TARGET_SAMPLE.bam -normal_sample /path/$NORMAL_SAMPLE.bam

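I want to run this command for many sets of samples without having to change the paths by hand each time. Right now, I set the samples manually before each run: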

export TARGET_SAMPLE="sample_1"
export NORMAL_SAMPLE="sample_2"

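How can I run this command so that TARGET_SAMPLE and NORMAL_SAMPLE are always matched correctly? For each NORMAL_SAMPLE, I need to run the script twice, with two different TARGET_SAMPLE files. I think an array might work, but I don't know how to feed it into a for loop properly.

Here are some examples of the pairings I need: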

export TARGET_SAMPLE="sample_1"
export NORMAL_SAMPLE="sample_2"

export TARGET_SAMPLE="sample_3"
export NORMAL_SAMPLE="sample_2"

export TARGET_SAMPLE="sample_4"
export NORMAL_SAMPLE="sample_5"

export TARGET_SAMPLE="sample_6"
export NORMAL_SAMPLE="sample_5"

So, for this list of combinations, the first run would submit the following command in the shell:

myscript.pl -output_directory /path/output_sample_1_vs_sample_2 -target_sample /path/sample_1.bam -normal_sample /path/sample_2.bam

And the second:

myscript.pl -output_directory /path/output_sample_3_vs_sample_2 -target_sample /path/sample_3.bam -normal_sample /path/sample_2.bam

Thanks for your help.

1 Answer:

Answer 0 (score: 3)

Method 1 uses a while loop to read multiple values from a here-document:

export TARGET_SAMPLE NORMAL_SAMPLE

# values containing whitespace (e.g. spaces) will still be split incorrectly
while read -r TARGET_SAMPLE NORMAL_SAMPLE ANYTHING_ELSE; do
    # insert sanity checks here
    myscript.pl -output_directory "/path/output_${TARGET_SAMPLE}_vs_${NORMAL_SAMPLE}" -target_sample "/path/${TARGET_SAMPLE}.bam" -normal_sample "/path/${NORMAL_SAMPLE}.bam"
done <<'EOD'
sample_1 sample_2
sample_3 sample_2
sample_4 sample_5
sample_6 sample_5
EOD
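
The "insert sanity checks here" placeholder could be filled in along these lines. This is only a sketch: `check_pair` is a hypothetical helper name, and the optional fourth argument (the data directory, defaulting to the `/path` placeholder from the question) is an assumption added so the check can be tried against any directory.

```shell
# Hypothetical helper: returns 0 when a pair looks usable, non-zero otherwise.
# $1 = target sample, $2 = normal sample, $3 = leftover text from the line,
# $4 = data directory (assumed; defaults to the /path placeholder).
check_pair() {
    cp_target=$1
    cp_normal=$2
    cp_extra=$3
    cp_dir=${4:-/path}

    # Both names must be present.
    [ -n "$cp_target" ] && [ -n "$cp_normal" ] || return 1
    # Nothing may follow the two names on the line.
    [ -z "$cp_extra" ] || return 1
    # Both input files must exist and be readable.
    [ -r "$cp_dir/$cp_target.bam" ] && [ -r "$cp_dir/$cp_normal.bam" ]
}
```

Inside the loop it might be called as `check_pair "$TARGET_SAMPLE" "$NORMAL_SAMPLE" "$ANYTHING_ELSE" || continue`, so that a malformed line or a missing .bam file skips that pair instead of launching the script.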

Method 1b is the same as Method 1, but reads the data from an external file:

# special characters in the values (e.g. spaces) will cause problems
cat >mydata <<'EOD'
sample_1 sample_2
sample_3 sample_2
sample_4 sample_5
sample_6 sample_5
EOD

export TARGET_SAMPLE NORMAL_SAMPLE

# normally $ANYTHING_ELSE is empty; a value with embedded spaces spills over into it
while read -r TARGET_SAMPLE NORMAL_SAMPLE ANYTHING_ELSE; do
    # insert sanity checks here
    myscript.pl -output_directory "/path/output_${TARGET_SAMPLE}_vs_${NORMAL_SAMPLE}" -target_sample "/path/${TARGET_SAMPLE}.bam" -normal_sample "/path/${NORMAL_SAMPLE}.bam"
done <mydata

Method 2 wraps the command in a shell function:

export TARGET_SAMPLE NORMAL_SAMPLE

wrapper(){
    TARGET_SAMPLE=$1
    NORMAL_SAMPLE=$2
    # insert sanity checks here
    myscript.pl -output_directory "/path/output_${TARGET_SAMPLE}_vs_${NORMAL_SAMPLE}" -target_sample "/path/${TARGET_SAMPLE}.bam" -normal_sample "/path/${NORMAL_SAMPLE}.bam"
}

wrapper "sample_1" "sample_2"
wrapper "sample_3" "sample_2"
wrapper "sample_4" "sample_5"
wrapper "sample_6" "sample_5"

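Method 3 uses a for loop over multiple arrays:

Bash has indexed array variables, so a for loop over them is possible, but keeping the arrays in sync is error-prone, so I don't recommend it.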
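
For completeness, the parallel-array approach might look roughly like the sketch below. It assumes bash (plain sh has no indexed arrays). The names `targets`, `normals`, and `run_pairs` are illustrative only, and `echo` is used as a dry run so the commands are printed rather than executed. The length check is a minimal guard against the arrays drifting out of sync; it cannot detect entries that are merely in the wrong order.

```shell
# Two parallel arrays: index i of one pairs with index i of the other.
targets=(sample_1 sample_3 sample_4 sample_6)
normals=(sample_2 sample_2 sample_5 sample_5)

run_pairs() {
    # Refuse to run if the arrays have different lengths.
    if [ "${#targets[@]}" -ne "${#normals[@]}" ]; then
        echo "targets and normals differ in length" >&2
        return 1
    fi

    local i
    for i in "${!targets[@]}"; do
        TARGET_SAMPLE=${targets[$i]}
        NORMAL_SAMPLE=${normals[$i]}
        # Dry run: prints the command. Remove "echo" to actually run it.
        echo myscript.pl \
            -output_directory "/path/output_${TARGET_SAMPLE}_vs_${NORMAL_SAMPLE}" \
            -target_sample "/path/${TARGET_SAMPLE}.bam" \
            -normal_sample "/path/${NORMAL_SAMPLE}.bam"
    done
}

run_pairs
```

Because each pair is split across two separate lists here, Methods 1 and 2 remain easier to read and maintain: they keep both halves of a pair on the same line.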