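How to pass multiple input files through a 'for' command

Date: 2015-03-13 20:26:39

Tags: bash

I have multiple samples, each with R1 and R2 reads in fastq.gz format (the two files are mates of each other). I want to run BWA mem in paired-end mode on all of the files, in parallel, so that each R1/R2 pair produces one SAM file. Right now I am creating two SAM files from the two reads.

This is what I came up with, but it does not do what I need: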

for i in `find -maxdepth 2 -iname *fastq.gz -type f`; do
   echo "bwa mem -t 12 /H.Sapiens/ucsc.hg19.fasta  ${i}_R1_001.fastq.gz  ${i}_R2_001.fastq.gz > ${i}_R1_R2.sam"
done

When run, the output looks like this:

bwa mem -t 12 /H.Sapiens/ucsc.hg19.fasta  ./Sample_0747/0747_CGG_L001_R2_001.fastq.gz_R1_001.fastq.gz ./Sample_0747/0747_CGG_L001_R2_001.fastq.gz_R2_001.fastq.gz > ./Sample_0747/0747_CGG_L001_R2_001.fastq.gz_R1_R2.sam

bwa mem -t 12 H.Sapiens/ucsc.hg19.fasta  ./Sample_0748/0748_CCA_L001_R1_001.fastq.gz_R1_001.fastq.gz ./Sample_0748/0748_CCA_L001_R1_001.fastq.gz_R2_001.fastq.gz > ./Sample_0748/0748_CCA_L001_R1_001.fastq.gz_R1_R2.sam
-bash-4.1$

I understand that the problem is with iname, but how do I fix it? Thank you very much.

2 answers:

Answer 0 (score: 1)

Try

find -maxdepth 2 -iname \*fastq.gz -type f |
sed 's/_R[12]_001\.fastq\.gz$//' |
sort -u | 
while IFS= read -r f; do
   echo "bwa mem -t 12 /H.Sapiens/ucsc.hg19.fasta \"${f}_R1_001.fastq.gz\"  \"${f}_R2_001.fastq.gz\" > \"${f}_R1_R2.sam\""
done
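
To see why the pipeline above yields one line per sample rather than one per file, here is a small sketch that runs the same sed and sort -u steps on a hard-coded list instead of find output. The four paths are assumptions modelled on the names shown in the question; nothing here touches real files or runs bwa.

```shell
# Hypothetical file names modelled on the paths shown in the question.
files='./Sample_0747/0747_CGG_L001_R1_001.fastq.gz
./Sample_0747/0747_CGG_L001_R2_001.fastq.gz
./Sample_0748/0748_CCA_L001_R1_001.fastq.gz
./Sample_0748/0748_CCA_L001_R2_001.fastq.gz'

# Strip the read-specific suffix, then collapse duplicates so that each
# sample prefix appears exactly once.
prefixes=$(printf '%s\n' "$files" |
  sed 's/_R[12]_001\.fastq\.gz$//' |
  sort -u)

printf '%s\n' "$prefixes"
```

Each remaining line is a prefix to which the loop can append _R1_001.fastq.gz and _R2_001.fastq.gz.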

Answer 1 (score: 1)

Don't loop over a value parsed like that.* First, for sanity's sake, put the code in a script, e.g.

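cat > script <<'SCRIPT'
#!/bin/bash
# Arguments are fastq.gz paths; each pair is handled once, via its R1 file.
for i; do
  [[ $i == *_R1_001.fastq.gz ]] || continue   # skip the R2 mate
  base=${i%_R1_001.fastq.gz}                  # sample prefix shared by both reads
  bwa mem -t 12 /H.Sapiens/ucsc.hg19.fasta "${base}_R"{1,2}_001.fastq.gz > "${base}_R1_R2.sam"
done
SCRIPT
chmod +x script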

Then use the -exec predicate or xargs, e.g.

find -maxdepth 2 -iname '*fastq.gz' -type f -exec ./script {} +

find -maxdepth 2 -iname '*fastq.gz' -type f -print0 | xargs -0 ./script

* It says "parsing ls", but it applies to parsing the output of any command meant for human consumption, and it explicitly calls out find.
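
As an illustration of why parsed output is fragile, the following sketch compares an unquoted command substitution with a NUL-delimited hand-off to xargs. The scratch directory and the file name containing a space are made up for the demonstration, and count_words is a hypothetical helper that only reports how many arguments it received.

```shell
# Scratch directory holding one hypothetical file whose name contains a space.
tmp=$(mktemp -d)
touch "$tmp/sample A_R1_001.fastq.gz"

# Helper (hypothetical): print the number of arguments received.
count_words() { echo "$#"; }

# Unquoted command substitution: the shell splits the path at the space.
split_count=$(count_words $(find "$tmp" -type f -iname '*fastq.gz'))

# NUL-delimited hand-off: the path arrives as a single argument.
safe_count=$(find "$tmp" -type f -iname '*fastq.gz' -print0 |
  xargs -0 sh -c 'echo "$#"' sh)

echo "word-split: $split_count, xargs -0: $safe_count"
rm -rf "$tmp"
```

The single file is seen as two words in the first case and as one argument in the second, which is the reason to prefer -exec or -print0 with xargs -0.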


On another note, if you don't quote the arguments to find, the shell may interpret them as globs.

find -iname *fastq.gz

can expand to

find -iname foofastq.gz barfastq.gz bazfastq.gz

You want

find -iname '*fastq.gz'
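
A quick way to see the difference, assuming a scratch directory that holds the three hypothetical files named above, is to let echo show what the shell actually hands to a command. This is only a sketch; the directory is created with mktemp and removed at the end.

```shell
# Scratch directory with the three hypothetical file names used above.
tmp=$(mktemp -d)
touch "$tmp/foofastq.gz" "$tmp/barfastq.gz" "$tmp/bazfastq.gz"

# Unquoted: the shell replaces the pattern with the matching names
# before the command ever runs.
unquoted=$(cd "$tmp" && echo *fastq.gz)

# Quoted: the pattern is passed through literally, for find to interpret.
quoted=$(cd "$tmp" && echo '*fastq.gz')

echo "unquoted: $unquoted"
echo "quoted:   $quoted"
rm -rf "$tmp"
```

With the unquoted form, find would receive several file names after -iname instead of one pattern; with the quoted form it receives the pattern itself.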