Python and shell for-loop combination: unexpected behavior

Time: 2016-07-15 10:57:00

Tags: python bash for-loop

I have a simple Python script to rename my data, which already works for a single file:

for line in Coord:
    coord = line.split()[0]
    miRname = line.split()[1]
    print miRname
    os.system('samtools view -h ' + BAMfile + ' '+ coord + ' >' + miRname) #extract the reads from a coordinate
    os.system("sed -i 's/HISEQ2500/" + miRname + "/g' " + miRname)  #'samtools view -H ' + miRname + ' >' + 'header.sam') #extract headers frm the bam
    os.system('samtools fastq ' + miRname + ' >>' + Treatment)
    os.system('rm ' + miRname )

Now I want to use the shell to walk through a folder (because that is the approach I know how to use easily in this case):

MAPPINGS_DIR=/abc/defg/hij
for bam in $MAPPPINGS_DIR/mappings/tophat2.1.1/*; do
    name=$(basename $bam);
    echo $name
    python ~/Documents/scripts/miRNA_reads_prep.py MAPPINGS_DIR$bam/$name.bam name.fastq
done    

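The problem I ran into is that this started overwriting my original files (the BAM files; luckily they are copies). When I checked what this for loop is actually doing, I found that it passes the whole list of all BAM files as arguments, so my sys.argv[2] is my second BAM file (one of the copies):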

ENSG00000199075 ENSG00000273874 homo_sapiens_GRCh84_BC.sorted.miRNA.GTF Mock_24_1 Mock_24_1.fastq mature.fa mature.fa.gz miRBase.gff miRcoord miRNAseq.fa sedBtFKZ7 sedFFMTf1 sedHt4fBN sedIZagSr sedM9Wbvy sedout7Bz try
ENSG00000278267
samtools view -h /mappings/tophat2.1.1/Mock_24h_1/Mock_24h_1.bam 1:17369-17436 >ENSG00000278267
sed -i 's/HISEQ2500/ENSG00000278267/g' ENSG00000278267
samtools fastq ENSG00000278267 >>/mappings/tophat2.1.1/Mock_24h_1/Mock_24h_1.sorted.bam
rm ENSG00000278267
['/print/miRNA_reads_prep.py', '/mappings/tophat2.1.1/Mock_24h_1/Mock_24h_1.bam', '/mappings/tophat2.1.1/Mock_24h_1/Mock_24h_1.sorted.bam', '/mappings/tophat2.1.1/Mock_24h_1/unmapped.bam', '/mappings/tophat2.1.1/Mock_24h_2/Mock_24h_2.bam', '/mappings/tophat2.1.1/Mock_24h_2/Mock_24h_2.sorted.bam', '/mappings/tophat2.1.1/Mock_24h_2/unmapped.bam', '/mappings/tophat2.1.1/Mock_24h_3/Mock_24h_3.bam', '/mappings/tophat2.1.1/Mock_24h_3/Mock_24h_3.sorted.bam', '/mappings/tophat2.1.1/Mock_24h_3/unmapped.bam', '/mappings/tophat2.1.1/Mock_3h_1/Mock_3h_1.bam', '/mappings/tophat2.1.1/Mock_3h_1/Mock_3h_1.sorted.bam', '/mappings/tophat2.1.1/Mock_3h_1/unmapped.bam', '/mappings/tophat2.1.1/Mock_3h_2/Mock_3h_2.bam', '/mappings/tophat2.1.1/Mock_3h_2/Mock_3h_2.sorted.bam', '/mappings/tophat2.1.1/Mock_3h_2/unmapped.bam',  'Mock_24_1.fastq']

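How can I correct this in my for loop? Should I run a second for loop over each inner folder? I assumed that each argument would be evaluated on every iteration of the for loop.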
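
The argument list printed above holds many more paths than the script expects, and the FASTQ records ended up appended to a .sorted.bam file. One way to catch this early is a guard at the top of the Python script. The following is only a minimal sketch: the helper name parse_args is hypothetical, and it assumes the script takes exactly two arguments, with sys.argv[1] used as BAMfile and sys.argv[2] used as Treatment, as the printed commands suggest.

```python
import sys


def parse_args(argv):
    """Return (bam_path, fastq_path), or raise ValueError if argv looks wrong.

    Hypothetical helper: assumes argv[1] is the input BAM and argv[2] is the
    FASTQ file that reads are appended to.
    """
    if len(argv) != 3:
        raise ValueError(
            "expected 2 arguments (input BAM, output FASTQ), got %d: %r"
            % (len(argv) - 1, argv[1:])
        )
    bam_path, fastq_path = argv[1], argv[2]
    # Appending FASTQ text to a BAM file would corrupt it, so refuse outright.
    if fastq_path.endswith(".bam"):
        raise ValueError(
            "refusing to append FASTQ records to a BAM file: " + fastq_path
        )
    return bam_path, fastq_path
```

In the script this could be used as `BAMfile, Treatment = parse_args(sys.argv)` before any samtools call, wrapped in a `try`/`except ValueError` that exits with the error message. It does not fix the shell loop, but it stops the run before any file is touched.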

1 answer:

Answer 0: (score: 0)

There is a typo: "bam in $MAPPPINGS_DIR" has three P's instead of two... - sp asic
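
The typo matters because bash expands an unset variable to an empty string by default, so the loop globs /mappings/tophat2.1.1/* instead of the intended directory. The python line in the loop also writes MAPPINGS_DIR and name.fastq without a leading $, so they are passed as literal text. If doing the traversal in Python is an option, a misspelled variable name raises a NameError instead of silently expanding to nothing. The sketch below is an illustration only: the function names plan_jobs and run_jobs are hypothetical, and it assumes the layout seen in the printed paths, where each sample directory contains a BAM file named after the directory.

```python
import os
import subprocess


def plan_jobs(mappings_dir):
    """Return one (bam_path, fastq_name) pair per sample directory.

    Assumed layout: <mappings_dir>/mappings/tophat2.1.1/<sample>/<sample>.bam
    """
    tophat_dir = os.path.join(mappings_dir, "mappings", "tophat2.1.1")
    jobs = []
    for sample in sorted(os.listdir(tophat_dir)):
        bam_path = os.path.join(tophat_dir, sample, sample + ".bam")
        # Skip entries that do not follow the assumed layout.
        if os.path.isfile(bam_path):
            jobs.append((bam_path, sample + ".fastq"))
    return jobs


def run_jobs(jobs, script_path):
    """Call the prep script once per sample with exactly two arguments."""
    for bam_path, fastq_name in jobs:
        # A list of arguments avoids shell word splitting and glob expansion.
        subprocess.check_call(["python", script_path, bam_path, fastq_name])
```

Calling `run_jobs(plan_jobs("/abc/defg/hij"), script_path)` with the path to miRNA_reads_prep.py would then invoke the script once per sample. Because only `<sample>.bam` is selected, the `.sorted.bam` and `unmapped.bam` files in the same directory are never passed along.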