Python subprocess Popen: piping stdout garbles the output lines

Time: 2016-02-08 22:32:12

Tags: python subprocess python-3.4

I am trying to concatenate several files and add a header.

import subprocess
outpath = "output.tab"
with open( outpath, "w" ) as outf :
        "write a header"
        if header is True:
            p1 = subprocess.Popen(["head", "-n1", files[-1] ], stdout= outf, )
        if type(header) is str:
            p1 = subprocess.Popen(["head", "-n1", header ], stdout= outf,)
        for fl in files:
            print(  fl )
            p1 = subprocess.Popen(["tail", "-n+2", fl], stdout= outf, )

For some reason, some of the files (fl) are only partially written, and the next file starts in the middle of a line from the previous one:

 awk '{print NF}' output.tab | uniq -c
    108 11
      1 14
     69 11
      1 10
     35 11
      1 16
    250 11
      1 16

Is there a way to fix this in Python?

An example of a garbled line:

$tail -n+108 output.tab | head -n1

CENPA   chr2    27008881.0  2701ABCD3   chr1    94883932.0  94944260.0  0.0316227766017 0.260698861451  0.277741584016  0.302602378581  0.4352790705329718  56  16


$grep -n -A1 'CENPA' file1.tab

109:CENPA   chr2    27008881.0  27017455.0  1.0 0.417081004817  0.0829327365256 0.545205239241  0.7196619496326693  95  3
110-CENPO   chr2    25016174.0  25045245.0  1000.0  0.151090930896  -0.0083671250883    0.50882773122   0.0876177652747541  82  0


$grep -n 'ABCD3' file2.tab
2:ABCD3 chr1    94883932.0  94944260.0  0.0316227766017 0.260698861451  0.277741584016  0.302602378581  0.4352790705329718  56  16

1 answer:

Answer 0: (score: 1)

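I think the problem here is that subprocess.Popen() runs asynchronously by default, whereas you seem to want it to run synchronously. In effect, all of your head and tail commands are running at the same time, and all of them are writing to the same output file.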
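A minimal sketch of that asynchronous behavior: Popen() returns as soon as the child has been launched, not when it has finished. The child command here is a hypothetical stand-in (a Python one-liner that sleeps for half a second), not one of the head or tail calls from the question.

```python
import subprocess
import sys
import time

# Hypothetical child command: a Python one-liner that sleeps briefly.
cmd = [sys.executable, "-c", "import time; time.sleep(0.5)"]

start = time.monotonic()
proc = subprocess.Popen(cmd)             # returns right after launching the child
still_running = proc.poll() is None      # poll() gives None while the child is alive

proc.wait()                              # block until the child has exited
total_time = time.monotonic() - start
exit_code = proc.returncode

print("still running right after Popen():", still_running)
print("seconds until wait() returned:", round(total_time, 2))
```

If several such children are started in a loop with the same stdout file and nothing waits for them, their writes can interleave, which matches the mixed lines shown in the question.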

To fix this, you probably just want to add .wait():

import subprocess
outpath = "output.tab"
with open( outpath, "w" ) as outf :
    "write a header"
    if header is True:
        p1 = subprocess.Popen(["head", "-n1", files[-1] ], stdout= outf, )
        p1.wait()  # Pauses the script until the command finishes
    if type(header) is str:
        p1 = subprocess.Popen(["head", "-n1", header ], stdout= outf,)
        p1.wait()
    for fl in files:
        print(  fl )
        p1 = subprocess.Popen(["tail", "-n+2", fl], stdout= outf, )
        p1.wait()
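As an alternative to waiting on each child process, the same concatenation can be done without subprocess at all. This is only a sketch under assumptions, not the method from the answer above: the function name concat_with_header is hypothetical, and it assumes every input is a text file whose first line is the header.

```python
import shutil


def concat_with_header(files, outpath, header=True):
    """Concatenate files into outpath, keeping a single header line.

    header=True    -> take the header from the last file in `files`
    header="path"  -> take the header from that file
    anything else  -> write no header
    """
    if header is True:
        header_source = files[-1]
    elif isinstance(header, str):
        header_source = header
    else:
        header_source = None

    with open(outpath, "w") as outf:
        if header_source is not None:
            with open(header_source) as src:
                outf.write(src.readline())      # like `head -n1`

        for path in files:
            with open(path) as src:
                src.readline()                  # skip this file's own header
                shutil.copyfileobj(src, outf)   # like `tail -n+2`
```

Because everything is written sequentially through one file object, there is no ordering problem to begin with. Called as concat_with_header(files, "output.tab"), it should produce the same layout the question was aiming for.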