Combining 4 lines into one list element

Time: 2017-03-15 03:11:49

Tags: python-3.x loops

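I have a FASTQ file in which each entry is 4 lines (two lines come before the '+'). How can I read each group of 4 lines into a single list element?

The file looks like this: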

@DQNZZQ1:756:C3K7PACXX:6:1101:2383:2061 1:N:0:CCGTCC
GAACCCCACTGTGCACCACCTGTCTCTTATACACATCTAGATGTGTATAAGAGACAGAGATGGGGGCGACGACATTTTTGCAGCTGATGCTAAACGCGGA
+
@@CFFFFFHHHDHJJJIIJJJIHHGGGG<E@C9CDFHG>ABFGGADFHGIGEHCHHGEEC:GHGEH/8=?@99554>CC5CDCCDD=CD44>C@>@@DD@
@DQNZZQ1:756:C3K7PACXX:6:1101:2486:2062 1:N:0:CCGTCC
GCCCAAGACGGCCCCCGCTCCGCGTCGGTTCATCGGTTCCTCGGGGCAAGGATGTTCCCAGGTTGTTTGTGAGGAGAGTGTCTCTTTTTCACATCTTGTG
+
@@@DDDDDFFFFFIIIE8?FG)6@############################################################################
@DQNZZQ1:756:C3K7PACXX:6:1101:2359:2093 1:N:0:CCGTCC
TAAGATATTGGCAAGCAATATAGCTTTCTTCACGCGCCACACAGTTTCCCGGCTGTAGCGGTGACGACGGGGCAGACGGTGGAGGTGTTTCCTGCAGACT
+
@@@?DDFBFHGFD<@GGHCEHFCDHIHGHIIIIIFGIIGEFHGFD@DHFHBEBHGAC3)-99>?ABBB=@&5>;5889B0<<???8848<@@########
@DQNZZQ1:756:C3K7PACXX:6:1101:2319:2168 1:N:0:CCGTCC
AAGTTTAATAAGCAAACCCTGGGAACTGCGACGGTCTTCGGCACTGTCTACAAATGACGCGTCACAGAAGACCTCTAAACCTCGATCCAGTTATCGCTGT
+
==@4:BDBDBB?8AFGHIEHHIII;F3?1?FF?F0????C@FA;DEEGHEC;?=CADCB=A/3'5:@A>?CCC:>@A:49?A<B5>??CCA>>+>18?##
@DQNZZQ1:756:C3K7PACXX:6:1101:2337:2170 1:N:0:CCGTCC
GGCGACTGTGTTTGCCAAGATGGAGCGCGACCTGCGGCGGCCGGGTGCCGTGTTTGCCGAGGCGGGCGCACCCGCCCGCTGGGAGACGGGCCCCAACTAG
+
;=?DD:::DFCCCFGIGIIGGIBCHIIIID@GHIIIBEB>B@B@-)5??B05?AC9>AB5<77@####################################

This is what I have so far:

forward = open(sys.argv[1],'r')
reverse = open(sys.argv[2],'r')
output = open(sys.argv[3],'r')


for reads in forward:
    freads_full = islice(forward, 4)
    for line in freads_full:
        flist = line

Thanks for your help!

1 answer:

Answer 0 (score: 0):

The itertools docs include a recipe called grouper that does exactly what you want:

from itertools import zip_longest

def grouper(iterable, n, fillvalue=None):
    "Collect data into fixed-length chunks or blocks"
    # grouper('ABCDEFG', 3, 'x') --> ABC DEF Gxx
    args = [iter(iterable)] * n
    return zip_longest(*args, fillvalue=fillvalue)
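
The recipe works because the list holds n references to the same iterator, so each output tuple consumes n consecutive items. Here is a minimal sketch of that behaviour; the six sample lines are made up for illustration and are not taken from the file in the question:

```python
from itertools import zip_longest

def grouper(iterable, n, fillvalue=None):
    "Collect data into fixed-length chunks or blocks"
    args = [iter(iterable)] * n  # n references to ONE shared iterator
    return zip_longest(*args, fillvalue=fillvalue)

# Hypothetical stand-in for six lines read from a file
lines = ["@id1\n", "ACGT\n", "+\n", "IIII\n", "@id2\n", "TTGA\n"]
chunks = list(grouper(lines, 4))
print(chunks)
```

When the input length is not a multiple of n, the last tuple is padded with fillvalue (None by default), so a truncated file would yield a final batch containing None.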

You can then do something like:

import sys

with open(sys.argv[1]) as forward:
    for batch in grouper(forward, 4):
        # batch is a tuple of 4 lines (padded with None if the file ends early)
        pass  # do stuff with the batch here

You can then do whatever you like with the lines. For example, if you want to concatenate them, you can do

''.join(batch) 

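Note that sum(batch) does not work as an alternative: sum() starts from the integer 0 and refuses to concatenate strings, so it raises a TypeError. Stick with ''.join(batch).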
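
To get what the question asks for, one list element per 4-line entry, the batches can be collected into a list. The following is only a sketch: the function name read_fastq_records is hypothetical, it assumes every record is exactly 4 lines (no wrapped sequences), and it uses an in-memory file in place of open(sys.argv[1]):

```python
import io
from itertools import zip_longest

def read_fastq_records(handle):
    """Return a list with one 4-element list per FASTQ record (hypothetical helper)."""
    args = [iter(handle)] * 4  # same shared-iterator idea as the grouper recipe
    records = []
    for chunk in zip_longest(*args):
        if None in chunk:  # padding means the last record is incomplete
            raise ValueError("number of lines is not a multiple of 4")
        # Strip the trailing newline from each of the 4 lines
        records.append([line.rstrip("\n") for line in chunk])
    return records

# Made-up two-record sample standing in for a real file handle
sample = io.StringIO("@r1\nACGT\n+\nIIII\n@r2\nTTGA\n+\nHHHH\n")
records = read_fastq_records(sample)
print(records[0])
```

Each element of records is a list of the header, sequence, separator, and quality lines. Raising on an incomplete final record is one possible design choice; silently dropping it would be another.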