How do I merge multiple sorted files alphabetically in Python?

Asked: 2014-10-07 15:56:26

Tags: python python-2.7 csv

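How can I read multiple CSV input files line by line, compare the current line of each file, write the alphabetically first line to an output file, and then advance the pointer of the file that held the minimum value, continuing the comparison across all files until the end of every input file is reached? Here is a rough start at a solution.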

buffer = []

for inFile in inFiles:
    f = open(inFile, "r")
    line = f.next()
    buffer.append([line, inFile])

#find minimum value in buffer alphabetically...
#write it to an output file...

#how do I advance one line in the file with the min value?
#and then continue the line-by-line comparisons in input files?

1 Answer:

Answer 0 (score: 5)

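You can use heapq.merge, which lazily merges already-sorted iterables into a single sorted stream, so the files are never loaded into memory in full. In Python 2.7: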

import heapq
import contextlib

files = [open(fn) for fn in inFiles]
with contextlib.nested(*files):
    with open('output', 'w') as f:
        f.writelines(heapq.merge(*files))

In Python 3.x (3.3+):

import heapq
import contextlib

with contextlib.ExitStack() as stack:
    files = [stack.enter_context(open(fn)) for fn in inFiles]
    with open('output', 'w') as f:
        f.writelines(heapq.merge(*files))
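
For readers who want to see the "advance the file that held the minimum" step from the question spelled out, here is a minimal sketch of a manual k-way merge built on a heap. It is only an illustration of the idea, not how heapq.merge is implemented internally. The function name merge_sorted_files and its parameters are hypothetical, and the sketch assumes each input file is already sorted and that every line, including the last, ends with a newline.

```python
import heapq


def merge_sorted_files(in_paths, out_path):
    """Merge already-sorted text files line by line into out_path.

    Hypothetical helper for illustration; assumes every line ends with
    a newline so that written lines do not run together.
    """
    files = [open(p) for p in in_paths]
    try:
        # Seed the heap with the first line of each non-empty file.
        # Each entry is (line, file index); the index identifies which
        # file to advance and also breaks ties between equal lines.
        heap = []
        for idx, f in enumerate(files):
            line = next(f, None)
            if line is not None:
                heap.append((line, idx))
        heapq.heapify(heap)

        with open(out_path, "w") as out:
            while heap:
                # The smallest pending line sits at the top of the heap.
                line, idx = heap[0]
                out.write(line)
                # Advance only the file that supplied that line.
                nxt = next(files[idx], None)
                if nxt is None:
                    heapq.heappop(heap)  # this file is exhausted
                else:
                    heapq.heapreplace(heap, (nxt, idx))
    finally:
        for f in files:
            f.close()
```

Calling merge_sorted_files(inFiles, 'output') should produce the same file as the heapq.merge versions above under those assumptions. Keeping the file index next to each line is what the question's buffer was missing: it stored the file name rather than something that could be advanced. Lines are compared as plain strings, so the ordering is case-sensitive, just as with heapq.merge.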