How do I merge two sorted files into a single sorted file?

Time: 2016-03-10 02:26:05

Tags: python file

Suppose I have two large files, each already sorted:

File A:

1
1
2
3
5
...

File B:

2
2
2
4
8
...

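When I load either file into memory, the program crashes, probably because it runs out of memory. I am looking for a way to read both files and merge them into one sorted file. How should I do this?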

Hi everyone!

Here is my initial idea:

def read_lines(filepath):
    with open(filepath, 'r') as f:
        cxt = f.read()
        lines = cxt.split('\n')
        return lines

a = read_lines('a.txt')
b = read_lines('b.txt')
c = a + b
c.sort()

with open('c.txt', 'w') as f:
    lines = '\n'.join(c)
    f.write(lines)

2 answers:

Answer 0 (score: 5)

Since both files are already sorted (and all values are greater than 0), you only need to merge them. Off the top of my head and untested:

with open('a.txt') as fa, open('b.txt') as fb, open('new.txt', 'w') as fn:
    # 0 marks an exhausted file, which is why every value must be greater than 0.
    line_a, line_b = int(next(fa, 0)), int(next(fb, 0))
    while line_a or line_b:
        # Take from A when B is exhausted or A holds the smaller value.
        if not line_b or (line_a and line_a < line_b):
            fn.write("{}\n".format(line_a))
            line_a = int(next(fa, 0))
        else:
            fn.write("{}\n".format(line_b))
            line_b = int(next(fb, 0))
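
As an aside, the same streaming merge can probably be written with the standard library's heapq.merge, which reads its already-sorted inputs lazily, so neither file is loaded whole. This is only a sketch: the function name merge_sorted_files and its parameters are made up for illustration, and it assumes each non-blank line holds exactly one integer.

```python
import heapq

def merge_sorted_files(path_a, path_b, path_out):
    """Merge two sorted one-integer-per-line files into path_out (hypothetical helper)."""
    with open(path_a) as fa, open(path_b) as fb, open(path_out, 'w') as fn:
        # Generators convert lines one at a time and skip blank lines.
        nums_a = (int(line) for line in fa if line.strip())
        nums_b = (int(line) for line in fb if line.strip())
        # heapq.merge yields values from the sorted inputs in sorted order.
        for value in heapq.merge(nums_a, nums_b):
            fn.write("{}\n".format(value))
```

Because no sentinel value is involved, this version should also cope with zeros and negative numbers.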

Answer 1 (score: 0)

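If either of my files contains negative numbers, @AChampion's answer does not work, because it uses 0 as the end-of-file marker.

So here is another answer. It solves my original problem and also handles negative numbers.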

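def nexline(f):
    # Next number from f, or +inf once the file is exhausted.
    return float(next(f, 'inf'))

def is_end(line):
    return line == float('inf')

def write_new_line(new_f, source_f, line):
    # Write the current value, then advance the file it came from.
    new_f.write("{}\n".format(line))
    source_line = nexline(source_f)
    return source_line

with open('a.txt') as fa, open('b.txt') as fb, open('new.txt', 'w') as fn:
    line_a, line_b = nexline(fa), nexline(fb)
    while not is_end(line_a) or not is_end(line_b):
        # An exhausted file reads as +inf, so it never wins the comparison.
        # (With '-inf' as the marker, this loops forever once B ends before A.)
        if not is_end(line_a) and line_a < line_b:
            line_a = write_new_line(fn, fa, line_a)
        else:
            line_b = write_new_line(fn, fb, line_b)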
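
A variant of the same idea avoids a numeric end marker altogether by using None, which also keeps the values as integers so they are written back as "1" rather than "1.0". Treat it as a sketch under assumptions: the names read_int and merge_files are invented here, and every line is assumed to hold one integer with no blank lines.

```python
def read_int(f):
    """Return the next integer from f, or None when f is exhausted."""
    line = next(f, None)
    return None if line is None else int(line)

def merge_files(path_a, path_b, path_out):
    """Two-pointer merge of two sorted integer files (hypothetical helper)."""
    with open(path_a) as fa, open(path_b) as fb, open(path_out, 'w') as fn:
        a, b = read_int(fa), read_int(fb)
        while a is not None or b is not None:
            # Take from A when B is exhausted, or when A's value is not larger.
            if b is None or (a is not None and a <= b):
                fn.write("{}\n".format(a))
                a = read_int(fa)
            else:
                fn.write("{}\n".format(b))
                b = read_int(fb)
```

Only one value from each file is held in memory at a time, so memory use stays constant regardless of file size.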