Suppose I have two large files that are already sorted:
File A:
1
1
2
3
5
...
File B:
2
2
2
4
8
...
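Loading either file into memory crashes the program, probably because it runs out of memory. I want to merge the two files into a single sorted file without loading them whole. How should I go about it?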
Hi everyone! Here is my initial approach:
def read_lines(filepath):
    with open(filepath, 'r') as f:
        cxt = f.read()
    lines = cxt.split('\n')
    return lines

a = read_lines('a.txt')
b = read_lines('b.txt')
c = a + b
c.sort()
with open('c.txt', 'w') as f:
    lines = '\n'.join(c)
    f.write(lines)
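Answer 0 (score: 5)
Since both files are already sorted (and all values are greater than 0), you only need to merge them. Off the top of my head, and untested: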
with open('a.txt') as fa, open('b.txt') as fb, open('new.txt', 'w') as fn:
    # 0 marks an exhausted file, which is why every value must be > 0.
    line_a, line_b = int(next(fa, 0)), int(next(fb, 0))
    while line_a or line_b:
        # Take from A when B is exhausted, or when A has the smaller value.
        if not line_b or (line_a and line_a < line_b):
            fn.write("{}\n".format(line_a))
            line_a = int(next(fa, 0))
        else:
            fn.write("{}\n".format(line_b))
            line_b = int(next(fb, 0))
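For comparison, the same streaming merge can be sketched with the standard library's heapq.merge, which lazily merges inputs that are already sorted. This is only a sketch: the function name merge_sorted_files and its parameters are made up for illustration, and it assumes every line holds exactly one integer (a blank line would raise ValueError).

```python
import heapq


def merge_sorted_files(path_a, path_b, path_out):
    """Merge two files of sorted integers (one per line) into path_out.

    Hypothetical helper, not part of the answer above.
    """
    with open(path_a) as fa, open(path_b) as fb, open(path_out, 'w') as fn:
        # File objects iterate line by line, so only one pending line
        # per input is held in memory at any time.
        # key=int compares lines numerically, so "10" sorts after "9".
        for line in heapq.merge(fa, fb, key=int):
            # The last line of a file may lack a trailing newline.
            fn.write(line if line.endswith('\n') else line + '\n')
```

Because the comparison uses int rather than a sentinel value, zero and negative numbers need no special handling here.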
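Answer 1 (score: 0)
If either of my files contains negative numbers, @AChampion's answer does not work.
So I have another answer. It solves my original problem and also handles negative numbers.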
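def nexline(f):
    # Read the next number; -inf marks the end of the file.
    # Note: values are parsed as float, so 1 is written back as 1.0.
    return float(next(f, '-inf'))

def is_end(line):
    return line == float('-inf')

def write_new_line(new_f, source_f, line):
    # Write the current value, then advance the file it came from.
    new_f.write("{}\n".format(line))
    source_line = nexline(source_f)
    return source_line

with open('a.txt') as fa, open('b.txt') as fb, open('new.txt', 'w') as fn:
    line_a, line_b = nexline(fa), nexline(fb)
    while not is_end(line_a) or not is_end(line_b):
        # Take from A when B is exhausted, or when A has the smaller value.
        # The is_end(line_b) check keeps -inf from winning the comparison
        # once B runs out.
        if is_end(line_b) or (not is_end(line_a) and line_a < line_b):
            line_a = write_new_line(fn, fa, line_a)
        else:
            line_b = write_new_line(fn, fb, line_b)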
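A possible variation, sketched under the assumption that each line holds one integer: use None as the end-of-file marker instead of a numeric sentinel. That way no real value can collide with the marker, and the numbers stay integers when written back. The names next_int and merge_files are hypothetical and chosen only for this illustration.

```python
def next_int(f):
    """Return the next line of f as an int, or None once f is exhausted."""
    line = next(f, None)
    return None if line is None else int(line)


def merge_files(path_a, path_b, path_out):
    """Merge two sorted integer files line by line (hypothetical helper)."""
    with open(path_a) as fa, open(path_b) as fb, open(path_out, 'w') as fn:
        a, b = next_int(fa), next_int(fb)
        while a is not None or b is not None:
            # Take from A when B is exhausted, or when A's value is smaller.
            if b is None or (a is not None and a < b):
                fn.write("{}\n".format(a))
                a = next_int(fa)
            else:
                fn.write("{}\n".format(b))
                b = next_int(fb)
```

Calling merge_files('a.txt', 'b.txt', 'new.txt') would then play the role of the with block above, reading only one line from each input at a time.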