Question

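I have two text files in different languages, and they are aligned line by line. That is, the first line in textfile1 corresponds to the first line in textfile2, and so on.

Is there a way to read both files line by line simultaneously?

Below is a sample of what the files look like; assume each file has around 1,000,000 lines.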

textfile1:

This is a the first line in English
This is a the 2nd line in English
This is a the third line in English

textfile2:

C'est la première ligne en Français
C'est la deuxième ligne en Français
C'est la troisième ligne en Français

Desired output

This is a the first line in English\tC'est la première ligne en Français
This is a the 2nd line in English\tC'est la deuxième ligne en Français
This is a the third line in English\tC'est la troisième ligne en Français

There is a Java version of this in "Read two textfile line by line simultaneously -java", but Python does not use a bufferedreader that reads line by line. So how can it be done?

Answer 1

from itertools import izip

with open("textfile1") as textfile1, open("textfile2") as textfile2: 
    for x, y in izip(textfile1, textfile2):
        x = x.strip()
        y = y.strip()
        print("{0}\t{1}".format(x, y))

In Python 3, replace itertools.izip with the built-in zip.
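For reference, here is a minimal Python 3 sketch of the same idea. The helper names merge_lines and print_merged are hypothetical, and UTF-8 encoding is an assumption about the input files. It strips only the line terminator, so leading or trailing spaces inside a sentence are preserved.

```python
def merge_lines(lines_a, lines_b, sep="\t"):
    """Yield one joined string per pair of aligned lines (hypothetical helper)."""
    # zip is lazy in Python 3, so only one pair of lines is held at a time.
    for a, b in zip(lines_a, lines_b):
        # Remove only the trailing newline / carriage return characters.
        yield a.rstrip("\r\n") + sep + b.rstrip("\r\n")


def print_merged(path_a, path_b, encoding="utf-8"):
    """Print the merged lines of two files; the encoding is an assumption."""
    with open(path_a, encoding=encoding) as fa, open(path_b, encoding=encoding) as fb:
        for merged in merge_lines(fa, fb):
            print(merged)
```

With the files from the question, calling print_merged("textfile1", "textfile2") should print one tab-separated pair per line. Because both file objects are iterated lazily, memory use should stay small even at around 1,000,000 lines.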

Answer 2

with open(file1) as f1, open(file2) as f2:
    for x, y in zip(f1, f2):
        print("{0}\t{1}".format(x.strip(), y.strip()))

Output:

This is a the first line in English C'est la première ligne en Français
This is a the 2nd line in English   C'est la deuxième ligne en Français
This is a the third line in English C'est la troisième ligne en Français

Answer 3

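Python does let you read line by line, and it is even the default behaviour: you just iterate over the file as you would iterate over a list.

As for iterating over two iterables at once, itertools.izip is your friend: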

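# Python 2: izip pairs the lines lazily instead of building a full list.
from itertools import izip
# The with statement closes both files once the loop finishes.
with open("/path/to/file1") as fileA, open("/path/to/file2") as fileB:
    for lineA, lineB in izip(fileA, fileB):
        print "%s\t%s" % (lineA.rstrip(), lineB.rstrip())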

Answer 4

We can use a generator to make opening the files more convenient, and it easily extends to iterating over more files simultaneously.

filenames = ['textfile1', 'textfile2']

def gen_line(filename):
    with open(filename) as f:
        for line in f:
            yield line.strip()

gens = [gen_line(n) for n in filenames]

for file1_line, file2_line in zip(*gens):
    # str.join takes a single iterable, so pass the two lines as a tuple.
    print("\t".join((file1_line, file2_line)))

Notes:

This is Python 3 code. For Python 2, use itertools.izip as others have said.
zip stops once the shortest file has been exhausted; if that matters, use itertools.zip_longest.
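As a sketch of the zip_longest variant (Python 3), the hypothetical helper below pads the shorter input with an empty string so that no line from the longer file is dropped. The helper name and the choice of "" as the fill value are assumptions for illustration only.

```python
from itertools import zip_longest


def merge_lines_padded(lines_a, lines_b, sep="\t", fill=""):
    """Pair lines from both inputs, padding the shorter one (hypothetical helper)."""
    # zip_longest keeps going until the longer input is exhausted and
    # substitutes fillvalue for the missing lines of the shorter one.
    for a, b in zip_longest(lines_a, lines_b, fillvalue=fill):
        yield a.rstrip("\r\n") + sep + b.rstrip("\r\n")
```

For example, merging ["a\n", "b\n"] with ["x\n"] yields "a\tx" followed by "b\t", which makes a missing translation visible as an empty second column rather than silently truncating the output.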

Read two text files line by line simultaneously
