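Merging 2 text files into a single text file with 2 columns in Python

Time: 2018-12-18 19:34:47

Tags: python

I have 2 text files, each with the same number of lines, and I want to merge them into a single CSV file with 2 fields plus an additional line-number field. Is this possible in Python?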

File1:
This is a source first line 
This is a source second line
This is a source third line 

File2:
This is a transformed line 1
This is a transformed line 2
This is a transformed line 3 

Outputfile:
1,This is a source first line    ,This is a transformed line 1
2,This is a source second line   ,This is a transformed line 2
3,This is a source third  line   ,This is a transformed line 3

3 answers:

Answer 0: (score: 1)

Given:

$ cat file1
This is a source first line 
This is a source second line
This is a source third line 
$ cat file2
This is a transformed line 1
This is a transformed line 2
This is a transformed line 3 

You can do this:

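from itertools import zip_longest  # named izip_longest in Python 2

fn1, fn2 = 'file1', 'file2'  # paths of the two input files
with open(fn1) as f1, open(fn2) as f2:
    # Pair the lines up and number them starting from 1
    for i, (l1, l2) in enumerate(zip_longest(f1, f2), 1):
        print('{}: {}\t{}'.format(i, l1.strip(), l2.strip()))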

Prints:

1: This is a source first line  This is a transformed line 1
2: This is a source second line This is a transformed line 2
3: This is a source third line  This is a transformed line 3

Now suppose you have:

$ cat file1
This is a source first line 
This is a source second line
This is a source third line 
$ cat file2
This is a transformed line 1
This is a transformed line 2
This is a transformed line 3 
This is line 4

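To get true columns in the output, use {:40} to give each column a width of 40 characters, and pass a fillvalue to izip_longest (zip_longest in Python 3) so that the shorter file is padded with empty strings: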

from itertools import zip_longest  # named izip_longest in Python 2
with open(fn1) as f1, open(fn2) as f2:
    for i, (l1, l2) in enumerate(zip_longest(f1, f2, fillvalue=""), 1):
        print('{}: {:40}{:40}'.format(i, l1.strip(), l2.strip()))

Prints:

1: This is a source first line             This is a transformed line 1            
2: This is a source second line            This is a transformed line 2            
3: This is a source third line             This is a transformed line 3            
4:                                         This is line 4     
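
Since the question asks for a CSV file rather than printed columns, the same pairing could also be handed to the standard csv module, which quotes any line that itself contains a comma. This is only a minimal sketch: merge_to_csv is a hypothetical helper name, not something from the answer above, and the demonstration uses in-memory strings in place of real files.

```python
import csv
import io
from itertools import zip_longest


def merge_to_csv(lines1, lines2, out):
    """Write rows of (line number, line from 1, line from 2) to `out` as CSV.

    `lines1` and `lines2` are any iterables of strings (open files work);
    `out` is a writable text stream. Hypothetical helper for illustration.
    """
    writer = csv.writer(out, lineterminator="\n")
    # fillvalue="" keeps going when one input is shorter than the other
    for i, (l1, l2) in enumerate(zip_longest(lines1, lines2, fillvalue=""), 1):
        writer.writerow([i, l1.strip(), l2.strip()])


# Small in-memory demonstration; with real files you would pass
# open(fn1), open(fn2) and open(out_name, "w", newline="") instead.
buf = io.StringIO()
merge_to_csv(["a, b\n", "c\n"], ["x\n", "y\n", "z\n"], buf)
print(buf.getvalue())
```

The first demonstration row contains a comma inside the source line, so the writer wraps that field in double quotes instead of splitting it into two columns.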

Answer 1: (score: 0)

We can do the following without any imports. Suppose we have two files:

File1:
This is a source first line 
This is a source second line
This is a source third line

File2:
This is a transformed line 1
This is a transformed line 2
This is a transformed line 3

Then...

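with open("file1.txt") as f, open("file2.txt") as f2, open("outFile.txt", "w+") as o:
    lines = len(f.readlines())  # count the lines in the first file
    f.seek(0)                   # move the read cursor back to the top
    for i in range(lines):
        # index, line from file 1, space + two tabs, comma, line from file 2
        o.write("{},{} \t\t,{}\n".format(i+1, f.readline().strip(), f2.readline().strip()))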

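Explanation: We open two files for reading and one for writing. We count how many lines the first file has, then move the read cursor back to the top of that file. Then, for each line, we write to the output file the index, the line from the first file, tabs and a comma, and the line from the second file. Our output: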

1,This is a source first line           ,This is a transformed line 1
2,This is a source second line          ,This is a transformed line 2
3,This is a source third line           ,This is a transformed line 3
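
The line count above is only used to drive the loop, so the first pass over the file can be avoided by iterating over the first file directly. The following is a minimal sketch of that variation, not the answer's own code: merge_single_pass is a hypothetical name, the function itself needs no imports, and io.StringIO is imported only so the demonstration can stand in for files opened with open().

```python
import io  # only needed for the in-memory demonstration below


def merge_single_pass(f, f2, o):
    """Write 'index,line1 <tabs>,line2' rows, reading each input once."""
    # Iterating over `f` yields one line at a time; enumerate supplies
    # the 1-based index, so no line count or seek(0) is required.
    for i, line in enumerate(f, 1):
        o.write("{},{} \t\t,{}\n".format(i, line.strip(), f2.readline().strip()))


# Demonstration with in-memory text in place of file1.txt / file2.txt
src = io.StringIO("first \nsecond\n")
dst = io.StringIO("line 1\nline 2\n")
out = io.StringIO()
merge_single_pass(src, dst, out)
print(out.getvalue())
```

Like the original, this assumes the second file has at least as many lines as the first; if it is shorter, readline() returns an empty string and the last field is left blank.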

Answer 2: (score: 0)

with open(r'C:/file1.txt') as f1, open(r'C:/file2.txt') as f2, open(r'C:/destination.txt', 'w') as o:
    for index, (line1, line2) in enumerate(zip(f1, f2), 1):
        o.write('{}:,{} ,{}\n'.format(index, line1.rstrip(), line2.rstrip()))

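The advantage of this solution is that it does not read the entire files into memory. Instead, it iterates over the input files line by line and writes each pair to the output file as it goes. Based on the original question, I assumed that both files have the same number of lines; if they do not, you would use zip_longest here instead of zip.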
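
For files of unequal length, the zip_longest variant might look like the sketch below. It keeps the same output format and still streams line by line; merge_uneven is a hypothetical name, fillvalue='' is an assumed choice for the missing lines, and the demonstration uses io.StringIO objects in place of the real files.

```python
import io
from itertools import zip_longest


def merge_uneven(f1, f2, o):
    """Stream two inputs into `o`, padding the shorter one with ''."""
    # zip_longest keeps going until the longer input is exhausted;
    # without fillvalue='' the missing lines would be None and
    # .rstrip() would raise AttributeError.
    for index, (line1, line2) in enumerate(zip_longest(f1, f2, fillvalue=''), 1):
        o.write('{}:,{} ,{}\n'.format(index, line1.rstrip(), line2.rstrip()))


# Demonstration: the second input has one extra line
result = io.StringIO()
merge_uneven(io.StringIO("a\nb\n"), io.StringIO("x\ny\nz\n"), result)
print(result.getvalue())
```

With plain zip, the third row would be dropped silently, because zip stops at the end of the shorter input.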