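I have 2 text files, each with the same number of lines. I want to merge them into a single CSV file with two fields, one per input file, plus an additional line-number field. Is this possible in Python?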
File1:
This is a source first line
This is a source second line
This is a source third line
File2:
This is a transformed line 1
This is a transformed line 2
This is a transformed line 3
Outputfile:
1,This is a source first line ,This is a transformed line 1
2,This is a source second line ,This is a transformed line 2
3,This is a source third line ,This is a transformed line 3
Answer 0 (score: 1)
Given:
$ cat file1
This is a source first line
This is a source second line
This is a source third line
$ cat file2
This is a transformed line 1
This is a transformed line 2
This is a transformed line 3
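You can do this:
from itertools import zip_longest  # named izip_longest on Python 2
fn1, fn2 = 'file1', 'file2'  # the two input files shown above
with open(fn1) as f1, open(fn2) as f2:
    # pair the lines up and number them starting from 1
    print('\n'.join(['{}: {}\t{}'.format(i, l1.strip(), l2.strip()) for i, (l1, l2) in enumerate(zip_longest(f1, f2), 1)]))
Prints: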
1: This is a source first line This is a transformed line 1
2: This is a source second line This is a transformed line 2
3: This is a source third line This is a transformed line 3
Now suppose you have:
$ cat file1
This is a source first line
This is a source second line
This is a source third line
$ cat file2
This is a transformed line 1
This is a transformed line 2
This is a transformed line 3
This is line 4
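To get true columns in the output, use {:40} to set a 40-character column width, and pass a fillvalue to izip_longest (zip_longest in Python 3) so the shorter file is padded with empty strings:
from itertools import zip_longest  # named izip_longest on Python 2
with open(fn1) as f1, open(fn2) as f2:
    # fillvalue="" pads the shorter file with empty strings
    print('\n'.join(['{}: {:40}{:40}'.format(i, l1.strip(), l2.strip()) for i, (l1, l2) in enumerate(zip_longest(f1, f2, fillvalue=""), 1)]))
Prints: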
1: This is a source first line             This is a transformed line 1
2: This is a source second line            This is a transformed line 2
3: This is a source third line             This is a transformed line 3
4:                                         This is line 4
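The padding behaviour can also be tried without any files. The following is a minimal sketch, assuming Python 3.6 or later; `format_columns` and its `width` parameter are names invented here for illustration and are not part of the answer above.

```python
from itertools import zip_longest

def format_columns(left_lines, right_lines, width=40):
    """Pair lines by position and pad each field to a fixed width."""
    rows = []
    # fillvalue="" stands in for lines missing from the shorter input
    pairs = zip_longest(left_lines, right_lines, fillvalue="")
    for number, (left, right) in enumerate(pairs, 1):
        row = f"{number}: {left.strip():{width}}{right.strip():{width}}"
        rows.append(row.rstrip())  # drop the padding after the last field
    return rows

# Hypothetical in-memory inputs; the second one has an extra line
source = ["first\n", "second\n"]
transformed = ["line 1\n", "line 2\n", "line 3\n"]
for row in format_columns(source, transformed, width=10):
    print(row)
```

Because the shorter input is filled with empty strings, the extra line from the longer input still lands in the second column.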
Answer 1 (score: 0)
We can do the following without any imports. Suppose we have two files:
File1:
This is a source first line
This is a source second line
This is a source third line
File2:
This is a transformed line 1
This is a transformed line 2
This is a transformed line 3
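Then...
with open("file1.txt") as f, open("file2.txt") as f2, open("outFile.txt", "w+") as o:
    lines = len(f.readlines())  # count the lines in the first file
    f.seek(0)  # move the read cursor back to the top of the first file
    for i in range(lines):
        o.write("{},{} \t\t,{}\n".format(i+1, f.readline().strip(), f2.readline().strip()))
Explanation: We open two files for reading and one for writing. We count how many lines the first file has, then move its read cursor back to the top of the file. Then, for each line, we write the index, the line from the first file, two tabs and a comma, and the line from the second file to the output file. Our output: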
1,This is a source first line ,This is a transformed line 1
2,This is a source second line ,This is a transformed line 2
3,This is a source third line ,This is a transformed line 3
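Joining fields with a literal comma produces a malformed row as soon as an input line itself contains a comma. As a hedged alternative (not what this answer does), the standard-library `csv` module can do the quoting. The sketch below uses `io.StringIO` objects as stand-ins for real files; `write_merged_csv` is a hypothetical helper name.

```python
import csv
import io

def write_merged_csv(lines1, lines2, out):
    """Write rows of (line number, line from lines1, line from lines2) to out."""
    writer = csv.writer(out)
    for number, (left, right) in enumerate(zip(lines1, lines2), 1):
        # csv.writer quotes any field that contains a comma or a quote
        writer.writerow([number, left.rstrip("\n"), right.rstrip("\n")])

# In-memory stand-ins for the two input files and the output file
file1 = io.StringIO("This is a source first line\nsource, with a comma\n")
file2 = io.StringIO("This is a transformed line 1\nThis is a transformed line 2\n")
result = io.StringIO(newline="")
write_merged_csv(file1, file2, result)
print(result.getvalue())
```

With real files, the output file would be opened with `newline=""`, as the `csv` documentation recommends, so that the writer controls the line endings itself.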
Answer 2 (score: 0)
with open(r'C:/file1.txt') as f1, open(r'C:/file2.txt') as f2, open(r'C:/destination.txt', 'w') as o:
    for index, (line1, line2) in enumerate(zip(f1, f2), 1):
        o.write('{}:,{} ,{}\n'.format(index, line1.rstrip(), line2.rstrip()))
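The advantage of this solution is that it does not read the entire files into memory; it iterates over the input files line by line and writes each merged line to the output file as it goes. Based on the original question, I assumed that both files have the same number of lines. If they do not, zip silently stops at the end of the shorter file, so you would use zip_longest instead of zip here.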
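If a length mismatch should be treated as an error rather than truncated or padded, one option is to detect it while still streaming. This is only a sketch under that assumption; `paired_lines` and the `_MISSING` sentinel are hypothetical names, not part of the answer.

```python
from itertools import zip_longest

_MISSING = object()  # sentinel that can never be equal to a real line

def paired_lines(lines1, lines2):
    """Yield (number, line1, line2); raise ValueError if the lengths differ."""
    pairs = zip_longest(lines1, lines2, fillvalue=_MISSING)
    for number, (left, right) in enumerate(pairs, 1):
        if left is _MISSING or right is _MISSING:
            raise ValueError(f"inputs differ in length at line {number}")
        yield number, left.rstrip(), right.rstrip()
```

The generator accepts open file objects as well as lists, so it keeps the line-by-line memory behaviour described above.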