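I have three CSV files and want to combine them side by side into a single CSV file. How can I do this? For example, the three files contain: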
a b c d
1 2 3 4
5 6 7 8
e f g h
13 14 15 16
17 18 19 20
i j k l
9 10 11 12
21 22 23 24
The desired output is:
a b c d e f g h i j k l
1 2 3 4 13 14 15 16 9 10 11 12
5 6 7 8 17 18 19 20 21 22 23 24
Answer 0 (score: 5)
You can use pandas, a data manipulation library.
import pandas as pd
df1 = pd.read_csv('file1.csv')
df2 = pd.read_csv('file2.csv')
df3 = pd.read_csv('file3.csv')
# axis=1 places the frames side by side instead of stacking them
df_combined = pd.concat([df1, df2, df3], axis=1)
df_combined.to_csv('output.csv', index=False)
You then get the combined CSV file, output.csv.
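The sample data in the question is separated by spaces rather than commas, so `read_csv` would need a matching `sep` argument. A minimal sketch, assuming space-delimited input; the in-memory strings below stand in for the three files and are purely illustrative:

```python
import io
import pandas as pd

# In-memory stand-ins for the three files (illustrative, copied from the question)
sources = [
    "a b c d\n1 2 3 4\n5 6 7 8\n",
    "e f g h\n13 14 15 16\n17 18 19 20\n",
    "i j k l\n9 10 11 12\n21 22 23 24\n",
]

# sep=' ' tells pandas the columns are separated by single spaces
frames = [pd.read_csv(io.StringIO(text), sep=' ') for text in sources]

# axis=1 places the frames side by side instead of stacking them
combined = pd.concat(frames, axis=1)

# With no path given, to_csv returns the CSV text instead of writing a file
result = combined.to_csv(sep=' ', index=False)
print(result)
```

With real files, the `io.StringIO(text)` argument would simply be replaced by the file name.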
Answer 1 (score: 1)
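The others are right that you shouldn't simply ask for code. Still, I found the task interesting enough to spend three minutes solving it:
import csv
allColumns = []
for dataFileName in ['a.csv', 'b.csv', 'c.csv']:
    with open(dataFileName, newline='') as dataFile:
        # Transpose this file's rows into columns
        fileColumns = zip(*csv.reader(dataFile, delimiter=' '))
        allColumns += fileColumns
# Transpose the collected columns back into rows
allRows = zip(*allColumns)
with open('combined.csv', 'w', newline='') as resultFile:
    writer = csv.writer(resultFile, delimiter=' ')
    for row in allRows:
        writer.writerow(row)
Note that this solution reads every file fully into memory, so it may not be suitable for large inputs. It also assumes that all files have the same number of rows; if they do not, zip stops at the shortest column, so rows can be dropped silently.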
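If the files might differ in row count, one option is to pad the shorter ones rather than lose rows. The following is only a sketch under that assumption: it swaps in `itertools.zip_longest`, and the helper name `merge_columns` and its `fill` parameter are made up for illustration, not part of the answer above.

```python
import csv
import io
from itertools import zip_longest

def merge_columns(sources, delimiter=' ', fill=''):
    """Merge delimited tables side by side, padding shorter ones with `fill`."""
    tables = [list(csv.reader(src, delimiter=delimiter)) for src in sources]
    # Width of each table, taken from its header row, so padding keeps columns aligned
    widths = [len(table[0]) if table else 0 for table in tables]
    merged = []
    # zip_longest yields None for tables that have run out of rows
    for rows in zip_longest(*tables):
        out = []
        for row, width in zip(rows, widths):
            out.extend(row if row is not None else [fill] * width)
        merged.append(out)
    return merged

# Illustrative inputs: the second table is one row shorter than the first
first = io.StringIO("a b\n1 2\n3 4\n")
second = io.StringIO("c d\n5 6\n")
for row in merge_columns([first, second]):
    print(row)
```

Like the original, this still holds all rows in memory; it only changes what happens when the row counts disagree.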
Answer 2 (score: 1)
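(A slightly improved version of the pandas code above)
import pandas as pd
files = ['file1.csv', 'file2.csv', 'file3.csv']
# axis=1 is needed to join the files column-wise, as in the desired output
df_combined = pd.concat([pd.read_csv(f) for f in files], axis=1)
df_combined.to_csv('output.csv', index=False)
You then get the combined CSV file, output.csv.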
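If you are on a UNIX-like operating system and only care about merging the files, see "how to merge two files consistently line by line". The paste command does this directly:
paste -d" " file1.txt file2.txt
Good luck.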
Answer 3 (score: 0)
One idea is to use the zip function:
file1 = "a b c d\n1 2 3 4\n5 6 7 8"
file2 = "e f g h\n13 14 15 16\n17 18 19 20"
file3 = "i j k l\n9 10 11 12\n21 22 23 24"
# Pair up the corresponding lines of each file and join them with spaces
merged_file = [i + " " + j + " " + k for i, j, k in zip(file1.split('\n'), file2.split('\n'), file3.split('\n'))]
for i in merged_file:
    print(i)
Answer 4 (score: 0)
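This assumes all files have the same number of rows. The solution also works for large inputs, because only three rows (one from each file) are held in memory at a time.
import csv
with open('foo1.txt') as f1, open('foo2.txt') as f2, \
        open('foo3.txt') as f3, open('out.txt', 'w', newline='') as f_out:
    writer = csv.writer(f_out, delimiter=' ')
    readers = [csv.reader(x, delimiter=' ') for x in (f1, f2, f3)]
    while True:
        try:
            # Take the next row from each reader and flatten them into one row
            writer.writerow([y for w in readers for y in next(w)])
        except StopIteration:
            # Raised as soon as any reader runs out of rows
            break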
A for-loop version of the code above; it has to iterate over one of the files first to get the number of rows:
import csv
with open('foo1.txt') as f1, open('foo2.txt') as f2, \
        open('foo3.txt') as f3, open('out.txt', 'w', newline='') as f_out:
    writer = csv.writer(f_out, delimiter=' ')
    lines = sum(1 for _ in f1)  # Number of lines in f1
    f1.seek(0)  # Move the file pointer back to the start of the file
    readers = [csv.reader(x, delimiter=' ') for x in (f1, f2, f3)]
    for _ in range(lines):
        writer.writerow([y for w in readers for y in next(w)])
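The same streaming idea can be extended to any number of files. The sketch below is one possible way to do it, not the answer's own method: it uses `contextlib.ExitStack` to manage the open files and `zip` to advance the readers in lockstep, and the function name `merge_files` is hypothetical.

```python
import csv
from contextlib import ExitStack

def merge_files(input_paths, output_path, delimiter=' '):
    """Stream rows from every input file and write them side by side."""
    with ExitStack() as stack:
        # ExitStack closes every file registered here when the block exits
        files = [stack.enter_context(open(p, newline='')) for p in input_paths]
        out = stack.enter_context(open(output_path, 'w', newline=''))
        readers = [csv.reader(f, delimiter=delimiter) for f in files]
        writer = csv.writer(out, delimiter=delimiter)
        # zip pulls one row from each reader at a time and stops at the shortest input
        for rows in zip(*readers):
            writer.writerow([cell for row in rows for cell in row])
```

A call such as `merge_files(['foo1.txt', 'foo2.txt', 'foo3.txt'], 'out.txt')` would mirror the example above; only one row per input is held in memory at a time.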
Answer 5 (score: 0)
inputs = 'file1.csv', 'file2.csv', 'file3.csv'
with open('out.csv', 'w') as output:
    for line in zip(*map(open, inputs)):
        output.write('%s\n' % ' '.join(i.strip() for i in line))
Edit:
Here is a more verbose version.
inputs = 'file1.csv', 'file2.csv', 'file3.csv'
# open all input files
inputs = map(open, inputs)
with open('out.csv', 'w') as output:
    # iterate over all the input files at the same time
    for line in zip(*inputs):
        # format the output line from the input lines
        line = ' '.join(i.strip() for i in line)
        output.write('%s\n' % line)
Answer 6 (score: 0)
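In addition to the first answer, which is correct, you can make it more general and handle any number of CSV files in a folder:
import os
import pandas as pd
folder = r"C:\MyFolder"
# Read every .csv file in the folder, in sorted name order
frames = [pd.read_csv(os.path.join(folder, name)) for name in sorted(os.listdir(folder)) if name.endswith('.csv')]
# axis=1 puts the frames side by side, as in the desired output
merged = pd.concat(frames, axis=1)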
Answer 7 (score: 0)
First consider using the pandas module, as in waitingkuo's answer. But I suppose you could also use DictWriter...
import csv
# Column names across all three files
header = list('abcdefghijkl')
# Compile the contents of all three files into a single dictionary, outputdict
outputdict = {key: [] for key in header}
for fname in ['file1.csv', 'file2.csv', 'file3.csv']:
    with open(fname, newline='') as f:
        for line in csv.DictReader(f):
            for k in line:
                outputdict[k].append(line[k])
# Transfer the contents of outputdict into a csv file, one row at a time
with open('final_output.csv', 'w', newline='') as out:
    output = csv.DictWriter(out, fieldnames=header)
    output.writeheader()
    for values in zip(*(outputdict[k] for k in header)):
        output.writerow(dict(zip(header, values)))