I need to merge two CSV files, A.csv and B.csv, which share a common axis. Excerpts from the two files:
9.358,3.0
9.388,2.0
and
8.551,2.0
8.638,2.0
I would like the final file, C.csv, to have the following pattern:
8.551,0.0,2.0
8.638,0.0,2.0
9.358,3.0,0.0
9.388,2.0,0.0
How would you suggest doing this? Should I use a for loop?
Answer 0 (score: 3)
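import numpy as np

# dat1.txt and dat2.txt stand in for A.csv and B.csv
dat1 = np.genfromtxt('dat1.txt', delimiter=',')
dat2 = np.genfromtxt('dat2.txt', delimiter=',')

# pad the first file with a zero 3rd column, the second with a zero 2nd column
dat1 = np.insert(dat1, 2, 0, axis=1)
dat2 = np.insert(dat2, 1, 0, axis=1)

# stack the rows (first file, then second; the result is not sorted) and save
dat = np.vstack((dat1, dat2))
np.savetxt('dat.txt', dat, delimiter=',', fmt='%.3f')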
Answer 1 (score: 3)
Just read from each file and write to the output file, adding the "missing" column as you go:
import csv

with open('c.csv', 'wb') as outcsv:
    # Python 3: use open('c.csv', 'w', newline='') instead
    writer = csv.writer(outcsv)

    # copy a.csv across, adding a 3rd column
    with open('a.csv', 'rb') as incsv:
        # Python 3: use open('a.csv', newline='') instead
        reader = csv.reader(incsv)
        writer.writerows(row + [0.0] for row in reader)

    # copy b.csv across, inserting a 2nd column
    with open('b.csv', 'rb') as incsv:
        # Python 3: use open('b.csv', newline='') instead
        reader = csv.reader(incsv)
        writer.writerows(row[:1] + [0.0] + row[1:] for row in reader)
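The writer.writerows() lines do all the work; each generator expression loops over the rows of its reader and either appends a column or inserts one in the middle.
This works for input CSVs of any size, because only a few read and write buffers are held in memory. Rows are processed iteratively, so neither the input files nor the output file are ever loaded into memory in full.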
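For readers on Python 3, the same streaming idea can be sketched as a small function. This is only an illustration under assumptions: the name merge_streaming and its three path parameters are hypothetical and not part of the answer above, and each input row is assumed to have exactly two fields.

```python
import csv

def merge_streaming(a_path, b_path, out_path):
    """Copy a_path, then b_path, into out_path, padding the missing column with 0.0.

    Hypothetical helper: rows are written in file order (A first, then B),
    not sorted by the shared first column.
    """
    with open(out_path, 'w', newline='') as outcsv:
        writer = csv.writer(outcsv)

        # rows from A: axis,value -> axis,value,0.0
        with open(a_path, newline='') as incsv:
            writer.writerows(row + [0.0] for row in csv.reader(incsv))

        # rows from B: axis,value -> axis,0.0,value
        with open(b_path, newline='') as incsv:
            writer.writerows(row[:1] + [0.0] + row[1:] for row in csv.reader(incsv))
```

Calling merge_streaming('a.csv', 'b.csv', 'c.csv') would write all rows of a.csv followed by all rows of b.csv. Each generator expression is consumed one row at a time, so memory use stays small however large the inputs are.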
Answer 2 (score: 2)
Here is a simple solution using a dictionary; it works for any number of files:
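from __future__ import print_function

def process(*filenames):
    lines = {}  # first-column value -> {file index: second-column value}
    index = 0
    for filename in filenames:
        # the 'U' flag was removed in Python 3.11; use 'r' there
        with open(filename,'rU') as f:
            for line in f:
                v1, v2 = line.rstrip('\n').split(',')
                lines.setdefault(v1,{})[index] = v2
        index += 1
    # keys are strings, so sorted() orders them lexicographically
    for line in sorted(lines):
        print(line, end=',')
        for i in range(index):
            # a file with no entry for this key contributes 0.0
            print(lines[line].get(i,0.0), end=',' if i < index-1 else '\n')

process('A.csv','B.csv')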
This prints:
8.551,0.0,2.0
8.638,0.0,2.0
9.358,3.0,0.0
9.388,2.0,0.0
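One caveat with the dictionary approach: sorted(lines) compares the keys as strings, so a key such as 10.2 would sort before 9.358. A minimal sketch of a variant that sorts numerically and returns the rows rather than printing them is shown below. The name merge_by_key is hypothetical, and the sketch assumes every non-blank line has exactly two comma-separated fields and that the first field parses as a float.

```python
def merge_by_key(*filenames):
    """Merge two-column CSV files on their first column (hypothetical helper).

    Returns a list of rows [key, value_from_file_0, value_from_file_1, ...],
    ordered by the numeric value of the key, with '0.0' for missing entries.
    """
    columns = {}  # key string -> {file index: value string}
    for index, filename in enumerate(filenames):
        with open(filename) as f:
            for line in f:
                line = line.strip()
                if not line:
                    continue  # skip blank lines
                key, value = line.split(',')
                columns.setdefault(key, {})[index] = value

    rows = []
    for key in sorted(columns, key=float):  # numeric, not lexicographic, order
        rows.append([key] + [columns[key].get(i, '0.0') for i in range(len(filenames))])
    return rows
```

The returned rows can then be passed to csv.writer(...).writerows() to produce C.csv. Unlike the streaming answer, this keeps every key in memory, which is the price of sorting across files.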