I have two files. The first is called book1.csv and looks like this:
header1,header2,header3,header4,header5
1,2,3,4,5
1,2,3,4,5
1,2,3,4,5
The second file is called book2.csv and looks like this:
header1,header2,header3,header4,header5
1,2,3,4
1,2,3,4
1,2,3,4
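My goal is to copy the column containing the 5s from book1.csv into the corresponding column of book2.csv.
The problem with my code seems to be that it doesn't append correctly, and it doesn't select the index I want to copy. It also raises an error saying I chose an incorrect index position. The output looks like this: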
header1,header2,header3,header4,header5
1,2,3,4
1,2,3,4
1,2,3,41,2,3,4,5
Here is my code:
import csv
with open('C:/Users/SAM/Desktop/book2.csv','a') as csvout:
    write=csv.writer(csvout, delimiter=',')
    with open('C:/Users/SAM/Desktop/book1.csv','rb') as csvfile1:
        read=csv.reader(csvfile1, delimiter=',')
        header=next(read)
        for row in read:
            row[5]=write.writerow(row)
What do I need to do to make it append correctly?
Thanks for your help!
Answer 0 (score: 4)
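How about something like this? I read in both books, append the last element of each book1 row to the matching book2 row, and store the resulting rows in a list. Then I write the contents of that list to a new .csv file.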
import csv

with open('book1.csv', 'r') as book1:
    with open('book2.csv', 'r') as book2:
        reader1 = csv.reader(book1, delimiter=',')
        reader2 = csv.reader(book2, delimiter=',')
        both = []
        fields = next(reader1)  # read header row
        next(reader2)  # read and ignore header row
        # Pair up the rows of both files and extend each book2 row
        for row1, row2 in zip(reader1, reader2):
            row2.append(row1[-1])
            both.append(row2)
        with open('output.csv', 'w') as output:
            writer = csv.writer(output, delimiter=',')
            writer.writerow(fields)  # write a header row
            writer.writerows(both)
Answer 1 (score: 1)
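Regarding the error about choosing an incorrect index position, I suspect it comes from using row[5] in your code. Indexing in Python starts at 0, so if you have A = [1, 2, 3, 4, 5], then to get the 5 you would write print(A[4]).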
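As a small illustration of the indexing point, here is a sketch using a made-up list named row that stands in for one data row as csv.reader would return it (the name and values are assumptions for the example, not taken from the asker's files):

```python
row = ['1', '2', '3', '4', '5']  # hypothetical five-field row

print(row[4])   # last element via its zero-based index
print(row[-1])  # the same element, counted from the end

try:
    row[5]      # one position past the end of a five-element list
except IndexError as exc:
    error_message = str(exc)
    print(error_message)
```

Using -1 avoids having to know how many fields a row has.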
Assuming both files have the same number of rows and the rows are in the same order, I think you want to do something like this:
import csv

# Open the two input files, which I've renamed to be more descriptive,
# and also an output file that we'll be creating
with open("four_col.csv", mode='r') as four_col, \
        open("five_col.csv", mode='r') as five_col, \
        open("five_output.csv", mode='w', newline='') as outfile:
    four_reader = csv.reader(four_col)
    five_reader = csv.reader(five_col)
    five_writer = csv.writer(outfile)
    _ = next(four_reader)  # Ignore headers for the 4-column file
    headers = next(five_reader)
    five_writer.writerow(headers)
    for four_row, five_row in zip(four_reader, five_reader):
        last_col = five_row[-1]  # Or use five_row[4]
        four_row.append(last_col)
        five_writer.writerow(four_row)
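To try the same idea without touching any files, here is a minimal sketch that runs on the sample data from the question. It assumes io.StringIO objects as stand-ins for the two open files, and append_last_column is a hypothetical helper name introduced only for this example:

```python
import csv
import io

# Sample data copied from the question; StringIO stands in for real files.
book1 = io.StringIO(
    "header1,header2,header3,header4,header5\n"
    "1,2,3,4,5\n1,2,3,4,5\n1,2,3,4,5\n")
book2 = io.StringIO(
    "header1,header2,header3,header4,header5\n"
    "1,2,3,4\n1,2,3,4\n1,2,3,4\n")

def append_last_column(five_col, four_col):
    """Return the four_col rows, each extended with the last field of
    the matching five_col row, preceded by the five_col header row."""
    five_reader = csv.reader(five_col)
    four_reader = csv.reader(four_col)
    headers = next(five_reader)
    next(four_reader)  # skip the header row of the second file
    merged = [headers]
    for five_row, four_row in zip(five_reader, four_reader):
        merged.append(four_row + [five_row[-1]])
    return merged

merged_rows = append_last_column(book1, book2)
for merged_row in merged_rows:
    print(','.join(merged_row))
```

Note that zip stops at the shorter input, so extra rows in either file are silently dropped.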
Answer 2 (score: 1)
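Although some of the code above will work, it doesn't really scale; a vectorized approach is a better fit. Working with numpy or pandas makes tasks like this much easier, so it's worth learning.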
You can download pandas from the pandas website.
# Load pandas
import pandas as pd
# Load each file into a pandas DataFrame (backed by numpy arrays).
# DataFrame.from_csv was removed in pandas 1.0; read_csv replaces it.
data1 = pd.read_csv('csv1.csv', sep=',')
data2 = pd.read_csv('csv2.csv', sep=',')
# Now add 'header5' from data1 to data2
data2['header5'] = data1['header5']
# Save it back to csv, without writing the row index as an extra column
data2.to_csv('output.csv', index=False)
Answer 3 (score: 0)
Why not read the files line by line and use index -1 to find the last item?
endings = []
with open('book1.csv') as book1:
    for line in book1:
        # if not header line:
        # strip the newline so it doesn't end up inside the last field
        endings.append(line.rstrip('\n').split(',')[-1])

linecounter = 0
with open('book2.csv') as book2:
    for line in book2:
        # if not header line:
        print(line.rstrip('\n') + ',' + endings[linecounter])  # or write to file
        linecounter += 1
You should also catch the error that occurs if the two files don't have the same number of lines.
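One way to detect a mismatch is sketched below. It assumes Python 3 (itertools.zip_longest); pair_rows and the _MISSING sentinel are hypothetical names made up for this example, and the rows are passed in as plain lists rather than read from the files:

```python
from itertools import zip_longest

_MISSING = object()  # sentinel that marks where the shorter input ran out

def pair_rows(rows_a, rows_b):
    """Pair up rows from two sequences; raise ValueError if lengths differ."""
    paired = []
    pairs = zip_longest(rows_a, rows_b, fillvalue=_MISSING)
    for index, (a, b) in enumerate(pairs):
        if a is _MISSING or b is _MISSING:
            raise ValueError("row count mismatch at row %d" % index)
        paired.append((a, b))
    return paired

# Equal lengths: every row gets a partner.
matched = pair_rows(['1,2,3,4,5'] * 3, ['1,2,3,4'] * 3)

# Unequal lengths: the mismatch is reported instead of silently ignored.
try:
    pair_rows(['1,2,3,4,5'] * 3, ['1,2,3,4'] * 2)
except ValueError as exc:
    mismatch_message = str(exc)
    print(mismatch_message)
```

On Python 3.10 and later, zip(rows_a, rows_b, strict=True) raises ValueError on unequal lengths and may be enough on its own.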