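I'm new to StackOverflow, so please correct me if this post has any formal mistakes. Thanks! Back to the main topic: I've run into a problem with headers when splitting a large CSV file into smaller ones. The general idea is to split the file on one column and name each smaller file after that column's value, for example: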
Fruit Country Color
apple Poland red
banana Argentina yellow
pineapple Argentina brown
pear Poland green
melon Turkey yellow
plum Poland violet
peach Turkey orange
grenade Argentina violet
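The code should generate 3 separate files (Poland.csv, Turkey.csv, Argentina.csv).
So far I have the code below. It splits the CSV correctly, but it does not add the headers correctly (they get written on every iteration). Any idea how I can handle this?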
import csv

opener = open('file.csv', 'r', encoding='utf-8')
csvreader = csv.reader(opener, delimiter=';')
header = next(csvreader)

def splitter(u):
    for row in u:
        with open(row[1] + '.csv', 'a', encoding='utf-8', newline='') as myfile:
            writer = csv.writer(myfile, delimiter=';', quotechar='|', quoting=csv.QUOTE_MINIMAL)
            writer.writerow(header)
            writer.writerow(row)
        myfile.close()

splitter(csvreader)
Answer 0 (score: 1)
Try something like this (quick and dirty, but it should work):
def splitter(u):
    filenames_already_opened = []  # Just keep a list of the csv's you've already created and therefore have added a header to.
    for row in u:
        filename = row[1] + '.csv'
        with open(filename, 'a', encoding='utf-8', newline='') as myfile:
            writer = csv.writer(myfile, delimiter=';', quotechar='|', quoting=csv.QUOTE_MINIMAL)
            if filename not in filenames_already_opened:  # Don't add a header if it's already got one.
                writer.writerow(header)
                filenames_already_opened.append(filename)
            writer.writerow(row)
        # No explicit close() needed: the with block closes the file.
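
One caveat with tracking filenames in memory: the files are opened in append mode, so running the script a second time would add another header to files left over from the previous run. Below is a minimal sketch that checks the file on disk instead. The function name `split_by_country` and its parameters (`out_dir`, `key_index`, `delimiter`) are hypothetical and not part of the original code, and it assumes the input really is semicolon-delimited as in the question's code.

```python
import csv
import os


def split_by_country(source_path, out_dir=".", key_index=1, delimiter=";"):
    """Append each row to <out_dir>/<key>.csv; write the header only to empty files.

    All names here are illustrative assumptions, not part of the original post.
    """
    with open(source_path, "r", encoding="utf-8", newline="") as source:
        reader = csv.reader(source, delimiter=delimiter)
        header = next(reader)
        for row in reader:
            if not row:
                continue  # skip blank lines
            target = os.path.join(out_dir, row[key_index] + ".csv")
            # A missing or zero-byte file has no header yet.
            needs_header = not os.path.exists(target) or os.path.getsize(target) == 0
            with open(target, "a", encoding="utf-8", newline="") as out:
                writer = csv.writer(out, delimiter=delimiter, quotechar="|",
                                    quoting=csv.QUOTE_MINIMAL)
                if needs_header:
                    writer.writerow(header)
                writer.writerow(row)
```

Called as `split_by_country('file.csv')`, this should leave each output file with a single header line even across repeated runs, although the data rows themselves are still appended again each time.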
Answer 1 (score: 0)
This does the trick:
import csv

opener = open('file.csv', 'r', encoding='utf-8')
csvreader = csv.reader(opener, delimiter=';')
header = next(csvreader)

def splitter(u):
    tableNames = []  # Countries whose file already has a header.
    for row in u:
        with open(row[1] + '.csv', 'a', encoding='utf-8', newline='') as myfile:
            writer = csv.writer(myfile, delimiter=';', quotechar='|', quoting=csv.QUOTE_MINIMAL)
            if row[1] not in tableNames:
                writer.writerow(header)
                tableNames.append(row[1])
            writer.writerow(row)
        # No explicit close() needed: the with block closes the file.

splitter(csvreader)
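
For a large input, reopening the output file for every row can be slow. A possible variation, sketched here under assumptions, keeps one writer per country open for the whole run using `contextlib.ExitStack`. The name `split_once` and its parameters are hypothetical. Note that it opens outputs in `'w'` mode, so existing files are overwritten rather than appended to, and that a very large number of distinct countries could hit the operating system's open-file limit.

```python
import csv
import os
from contextlib import ExitStack


def split_once(source_path, out_dir=".", key_index=1, delimiter=";"):
    """Write one <key>.csv per distinct key, opening each output file only once.

    Returns the sorted list of keys seen. Names are illustrative assumptions.
    """
    writers = {}
    with ExitStack() as stack, \
            open(source_path, "r", encoding="utf-8", newline="") as source:
        reader = csv.reader(source, delimiter=delimiter)
        header = next(reader)
        for row in reader:
            if not row:
                continue  # skip blank lines
            key = row[key_index]
            if key not in writers:
                # First row for this key: create the file and write the header.
                out = stack.enter_context(
                    open(os.path.join(out_dir, key + ".csv"), "w",
                         encoding="utf-8", newline=""))
                writers[key] = csv.writer(out, delimiter=delimiter, quotechar="|",
                                          quoting=csv.QUOTE_MINIMAL)
                writers[key].writerow(header)
            writers[key].writerow(row)
    # Leaving the with block closes every output file registered on the stack.
    return sorted(writers)
```

With the sample data from the question, `split_once('file.csv')` would be expected to create Argentina.csv, Poland.csv and Turkey.csv, each starting with the header row.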