Header problem when splitting a CSV file [Python 3]

Time: 2018-11-12 12:29:18

Tags: python csv split header

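I'm new to StackOverflow, so please correct me if this post has any formal mistakes. Thanks! Back to the main topic: I have a problem with headers when splitting a large CSV file into smaller ones. The general idea is to split the file based on one column and to name each smaller file after a value in that column, for example: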

Fruit       Country       Color
apple       Poland        red
banana      Argentina     yellow
pineapple   Argentina     brown
pear        Poland        green
melon       Turkey        yellow
plum        Poland        violet
peach       Turkey        orange
grenade     Argentina     violet

The code should generate 3 different files (Poland.csv, Turkey.csv, Argentina.csv).

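So far I have written the code below. It splits the CSV correctly, but it does not add the header correctly (the header is written on every iteration). Do you have any idea how I can handle this?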

import csv

opener = open('file.csv', 'r', encoding='utf-8')  
csvreader = csv.reader(opener, delimiter=';')        
header = next(csvreader)

def splitter(u):                                   
    for row in u:
        with open(row[1] + '.csv', 'a', encoding='utf-8', newline='') as myfile:
          writer = csv.writer(myfile, delimiter=';', quotechar='|', quoting=csv.QUOTE_MINIMAL)
          writer.writerow(header)
          writer.writerow(row)

    myfile.close()

splitter(csvreader)

2 answers:

Answer 0: (score: 1)

Try something like this (quick and dirty, but it should work):

def splitter(u):
    # Keep a list of the CSV files already created, i.e. those that already have a header.
    filenames_already_opened = []
    for row in u:
        filename = row[1] + '.csv'
        with open(filename, 'a', encoding='utf-8', newline='') as myfile:
            writer = csv.writer(myfile, delimiter=';', quotechar='|', quoting=csv.QUOTE_MINIMAL)
            if filename not in filenames_already_opened:
                # First row for this file: write the header exactly once.
                writer.writerow(header)
                filenames_already_opened.append(filename)
            writer.writerow(row)
    # No explicit myfile.close() is needed: the with block closes the file.
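
The list in the answer above only remembers files within a single run, and the files are opened in append mode, so running the script a second time would add the header again. A possible variant, shown here only as a sketch, decides by looking at the file on disk instead. The helper name `append_row` and the `output_dir` and `column_index` parameters are assumptions made for illustration and are not part of the original answer.

```python
import csv
import os


def append_row(row, header, output_dir='.', column_index=1, delimiter=';'):
    """Append row to '<value>.csv'; write header first only if the file is new or empty."""
    filename = os.path.join(output_dir, row[column_index] + '.csv')
    # Check the file on disk before opening it in append mode.
    needs_header = not os.path.exists(filename) or os.path.getsize(filename) == 0
    with open(filename, 'a', encoding='utf-8', newline='') as myfile:
        writer = csv.writer(myfile, delimiter=delimiter, quotechar='|',
                            quoting=csv.QUOTE_MINIMAL)
        if needs_header:
            writer.writerow(header)
        writer.writerow(row)
    return filename
```

Inside the loop this would be called as `append_row(row, header)`. Note that a second run still appends the data rows again, so old output files should be deleted first if that is not wanted.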

Answer 1: (score: 0)

This does the trick:

import csv

opener = open('file.csv', 'r', encoding='utf-8')
csvreader = csv.reader(opener, delimiter=';')
header = next(csvreader)

def splitter(u):
    # Column values whose output file has already received a header.
    tableNames = []
    for row in u:
        with open(row[1] + '.csv', 'a', encoding='utf-8', newline='') as myfile:
            writer = csv.writer(myfile, delimiter=';', quotechar='|', quoting=csv.QUOTE_MINIMAL)
            if row[1] not in tableNames:
                writer.writerow(header)
                tableNames.append(row[1])
            writer.writerow(row)
    # The with block closes each output file, so no myfile.close() is needed.

splitter(csvreader)
opener.close()  # Close the input file once all rows have been written.
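
Both answers reopen the output file for every row. As an alternative sketch (not taken from either answer), each output file can be opened once in write mode and its writer kept in a dictionary, so the header is written exactly when the file is created. The function name `split_csv_by_column` and its parameters are hypothetical; `contextlib.ExitStack` is used so that every opened file is closed at the end.

```python
import csv
import os
from contextlib import ExitStack


def split_csv_by_column(source_path, column_index=1, delimiter=';', output_dir='.'):
    """Write one CSV per distinct value in column_index; return the sorted file names."""
    writers = {}  # column value -> csv.writer for that value's output file
    with ExitStack() as stack:
        source = stack.enter_context(
            open(source_path, 'r', encoding='utf-8', newline=''))
        reader = csv.reader(source, delimiter=delimiter)
        header = next(reader)
        for row in reader:
            if not row:
                continue  # skip blank lines
            key = row[column_index]
            if key not in writers:
                # First row for this value: create the file and write the header once.
                out_path = os.path.join(output_dir, key + '.csv')
                out = stack.enter_context(
                    open(out_path, 'w', encoding='utf-8', newline=''))
                writers[key] = csv.writer(out, delimiter=delimiter)
                writers[key].writerow(header)
            writers[key].writerow(row)
    return sorted(key + '.csv' for key in writers)
```

Calling `split_csv_by_column('file.csv')` on the example data would be expected to create Poland.csv, Turkey.csv and Argentina.csv in the current directory. Because the files are opened with 'w', a rerun overwrites earlier output instead of appending to it. This keeps one file handle open per distinct value, which is fine for a handful of countries but may hit the operating system's open-file limit if the column has very many distinct values.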