Question

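I wrote the following code to take a large csv file and split it into multiple csv files based on a particular word in a column. The original csv file has some string fields, and they are surrounded by quotes.

For example: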

Field1,Field2,Field3,Field4
1,2,"red",3
1,4,"red",4
3,4,"blue",4

and so on.

My code splits the file into separate csv files based on Field4.

My output looks like this:

3.csv
Field1,Field2,Field3,Field4
1,2,red,3

4.csv
Field1,Field2,Field3,Field4
1,4,red,4
3,4,blue,4

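I want my output to keep the quotes around the strings in Field3. The files are fed into a piece of software that only works if strings have quotes around them, which is quite annoying.

My current code is as follows: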

import csv

#Creates empty set - this will be used to store the values that have already been used
newfilelist = set()

#Opens the large csv file in "read" mode
with open('File.csv', 'r') as csvfile:

    #Read the first row of the large file and store the whole row as a string (headerstring)
    read_rows = csv.reader(csvfile)
    headerrow = next(read_rows)
    headerstring=','.join(headerrow) 
    for row in read_rows:

        #Store the whole row as a string (rowstring)
        rowstring=','.join(row)

        #Takes Field 4
        newfilename = (row[3])


        #This basically makes sure it is not looking at the header row.
        if newfilename != "field4":


            #If the newfilename is not in the newfilelist set, add it to the set and create new csv file with header row.
            if newfilename not in newfilelist:    
                newfilelist.add(newfilename)
                with open('//output/' +str(newfilename)+'.csv','a') as f:
                    f.write(headerstring)
                    f.write("\n")

            #Append the current row to the csv file (this also covers the row that created the file).
            with open('//output/' +str(newfilename)+'.csv','a') as f:
                f.write(rowstring)
                f.write("\n")

Can someone tell me how to get the quotes around the strings? Unfortunately, the software that uses my files requires them in this format!

Answer 1

Pass quoting=csv.QUOTE_NONNUMERIC to csv.writer().
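One caveat with this suggestion: csv.reader with its default settings returns every field as a string, so a writer using QUOTE_NONNUMERIC would put quotes around the numbers as well. The following is a minimal sketch of one way around that, assuming the numeric fields are plain integers. The helper `to_number` is a hypothetical name introduced here, and the data is read from and written to in-memory buffers only to keep the example self-contained.

```python
import csv
import io

def to_number(field):
    """Return int(field) if the text parses as an integer, else the original string."""
    try:
        return int(field)
    except ValueError:
        return field

# Sample data from the question, held in memory instead of File.csv.
source = 'Field1,Field2,Field3,Field4\n1,2,"red",3\n1,4,"red",4\n3,4,"blue",4\n'

reader = csv.reader(io.StringIO(source))
header = next(reader)

out = io.StringIO()
# The header goes through a default-quoting writer so its names stay unquoted.
csv.writer(out, lineterminator="\n").writerow(header)

# Data rows: QUOTE_NONNUMERIC quotes strings and leaves int values bare.
writer = csv.writer(out, quoting=csv.QUOTE_NONNUMERIC, lineterminator="\n")
for row in reader:
    writer.writerow([to_number(field) for field in row])

result = out.getvalue()
print(result)
```

If the real files contain floats or other numeric formats, the conversion helper would need to be adapted accordingly.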

Answer 2

csv.writer may be overkill for what you are trying to do. If you want the whole line to stay unchanged, just write the whole line out.

#Creates empty dictionary - maps each Field4 value already seen to its open output file
newfilelist = {}

#Opens the large csv file in "read" mode
with open('File.csv', 'r') as csvfile:

    #Read the first row of the large file and store the whole row as a string (headerstring)
    headerstring = csvfile.readline()
    for row in csvfile:

        #Takes Field 4 (strip the line ending first, because Field 4 is the last field on the line)
        #Note: a plain split(',') assumes no quoted field contains a comma
        newfilename = row.rstrip('\r\n').split(',')[3].strip('"')

        #If the newfilename is not in the newfilelist dictionary, create a new csv file with the header row.
        if newfilename not in newfilelist:
            newfilelist[newfilename] = open('//output/' +str(newfilename)+'.csv','w')  #open a file and store the file reference in a dictionary
            newfilelist[newfilename].write(headerstring)

        newfilelist[newfilename].write(row)  # Write out the row unchanged, quotes included

#Close all open files
for k in newfilelist.keys():
    newfilelist[k].close()
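As a variation on the raw-line approach, the output files can be registered with contextlib.ExitStack so they are closed even if an error occurs partway through the loop. This is only a sketch: the function name `split_by_column`, its parameters, and the temporary directory used for the demonstration are assumptions made for illustration, not part of the original answer. It still relies on a plain comma split, so it assumes no quoted field contains a comma.

```python
import os
import tempfile
from contextlib import ExitStack

def split_by_column(src_path, out_dir, column=3):
    """Copy each raw line of src_path into out_dir/<value>.csv, keyed on one column."""
    outputs = {}
    with ExitStack() as stack, open(src_path, newline="") as src:
        header = src.readline()
        for line in src:
            # Naive split: assumes no field has a comma inside its quotes.
            key = line.rstrip("\r\n").split(",")[column].strip('"')
            if key not in outputs:
                path = os.path.join(out_dir, key + ".csv")
                # ExitStack closes every registered file when the block exits.
                outputs[key] = stack.enter_context(open(path, "w", newline=""))
                outputs[key].write(header)
            outputs[key].write(line)  # the line is written unchanged, quotes included
    return sorted(outputs)

# Demonstration with the sample data from the question, in a temporary directory.
with tempfile.TemporaryDirectory() as tmp:
    src_path = os.path.join(tmp, "File.csv")
    with open(src_path, "w", newline="") as f:
        f.write('Field1,Field2,Field3,Field4\n1,2,"red",3\n1,4,"red",4\n3,4,"blue",4\n')
    keys = split_by_column(src_path, tmp)
    with open(os.path.join(tmp, "4.csv"), newline="") as f:
        four = f.read()

print(keys)
print(four)
```

Because the lines are never parsed and re-serialized, whatever quoting the source file uses is carried over to the output files as-is.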

Writing csv with quotes around strings (Python)
