Split CSV column contents into multiple columns

Time: 2016-12-14 22:12:54

Tags: python csv split

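I'm trying to tie another piece into my existing Python program. I'm new to Python and, even with all the help available, I can't figure it out. I'll list my existing program below; I just want to add another piece to it that performs one more task.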

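The current program opens "initial.csv" and looks for any of the keywords in the first column. If a row matches one, the program writes that row to "listname_rejects.csv"; if not, it writes the row to "listname.csv". It sounds backwards, but it is correct for what I'm doing. I've used it a thousand times.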

Now, what I'd like to add is the ability to look at column 2 (the full address) and split it into separate columns. For example, this -

Name,Address,Phonenumber,ID
John,"123 Any Street, New York, NY 00010",999-999-9999,321654

becomes this -

Name,Street,City,State,Zipcode,Phonenumber,ID
John,123 Any Street, New York, NY, 00010,999-999-9999,321654

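Basically, I need to be able to split the second column into different columns. I don't need the entire address to stay in column 2; it should be split across columns 2, 3, 4 and 5.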

I found something close to this on Stack Overflow, but again, I'm new to Python and couldn't figure out how to work it into my current code.

key_words = [ 
'Suzy', 
'Billy', 
'Cody',
 ]

listname = raw_input ("Enter List Name:")
listname_accept = (listname) + '.csv'
listname_rejects = (listname) + '_rejected.csv'

with open('initial.csv', 'r') as oldfile, open(listname_accept, 'w') as cleaned, open(listname_rejects, 'w') as matched:
    for line in oldfile:
        if not any(key_word in line.split(",", 1)[0] for key_word in key_words):
            cleaned.write(line)      
        else:
            matched.write(line)

3 answers:

Answer 0: (score: 1)

Let me know if this works. I may have mixed up your output CSV names, but you can adjust them to fit your logic:

import csv

key_words = [ 
'Suzy', 
'Billy', 
'Cody',
 ]

listname = raw_input ("Enter List Name:")
listname_accept = (listname) + '.csv'
listname_rejects = (listname) + '_rejected.csv'

with open('initial.csv') as oldfile, open(listname_accept,'w') as cleaned, open(listname_rejects,'w') as matched:
    accept_writer=csv.writer(cleaned) # create one csv writer object
    reject_writer=csv.writer(matched) # create second csv writer object
    initial_reader=csv.reader(oldfile)
    for c,row in enumerate(initial_reader): # read through input csv
        if c==0:                            # first row is the header
            header=row[:]
            del header[1]       # delete 'address'
            header[1:1]=['Street','City','State','Zipcode'] # insert these column names
            accept_writer.writerow(header)                  # write column names to csv
            reject_writer.writerow(header)                  # write column names to csv
        else:                                               # for all other input rows, except the first
            address_list=[i.strip() for i in row[1].split(',')] # split the address by comma
            all_address=address_list[:-1]+address_list[-1].split() # split the state and zip by space
            del row[1]                                             # delete original string address from row
            row[1:1]=all_address                                   # insert new address
            if row[0] not in key_words:                            # test if name in key_words
                accept_writer.writerow(row)
            else:
                reject_writer.writerow(row)

I've added comments to help you follow what is going on.
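The address-splitting step above assumes every address looks like "Street, City, ST ZIP". As a minimal sketch of that step on its own, the hypothetical helper `split_address` below (the name and the `None` return value are assumptions, not part of the answer) applies the same comma-then-space split but reports rows that do not fit the pattern instead of shifting columns. It is written for Python 3, whereas the code above uses Python 2's `raw_input`.

```python
import csv
import io


def split_address(address):
    """Split 'Street, City, ST ZIP' into [street, city, state, zipcode].

    Hypothetical helper: assumes exactly two commas, and a single space-
    separated state and ZIP in the last part. Returns None otherwise.
    """
    parts = [p.strip() for p in address.split(",")]
    if len(parts) != 3:
        return None
    state_zip = parts[2].split()
    if len(state_zip) != 2:
        return None
    return parts[:2] + state_zip


# csv.reader keeps the quoted address together as one field,
# even though it contains commas.
sample = io.StringIO(
    'Name,Address,Phonenumber,ID\n'
    'John,"123 Any Street, New York, NY 00010",999-999-9999,321654\n'
)
reader = csv.reader(sample)
header = next(reader)
row = next(reader)

print(row[1])
print(split_address(row[1]))
```

In the loop above, a row for which the helper returns `None` could be written to a separate file for manual review rather than to either output.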

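Answer 1: (score: 0)

I hope the following code helps. I used my own CSV file names, but you can change them. The main idea is to create a CSV with the columns you need and split the address string accordingly.

Regards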

import csv

to_validate = ["name1", "name2"]

"""
file_to_read.csv has
Name,Address,Phonenumber,ID
John,"123 Any Street, New York, NY 00010",999-999-9999,321654
"""

fieldnames = ["Name", "Street", "City", "State", "Zipcode", "Phonenumber", "ID"]

# Open both files once. Reopening the output with 'w+' for every row would
# truncate it each time and leave only the last row.
with open("file_to_read.csv", 'r') as infile, open("example_file.csv", 'w') as csvfile:
    file_to_read = csv.DictReader(infile, delimiter=',', quotechar='"')
    writer = csv.DictWriter(csvfile, fieldnames=fieldnames, delimiter=",")
    writer.writeheader()

    for row in file_to_read:
        if row["Name"] in to_validate:
            # do some stuff
            pass
        else:
            # "Street, City, ST ZIP" -> three comma-separated parts
            street, city, state_zip = [part.strip() for part in row["Address"].split(",")]
            state, zipcode = state_zip.split()
            writer.writerow({
                "Name": row["Name"],
                "Street": street,
                "City": city,
                "State": state,
                "Zipcode": zipcode,
                "Phonenumber": row["Phonenumber"],
                "ID": row["ID"]
            })

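Answer 2: (score: 0)

Although this question has already been answered, I think you should broaden your knowledge with the pandas module. I only implemented the part that splits the address line. If you like, I can show you the rest as well. pandas is not always straightforward, but once you get used to it, it is the easiest way to solve many CSV-processing problems (not to mention its other great features, such as working with databases, and so on). The code is on my GitHub page. Take a look!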
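The linked code is not reproduced here, so the following is only a sketch of how the address split might look in pandas, not the answerer's implementation. It assumes the "Street, City, ST ZIP" layout from the question; the regular expression and the variable names are illustrative. Reading with `dtype=str` keeps the leading zeros in ZIP codes such as 00010.

```python
import io

import pandas as pd

# Sample data from the question, held in memory instead of initial.csv.
raw = io.StringIO(
    'Name,Address,Phonenumber,ID\n'
    'John,"123 Any Street, New York, NY 00010",999-999-9999,321654\n'
)
df = pd.read_csv(raw, dtype=str)  # dtype=str preserves "00010"

# Named groups become the column names of the extracted DataFrame.
pattern = (
    r'^\s*(?P<Street>[^,]+?)\s*,'
    r'\s*(?P<City>[^,]+?)\s*,'
    r'\s*(?P<State>\S+)\s+(?P<Zipcode>\S+)\s*$'
)
parts = df["Address"].str.extract(pattern)

# Put the four new columns where the Address column used to be.
result = pd.concat([df[["Name"]], parts, df[["Phonenumber", "ID"]]], axis=1)
print(result.to_csv(index=False))
```

Rows whose address does not match the pattern get missing values in the four new columns, which makes them easy to filter out afterwards.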