Creating a list from a CSV file using Python

Time: 2010-11-08 17:41:49

Tags: python csv

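So far I have a Python script that does what I want: it opens a user-defined CSV, splits the file into different predefined "pools", and writes each pool to its own file with the correct headers. My only problem is that I want to change the pools list from static to variable, and I am having some issues doing so.

The pools list is in the CSV itself, in the 2nd column, and values can be duplicated. Right now, with this setup, the system can create "dead" files that contain no data apart from the header.

A few notes: yes, I know the spelling is not perfect, and yes, I know some of my comments are a bit off.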

import csv
#used to read ane make CSV's
import time
#used to timestamp files
import tkFileDialog
#used to allow user input
filename = tkFileDialog.askopenfilename(defaultextension = ".csv")
#Only user imput to locate the file it self
csvfile = [] 
#Declairs csvfile as a empty list
pools = ["1","2","4","6","9","A","B","D","E","F","I","K","L","M","N","O","P","W","Y"]
#declairs hte pools list for known pools
for i in pools:
    #uses the Pools List and makes a large number of variables
    exec("pool"+i+"=[]")
reader = csv.reader(open(filename, "rb"), delimiter = ',')
 #Opens the CSV for the reader to use
for row in reader: 
    csvfile.append(row) 
    #dumps the CSV into a varilable
    headers=[]
    #declairs headers as empty list
    headers.append(csvfile[0])
    #appends the first row to the header variable
for row in csvfile: 
    pool = str(row[1]).capitalize()
    #Checks to make sure all pools in the main data are capitalized
    if pool in pools:
        exec("pool"+pool+".append(row)")
        #finds the pool list and appends the new item into the variable list
    else: 
        pass
for i in pools:
    exec("wp=csv.writer(open('pool "+i+" "+time.strftime("%Y%m%d")+".csv','wb'),)")
    wp.writerows(headers)
    #Adds the header row
    exec("wp.writerows(pool"+i+")")
    #Created the CSV with a timestamp useing the pool list
    #-----Needs Headers writen in on each file -----

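EDIT: since there were some questions

Reason for the code: I have daily reports being generated, and part of the work is a manual process of splitting these reports into the different pool reports. I am creating this script so that I can quickly select the file and quickly split it into one file per pool.

The main CSV can have 50 to 100 items. It has 25 columns in total, and the pool is always listed in the second column. Not all pools will be listed every time, and a pool can show up more than once.

So far I have tried a few different loops; one is as follows: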

pools = []
for file in file(open(filename,'rb')):
    line = line.split()
    x = line[1]
    pools.append(x)

But I get a List error.

An example of the CSV:

Ticket  Pool  Date       Column 4  Column 5
1       A     11/8/2010  etc       etc
2       A     11/8/2010  etc       etc
3       1     11/8/2010  etc       etc
4       6     11/8/2010  etc       etc
5       B     11/8/2010  etc       etc
6       A     11/8/2010  etc       etc
7       1     11/8/2010  etc       etc
8       2     11/8/2010  etc       etc
9       2     11/8/2010  etc       etc
10      1     11/8/2010  etc       etc

3 answers:

Answer 0: (score: 4)

If I understand correctly what you are trying to achieve here, this could work as a solution:

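import csv
import time
import tkFileDialog  # Python 2 module name

# Ask the user which CSV file to split
filename = tkFileDialog.askopenfilename(defaultextension = ".csv")

reader = csv.reader(open(filename, "rb"), delimiter = ',')

# The first row holds the column headers
headers = reader.next()

# Maps each pool name found in column 2 to the list of its rows
pool_dict = {}

for row in reader:
    if not pool_dict.has_key(row[1]):
        pool_dict[row[1]] = []
    pool_dict[row[1]].append(row)

# Write one timestamped file per pool that actually appears in the data
for key, val in pool_dict.items():
    wp = csv.writer(open('pool ' +key+ ' '+time.strftime("%Y%m%d")+'.csv','wb'),)
    wp.writerow(headers)
    wp.writerows(val)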

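EDIT: I misunderstood the headers and pools part at first, and have tried to correct that.

EDIT 2: Corrected the code so that the pools are created dynamically from the values found in the file.

If this is not it, please provide more details about your problem...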
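
For readers on Python 3, the same grouping idea can be sketched as follows. This is only an illustration under stated assumptions: the function name `split_by_pool`, its parameters, and the use of `collections.defaultdict` are not from the answer above, and the file dialog is left out so the sketch stays self-contained.

```python
import csv
import os
import time
from collections import defaultdict


def split_by_pool(filename, out_dir=".", pool_column=1):
    """Split a CSV into one file per pool value found in pool_column.

    Hypothetical helper: the name and parameters are assumptions made
    for this sketch, not part of the original script.
    """
    rows_by_pool = defaultdict(list)

    # newline="" is what the csv module documentation recommends on Python 3
    with open(filename, newline="") as src:
        reader = csv.reader(src)
        headers = next(reader)  # first row holds the column names
        for row in reader:
            if len(row) <= pool_column:
                continue  # skip blank or short rows
            # Normalise case so "a" and "A" land in the same pool
            rows_by_pool[row[pool_column].strip().upper()].append(row)

    stamp = time.strftime("%Y%m%d")
    written = []
    for pool, rows in sorted(rows_by_pool.items()):
        path = os.path.join(out_dir, "pool %s %s.csv" % (pool, stamp))
        with open(path, "w", newline="") as dst:
            writer = csv.writer(dst)
            writer.writerow(headers)
            writer.writerows(rows)
        written.append(path)
    return written
```

Because only pool values that actually occur in the data become dictionary keys, a sketch like this should not produce header-only files.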

Answer 1: (score: 2)

Could you describe your CSV file a bit?

One suggestion is to change

for i in pools:
#uses the Pools List and makes a large number of variables
    exec("pool"+i+"=[]")

to the more Pythonic form:

pool_dict = {}
for i in pools:
    pool_dict[i] = []

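In general, using eval/exec is bad practice, and it is much easier to loop over a dictionary instead. For example, you can access the lists through pool_dict['A'] and pool_dict['1'], or loop over all of them like this: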

for key,val in pool_dict.items():
   val.append(...)
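
To make the dictionary suggestion concrete, here is a small sketch that fills `pool_dict` from a few in-memory rows instead of `exec`-created variables. The sample rows, the shortened `pools` list, and the `non_empty` name are made up for illustration.

```python
# Shortened, illustrative pool list (the question's list is longer)
pools = ["1", "2", "A", "B"]

# One empty list per known pool, keyed by pool name
pool_dict = {}
for i in pools:
    pool_dict[i] = []

# Made-up sample rows: ticket, pool, date
rows = [
    ["1", "A", "11/8/2010"],
    ["2", "a", "11/8/2010"],
    ["3", "1", "11/8/2010"],
    ["4", "Z", "11/8/2010"],  # unknown pool, ignored below
]

for row in rows:
    pool = row[1].capitalize()
    if pool in pool_dict:
        pool_dict[pool].append(row)

# Keep only pools that received rows, which avoids header-only files
non_empty = {}
for key, val in pool_dict.items():
    if val:
        non_empty[key] = val

print(sorted(non_empty))  # prints ['1', 'A']
```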

EDIT: Now that I have seen the CSV data, try something like this:

for row in reader:
    if row[0] == 'Ticket':
        header = row
    else:
        cur_pool = row[1].capitalize()
        if not pool_dict.has_key(cur_pool):
            pool_dict[cur_pool] = [row,]
        else:
            pool_dict[cur_pool].append(row)

for p, pool_vals in pool_dict.items():
    with open('pool'+p+'_'+time.strftime("%Y%m%d")+'.csv','wb') as fp:
        wp = csv.writer(fp)
        wp.writerow(header)
        wp.writerows(pool_vals)

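Answer 2: (score: 0)

Your code would be much easier to read without all those execs. It looks like you are using them to declare all of your variables, when in fact you could declare a list of pools like this: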

pool_lists = [[] for p in pools]

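This is my best guess at what you mean by "change the pools list from static to variable". When you do this, you get a list of lists with the same length as pools.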
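
A short sketch of how such a list of lists might be used. The sample `pools` and rows are illustrative, and pairing names with lists through `zip` and `pools.index` is one possible approach rather than something stated in the answer.

```python
pools = ["1", "2", "A"]  # illustrative subset of the question's pools

# One independent empty list per pool
pool_lists = [[] for p in pools]

# Made-up rows: ticket, pool
rows = [["1", "A"], ["2", "1"], ["3", "A"]]

for row in rows:
    if row[1] in pools:
        # The position in pools selects the matching list
        pool_lists[pools.index(row[1])].append(row)

# zip pairs each pool name with its list of rows
for name, pool_rows in zip(pools, pool_lists):
    print(name, len(pool_rows))

# Pitfall: [[]] * len(pools) would repeat ONE list object three times
shared = [[]] * len(pools)
shared[0].append("x")
print(shared)  # prints [['x'], ['x'], ['x']]
```

The list comprehension matters here: it builds a separate list for each pool, whereas multiplying `[[]]` would make every slot refer to the same list.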