From csv

Time: 2015-06-11 20:43:48

Tags: python csv

I am trying to add up specific values from a CSV file when the user is the same. I can't explain it well, so I'll try to show you.

=====================
|E-mail  | M-count |
|a@a.com | 12      |
|b@a.com | 8       |
|a@a.com | 13      |
|c@a.com | 2       |
=====================

It should then add up everything that belongs to a given user:

=====================
|E-mail  | Total   |
|a@a.com | 25      |
|b@a.com | 8       |
|c@a.com | 2       |
=====================

I split the CSV and put the values I need into a set, but I can't figure out a way to sum them. Any ideas?

Edit:

This is what my CSV looks like:

p_number,duration,clnup#
5436715524,00:02:26,2
6447654246,00:17:18,5
5996312484,00:01:19,1
5436715524,00:10:12,6

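I want the total duration and the total clnup# for each unique p_number. Sorry for the confusion; the table above was only an example.

2 answers:

Answer 0: (score: 1)

You can use an OrderedDict, storing the names as keys and updating the count as you go: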

import csv
from collections import OrderedDict

od = OrderedDict()

with open("test.txt") as f:
    r = csv.reader(f)
    head = next(r)  # skip the header row
    for name,val in r:
        # start a new name at 0, then add this row's count
        od.setdefault(name, 0)
        od[name]  += int(val)

print(od)
OrderedDict([('a@a.com', 25), ('b@a.com', 8), ('c@a.com', 2)])

To update the original file, you can write to a NamedTemporaryFile, write the rows from od.items() with writerows, and then use shutil.move to replace the original file:

import csv
from collections import OrderedDict
from shutil import move
from tempfile import NamedTemporaryFile
od = OrderedDict()

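# mode "w" and newline="" give csv.writer the text-mode file it needs on Python 3
with open("test.txt", newline="") as f, NamedTemporaryFile("w", dir=".", delete=False, newline="") as out:
    r = csv.reader(f)
    wr = csv.writer(out)
    head = next(r)
    wr.writerow(head)
    for name,val in r:
        od.setdefault(name, 0)
        od[name]  += int(val)
    # items() replaces the Python 2-only iteritems()
    wr.writerows(od.items())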


move(out.name,"test.txt")

Output:

E-mail,M-count
a@a.com,25
b@a.com,8
c@a.com,2

If you don't care about the order, use a defaultdict:

import csv

from collections import defaultdict
from shutil import move
from tempfile import NamedTemporaryFile
od = defaultdict(int)

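# mode "w" and newline="" give csv.writer the text-mode file it needs on Python 3
with open("test.txt", newline="") as f, NamedTemporaryFile("w", dir=".", delete=False, newline="") as out:
    r = csv.reader(f)
    wr = csv.writer(out)
    head = next(r)
    wr.writerow(head)
    for name,val in r:
        od[name]  += int(val)  # missing names start at int() == 0
    # items() replaces the Python 2-only iteritems()
    wr.writerows(od.items())

# replace the original file, as in the OrderedDict version
move(out.name,"test.txt")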
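
For the three-column layout in the question's edit (p_number,duration,clnup#), the same accumulate-per-key idea can be extended to keep two totals per key. The following is only a minimal sketch and not part of the original answer: it assumes every duration is written as HH:MM:SS, and the helper names parse_duration and total_by_number, as well as the in-memory sample, are made up for illustration.

```python
import csv
import io
from collections import defaultdict
from datetime import timedelta


def parse_duration(text):
    """Convert an HH:MM:SS string into a timedelta (format is an assumption)."""
    hours, minutes, seconds = (int(part) for part in text.split(":"))
    return timedelta(hours=hours, minutes=minutes, seconds=seconds)


def total_by_number(lines):
    """Return {p_number: [total duration, total clnup#]} from an iterable of CSV lines."""
    # Each new key starts with a zero duration and a zero count.
    totals = defaultdict(lambda: [timedelta(), 0])
    reader = csv.reader(lines)
    next(reader)  # skip the header row
    for p_number, duration, clnup in reader:
        entry = totals[p_number]
        entry[0] += parse_duration(duration)
        entry[1] += int(clnup)
    return dict(totals)


# Sample rows copied from the question's edit; an open file object works the same way.
sample = io.StringIO(
    "p_number,duration,clnup#\n"
    "5436715524,00:02:26,2\n"
    "6447654246,00:17:18,5\n"
    "5996312484,00:01:19,1\n"
    "5436715524,00:10:12,6\n"
)

for p_number, (duration, clnup) in total_by_number(sample).items():
    print(p_number, duration, clnup)
```

With the sample rows, 5436715524 ends up with a total duration of 0:12:38 and a total clnup# of 8, while the other two numbers keep their single-row values. Using timedelta avoids hand-rolled carry logic between seconds, minutes and hours.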

Answer 1: (score: 0)

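import csv
from pprint import pprint

d = {}
# newline='' is what the csv module expects for text-mode files on Python 3
with open('sample.csv', newline='') as ifile:
    csv_reader = csv.reader(ifile)
    next(csv_reader)  # skip the header row so int() never sees "M-count"
    for row in csv_reader:
        # a missing key counts as 0, then this row's count is added
        d[row[0]] = d.get(row[0], 0) + int(row[1])

pprint(d)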