I'm trying to use Python to parse a comma-delimited file with a layout like this:
AccountNumber,Invoice_Number,Gross_Amt,Adjustments,TotalDue
"234","56787","19.37",,"19.37"
"234","56788","204.76","-10.00","194.76"
"234","56789","139.77",,"139.77"
"567","12543","44.89","30.00","74.89"
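What I'd like to do is total the gross amount, adjustments, and total due, and then append those totals to the end of each line (or only to the last line of each document).
My question is: how can I create variables that keep adding up a field only while the account number stays the same? In plain English, it would go something like this:
Check the account number: add up Gross_Amt on each line while the account number equals the account number on the previous line. When the account number changes, append the sum of the total_amt field, as a new field named Gross_Amt_Total, to the end of the last line for that account. Then start over.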
Answer 0 (score: 4)
You can use itertools.groupby():
import csv
from itertools import groupby
from operator import itemgetter

with open("data.csv", newline="") as f:
    next(f)  # Skip header
    # groupby() collects consecutive rows that share the same account number.
    for account, lines in groupby(csv.reader(f), itemgetter(0)):
        gross_amount = 0.
        for line in lines:
            print(line)
            gross_amount += float(line[2])
        print("The total gross amount for account", account, "is", gross_amount)
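The loop above only prints a gross total. As a rough sketch of how the same groupby() idea might be stretched to cover all three amounts and append the sums to the last row of each account, as the question asks: the names `append_account_totals`, `to_number`, `Adjustments_Total` and `TotalDue_Total` are made up for illustration (only Gross_Amt_Total comes from the question), blank fields are assumed to mean zero, and the code is written for Python 3.

```python
import csv
import io
from itertools import groupby
from operator import itemgetter

# Hypothetical column names; only Gross_Amt_Total appears in the question.
TOTAL_COLUMNS = ["Gross_Amt_Total", "Adjustments_Total", "TotalDue_Total"]


def to_number(text):
    """Assumption: a blank field (e.g. an empty Adjustments cell) means 0.0."""
    return float(text) if text else 0.0


def append_account_totals(infile, outfile):
    """Copy rows to outfile, adding per-account sums to each account's last row."""
    reader = csv.reader(infile)
    writer = csv.writer(outfile, lineterminator="\n")
    writer.writerow(next(reader) + TOTAL_COLUMNS)
    # groupby() only merges consecutive rows, which matches the
    # "same account as the previous line" rule in the question.
    for account, rows in groupby(reader, key=itemgetter(0)):
        rows = list(rows)
        sums = [sum(to_number(row[i]) for row in rows) for i in (2, 3, 4)]
        for row in rows[:-1]:
            # Pad with empty fields so every row has the same column count.
            writer.writerow(row + [""] * len(TOTAL_COLUMNS))
        writer.writerow(rows[-1] + ["{:.2f}".format(s) for s in sums])


sample = io.StringIO(
    'AccountNumber,Invoice_Number,Gross_Amt,Adjustments,TotalDue\n'
    '"234","56787","19.37",,"19.37"\n'
    '"234","56788","204.76","-10.00","194.76"\n'
    '"234","56789","139.77",,"139.77"\n'
    '"567","12543","44.89","30.00","74.89"\n'
)
result = io.StringIO()
append_account_totals(sample, result)
print(result.getvalue())
```

Padding the non-final rows with empty fields is a design choice to keep the output rectangular; dropping the padding would leave the totals only on the last line of each account, with shorter rows elsewhere. Note that csv.writer with its default settings does not re-quote the fields.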
Answer 1 (score: 2)
The csv module reads the data, and itertools.groupby groups it by account number:
import csv
from itertools import groupby
from io import StringIO

data = StringIO('''\
AccountNumber,Invoice_Number,Gross_Amt,Adjustments,TotalDue
"234","56787","19.37",,"19.37"
"234","56788","204.76","-10.00","194.76"
"234","56789","139.77",,"139.77"
"567","12543","44.89","30.00","74.89"
''')

# Grab the header and rows of the data.
# groupby requires data sorted on the groupby key.
reader = csv.reader(data)
header = next(reader)
rows = sorted(reader)

print('{:13} {:14} {:9} {:11} {:8}'.format(*header))

# Group by the first item (account number).
for acct, grp in groupby(rows, lambda r: r[0]):
    print()
    gross_amt_total = 0
    adjustments_total = 0
    total_due_total = 0
    for item in grp:
        # Everything comes in as a string, and blank strings don't convert to float.
        gross = float(item[2]) if item[2] else 0.0
        adj = float(item[3]) if item[3] else 0.0
        due = float(item[4]) if item[4] else 0.0
        print('{:13} {:14} {:9.2f} {:11.2f} {:8.2f}'.format(item[0], item[1], gross, adj, due))
        gross_amt_total += gross
        adjustments_total += adj
        total_due_total += due
    print()
    print('Totals for #{:13} {:9.2f} {:11.2f} {:8.2f}'.format(
        acct, gross_amt_total, adjustments_total, total_due_total))
Output:

AccountNumber Invoice_Number Gross_Amt Adjustments TotalDue

234           56787              19.37        0.00    19.37
234           56788             204.76      -10.00   194.76
234           56789             139.77        0.00   139.77

Totals for #234              363.90      -10.00   353.90

567           12543              44.89       30.00    74.89

Totals for #567               44.89       30.00    74.89
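Answer 2 (score: 0)
See here for how to read a CSV file: http://docs.python.org/library/csv.html. Basically, what you want to do is keep a dictionary that maps each account to the values you want to total. For each record in the file, use the account number to index into the dict and add the values you just read to the values already stored there.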
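A minimal sketch of that dictionary approach might look like the following. The function name `total_by_account` is invented for illustration, blank fields are assumed to count as zero, and the code targets Python 3; it uses csv.DictReader and collections.defaultdict from the standard library.

```python
import csv
import io
from collections import defaultdict

AMOUNT_FIELDS = ("Gross_Amt", "Adjustments", "TotalDue")


def total_by_account(infile):
    """Return {account: [gross, adjustments, total_due]} summed over all rows."""
    totals = defaultdict(lambda: [0.0, 0.0, 0.0])
    for record in csv.DictReader(infile):
        sums = totals[record["AccountNumber"]]
        for i, field in enumerate(AMOUNT_FIELDS):
            # Assumption: a blank field (e.g. empty Adjustments) counts as zero.
            sums[i] += float(record[field] or 0)
    return dict(totals)


sample = io.StringIO(
    'AccountNumber,Invoice_Number,Gross_Amt,Adjustments,TotalDue\n'
    '"234","56787","19.37",,"19.37"\n'
    '"234","56788","204.76","-10.00","194.76"\n'
    '"234","56789","139.77",,"139.77"\n'
    '"567","12543","44.89","30.00","74.89"\n'
)
totals = total_by_account(sample)
for account, sums in sorted(totals.items()):
    print(account, ["{:.2f}".format(value) for value in sums])
```

Unlike groupby(), a dictionary does not care whether rows for the same account sit next to each other, so the input does not have to be sorted or grouped first. The trade-off is that the totals are only complete once the whole file has been read.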