Using Python to parse a comma-delimited file and total currency fields

Time: 2011-02-13 20:03:00

Tags: python field delimited-text

I am trying to use Python to parse a comma-delimited file with a layout similar to this:

AccountNumber,Invoice_Number,Gross_Amt,Adjustments,TotalDue

"234","56787","19.37",,"19.37"
"234","56788","204.76","-10.00","194.76"
"234","56789","139.77",,"139.77"
"567","12543","44.89","30.00","74.89"

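What I would like to accomplish is to total the gross amount, adjustments, and total due, and then append those totals to the end of each line (or just to the last line of each document).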

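My question is: how can I create variables that add up the fields only as long as the account number stays the same? In plain English, I would say: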

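Check the account number: add Gross_Amt on each line while the account number equals the account number on the previous line. Then, when the account number changes, append the accumulated total as a new field named Gross_Amt_Total to the end of the last line for that account. Start over.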

3 answers:

Answer 0: (score: 4)

You can use itertools.groupby():

import csv
from itertools import groupby
from operator import itemgetter

with open("data.csv", "rb") as f:
    next(f)    # Skip header
    for account, lines in groupby(csv.reader(f), itemgetter(0)):
        gross_amount = 0.
        for line in lines:
            print line
            gross_amount += float(line[2])
        print "The total gross amount for account", account, "is", gross_amount
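
The snippet above prints the totals but does not write them back to the rows. As a minimal sketch of the "append the total to the last line of each account" step from the question, written in Python 3 syntax: the function name append_gross_totals and the inline SAMPLE data are illustrative assumptions, and the input is assumed to be already grouped by account number.

```python
import csv
import io
from itertools import groupby
from operator import itemgetter

# Inline copy of the sample data from the question (stands in for a real file).
SAMPLE = '''\
AccountNumber,Invoice_Number,Gross_Amt,Adjustments,TotalDue
"234","56787","19.37",,"19.37"
"234","56788","204.76","-10.00","194.76"
"234","56789","139.77",,"139.77"
"567","12543","44.89","30.00","74.89"
'''

def append_gross_totals(infile, outfile):
    """Copy rows to outfile, adding Gross_Amt_Total to each account's last row."""
    reader = csv.reader(infile)
    writer = csv.writer(outfile, lineterminator="\n")
    header = next(reader)
    writer.writerow(header + ["Gross_Amt_Total"])
    # groupby only merges adjacent rows, so the input must already be
    # grouped by account number, as it is in the sample.
    for account, rows in groupby(reader, key=itemgetter(0)):
        rows = list(rows)
        # Blank Gross_Amt fields are skipped rather than converted.
        total = sum(float(row[2]) for row in rows if row[2])
        for row in rows[:-1]:
            writer.writerow(row + [""])  # keep the column count consistent
        writer.writerow(rows[-1] + ["{:.2f}".format(total)])

out = io.StringIO()
append_gross_totals(io.StringIO(SAMPLE), out)
result = out.getvalue()
print(result)
```

With a real file, the two io.StringIO objects would be replaced by files opened with open(..., newline=""). The same pattern should extend to Adjustments and TotalDue by accumulating columns 3 and 4 as well.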

Answer 1: (score: 2)

Use the csv module to read the data and itertools.groupby to group it by account number:

import csv
from itertools import groupby
from StringIO import StringIO

data = StringIO('''\
AccountNumber,Invoice_Number,Gross_Amt,Adjustments,TotalDue
"234","56787","19.37",,"19.37"
"234","56788","204.76","-10.00","194.76"
"234","56789","139.77",,"139.77"
"567","12543","44.89","30.00","74.89"
''')

# Grab the header and rows of the data.
# groupby requires data sorted on the groupby key.
reader = csv.reader(data)
header = next(reader)
rows = sorted(reader)

print '{:13} {:14} {:9} {:11} {:8}'.format(*header)

# group by first item (acct number)
for acct,grp in groupby(rows,lambda r: r[0]):
    print
    gross_amt_total = 0
    adjustments_total = 0
    total_due_total = 0
    for item in grp:
        # everything comes in as a string, and blank strings don't convert to float.
        gross = float(item[2]) if item[2] else 0.0
        adj = float(item[3]) if item[3] else 0.0
        due = float(item[4]) if item[4] else 0.0
        print '{:13} {:14} {:9.2f} {:11.2f} {:8.2f}'.format(item[0],item[1],gross,adj,due)
        gross_amt_total += gross
        adjustments_total += adj
        total_due_total += due
    print
    print 'Totals for #{:13}    {:9.2f} {:11.2f} {:8.2f}'.format(
        acct,gross_amt_total,adjustments_total,total_due_total)

Output:

AccountNumber Invoice_Number Gross_Amt Adjustments TotalDue

234           56787              19.37        0.00    19.37
234           56788             204.76      -10.00   194.76
234           56789             139.77        0.00   139.77

Totals for #234                 363.90      -10.00   353.90

567           12543              44.89       30.00    74.89

Totals for #567                  44.89       30.00    74.89
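
Because these fields are currency, binary floats may accumulate small rounding errors over many rows. One possible variation, sketched here in Python 3 syntax, does the same grouping with decimal.Decimal so that cents stay exact. The helper names to_decimal and totals_by_account and the inline SAMPLE are assumptions made for illustration, not part of the answer above.

```python
import csv
import io
from decimal import Decimal
from itertools import groupby
from operator import itemgetter

# Inline copy of the sample data from the question.
SAMPLE = '''\
AccountNumber,Invoice_Number,Gross_Amt,Adjustments,TotalDue
"234","56787","19.37",,"19.37"
"234","56788","204.76","-10.00","194.76"
"234","56789","139.77",,"139.77"
"567","12543","44.89","30.00","74.89"
'''

def to_decimal(text):
    """Treat a blank field as zero; otherwise parse the exact decimal value."""
    return Decimal(text) if text else Decimal("0.00")

def totals_by_account(infile):
    """Return {account: (gross, adjustments, total_due)} as Decimal sums."""
    reader = csv.reader(infile)
    next(reader)  # skip the header row
    rows = sorted(reader, key=itemgetter(0))  # groupby needs sorted input
    totals = {}
    for acct, grp in groupby(rows, key=itemgetter(0)):
        gross = adj = due = Decimal("0.00")
        for item in grp:
            gross += to_decimal(item[2])
            adj += to_decimal(item[3])
            due += to_decimal(item[4])
        totals[acct] = (gross, adj, due)
    return totals

totals = totals_by_account(io.StringIO(SAMPLE))
for acct, (gross, adj, due) in sorted(totals.items()):
    print("Totals for #{} {} {} {}".format(acct, gross, adj, due))
```

Decimal values are built from the original strings, so a value such as "19.37" is stored exactly instead of as the nearest binary fraction.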

Answer 2: (score: 0)

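See here for how to read a csv file: http://docs.python.org/library/csv.html

Basically, what you want to do is keep a dictionary that maps each account to the values you want to total. For each record in the file, use the account number to index into the dict and add the value you just read to the value already stored there.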
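
The dictionary approach described above might look roughly like this in Python 3 syntax. It is only a sketch: the function name sum_by_account, the use of collections.defaultdict and csv.DictReader, and the inline SAMPLE data are choices made here for illustration.

```python
import csv
import io
from collections import defaultdict

# Inline copy of the sample data from the question.
SAMPLE = '''\
AccountNumber,Invoice_Number,Gross_Amt,Adjustments,TotalDue
"234","56787","19.37",,"19.37"
"234","56788","204.76","-10.00","194.76"
"234","56789","139.77",,"139.77"
"567","12543","44.89","30.00","74.89"
'''

FIELDS = ("Gross_Amt", "Adjustments", "TotalDue")

def sum_by_account(infile):
    """Return {account: [gross, adjustments, total_due]} summed over all rows."""
    # Each new account starts with three zeroed running totals.
    totals = defaultdict(lambda: [0.0, 0.0, 0.0])
    for record in csv.DictReader(infile):
        acct_totals = totals[record["AccountNumber"]]
        for i, field in enumerate(FIELDS):
            if record[field]:  # blank fields count as zero
                acct_totals[i] += float(record[field])
    return dict(totals)

account_totals = sum_by_account(io.StringIO(SAMPLE))
for acct in sorted(account_totals):
    print(acct, ["{:.2f}".format(v) for v in account_totals[acct]])
```

Unlike the groupby versions, this does not require rows for the same account to be adjacent, since every row is added to its account's entry wherever it appears in the file.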