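I need some help iterating over a groupby object in Python. I have people nested under an ID variable, and under each ID they have 3 to 6 months of balances. So printing the groupby object looks like this: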
(1, Primary BP Product Rpt Month Closing Balance
0 1 CHECK 201708 10.04
1 1 CHECK 201709 11.1
2 1 CHECK 201710 11.16
3 1 CHECK 201711 11.22
4 1 CHECK 201712 11.28
5 1 CHECK 201801 11.34)
(2, Primary BP Product Rpt Month Closing Balance
79 2 CHECK 201711 52.42
85 2 CHECK 201712 31.56
136 2 CHECK 201801 99.91)
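I would like to create another column that standardizes the closing balance against the first amount in each group. So the ideal output would look like this: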
(1, Primary BP Product Rpt Month Closing Balance standardized
0 1 CHECK 201708 10.04 0
1 1 CHECK 201709 11.1 1.06
2 1 CHECK 201710 11.16 1.12
3 1 CHECK 201711 11.22 1.18
4 1 CHECK 201712 11.28 1.24
5 1 CHECK 201801 11.34 1.30)
(2, Primary BP Product Rpt Month Closing Balance standardized
79 2 CHECK 201711 52.42 0
85 2 CHECK 201712 31.56 -20.86
136 2 CHECK 201801 99.91 47.49)
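I just can't figure out how to write a good for loop, or any other approach, that iterates within the groups of the groupby object, takes the first closing balance in each group, and subtracts it from every closing balance in that group, essentially creating a difference score.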
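The loop described here can be sketched as follows. This is a minimal illustration, not a definitive solution: the frame name `df` and the sample rows (a subset of the data printed above) are assumptions, and the rows are sorted by `Rpt Month` inside each group so that "first" means the earliest month.

```python
import pandas as pd

# Sample rows copied from the printout above; the name `df` is an assumption
df = pd.DataFrame(
    {
        "Primary BP": [1, 1, 1, 2, 2, 2],
        "Product": ["CHECK"] * 6,
        "Rpt Month": [201708, 201709, 201710, 201711, 201712, 201801],
        "Closing Balance": [10.04, 11.10, 11.16, 52.42, 31.56, 99.91],
    }
)

pieces = []
# Iterating a groupby yields (key, sub-DataFrame) pairs, one per ID
for bp, group in df.groupby("Primary BP"):
    # Work on a sorted copy so the original frame is left untouched
    group = group.sort_values("Rpt Month").copy()
    first = group["Closing Balance"].iloc[0]  # baseline for this ID
    group["standardized"] = (group["Closing Balance"] - first).round(2)
    pieces.append(group)

# Stitch the per-ID pieces back into one frame
result = pd.concat(pieces)
print(result)
```

Each `group` is an ordinary DataFrame, so any per-ID calculation can go inside the loop body.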
Answer 0 (score: 0)
I figured it out! It only took two weeks. I didn't use the groupby object. Here's how:
import pandas as pd

bpid = []
diffs = []
# These two lines were just a bit of cleaning needed to make the values numeric
data['Closing Balance'] = data['Closing Balance'].str.replace(",", "")
data['Closing Balance'] = pd.to_numeric(data['Closing Balance'])
# Build a list holding the increase in closing balance for each month,
# with the first month of each ID set to 0.
# This assumes each ID's rows are contiguous and already in month order.
for index, row in data.iterrows():
    bp = row['Primary BP']
    if bp not in bpid:
        # First row seen for this ID: remember its balance as the baseline
        bpid.append(bp)
        first = row['Closing Balance']
    bal = row['Closing Balance']
    diff = round(bal - first, 2)
    diffs.append(diff)
# Just checking to make sure there are the right number of values. Same as data, so good to go
print(len(diffs))
# Convert the list of differences in closing balance to a Series and attach it to data.
# Assigning to row inside the loop would not work, because iterrows() yields copies.
se = pd.Series(diffs)
data['balance increase'] = se.values
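For comparison, the same difference score can probably be computed without an explicit loop by using `groupby(...).transform("first")`, which broadcasts each ID's first closing balance back onto every row of that ID. This is a sketch under assumptions, not part of the solution above: the frame name `df`, the sample rows, and the index values (taken from the question's printout) are for illustration only.

```python
import pandas as pd

# Sample rows modeled on the question; the non-contiguous index mimics its printout
df = pd.DataFrame(
    {
        "Primary BP": [1, 1, 1, 2, 2, 2],
        "Product": ["CHECK"] * 6,
        "Rpt Month": [201708, 201709, 201710, 201711, 201712, 201801],
        "Closing Balance": [10.04, 11.10, 11.16, 52.42, 31.56, 99.91],
    },
    index=[0, 1, 2, 79, 85, 136],
)

# Sort so that "first" within each ID means the earliest month
df = df.sort_values(["Primary BP", "Rpt Month"])

# transform("first") returns a Series aligned with df, holding each ID's first balance
first = df.groupby("Primary BP")["Closing Balance"].transform("first")
df["standardized"] = (df["Closing Balance"] - first).round(2)
print(df)
```

Because the result of `transform` is aligned on the index, no list building or length check is needed before assigning the new column.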