DF:
   fruits        date  amount
0   Apple  2018-01-01     100
1  Orange  2018-01-01     200
2   Apple  2018-01-01     150
3   Apple  2018-01-02     100
4  Orange  2018-01-02     100
5  Orange  2018-01-02     100
Code to create it:
import pandas as pd

f = [["Apple","2018-01-01",100],["Orange","2018-01-01",200],["Apple","2018-01-01",150],
     ["Apple","2018-01-02",100],["Orange","2018-01-02",100],["Orange","2018-01-02",100]]
df = pd.DataFrame(f, columns=["fruits","date","amount"])
I am trying to sum the fruit sales for each date and find the difference between the sums.
Expected output:
date        diff
2018-01-01    50
2018-01-02  -100
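That is, sum the Apple sales and the Orange sales for each date, then take the difference between the two (Apple minus Orange).
I was able to find the sums: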
df.groupby(["date","fruits"])["amount"].agg("sum")
date        fruits
2018-01-01  Apple     250
            Orange    200
2018-01-02  Apple     100
            Orange    200
Name: amount, dtype: int64
Any suggestions on how to compute the difference within pandas itself?
Answer 0: (score: 1)
a
Output
b
Answer 1: (score: 1)
df = df.groupby(["date","fruits"])["amount"].sum().unstack()
df['diff'] = df.pop('Apple') - df.pop('Orange')
print(df)
fruits      diff
date
2018-01-01    50
2018-01-02  -100
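For reference, here is a self-contained sketch of the same unstack idea that leaves the original df untouched. The names rows, totals and diff are my own, and the fill_value=0 argument is an addition (an assumption that a fruit with no sales on a given date should count as 0 rather than NaN), not something the answer above used.

```python
import pandas as pd

# Same sample data as in the question.
rows = [
    ["Apple", "2018-01-01", 100],
    ["Orange", "2018-01-01", 200],
    ["Apple", "2018-01-01", 150],
    ["Apple", "2018-01-02", 100],
    ["Orange", "2018-01-02", 100],
    ["Orange", "2018-01-02", 100],
]
df = pd.DataFrame(rows, columns=["fruits", "date", "amount"])

# Sum per (date, fruit), then pivot the fruits into columns.
# fill_value=0 (an assumption) treats a missing date/fruit pair as 0.
totals = df.groupby(["date", "fruits"])["amount"].sum().unstack(fill_value=0)

# Apple total minus Orange total, one value per date.
diff = (totals["Apple"] - totals["Orange"]).rename("diff")
print(diff)
```

Because the result is kept in a separate variable, df can still be reused for other aggregations afterwards.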
Answer 2: (score: 0)
Use groupby on date together with apply and a lambda function:
df.groupby("date").apply(lambda x: x.loc[x['fruits']=='Apple','amount'].sum() -
x.loc[x['fruits']=='Orange','amount'].sum())
date
2018-01-01     50
2018-01-02   -100
dtype: int64
Or group each fruit separately and take the difference:
A = df[df.fruits.isin(['Apple'])].groupby('date')['amount'].sum()
O = df[df.fruits.isin(['Orange'])].groupby('date')['amount'].sum()
A - O  # Apple minus Orange, matching the expected output in the question
date
2018-01-01     50
2018-01-02   -100
Name: amount, dtype: int64
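Another way to get the same numbers, offered only as a sketch and not taken from any of the answers: give each row a sign (+1 for Apple, -1 for Orange), multiply it into amount, and sum per date. The sign mapping and the names rows, sign and diff are hypothetical, and the sketch assumes the column contains only these two fruits.

```python
import pandas as pd

# Same sample data as in the question.
rows = [
    ["Apple", "2018-01-01", 100],
    ["Orange", "2018-01-01", 200],
    ["Apple", "2018-01-01", 150],
    ["Apple", "2018-01-02", 100],
    ["Orange", "2018-01-02", 100],
    ["Orange", "2018-01-02", 100],
]
df = pd.DataFrame(rows, columns=["fruits", "date", "amount"])

# Hypothetical sign mapping: Apple counts positively, Orange negatively.
sign = df["fruits"].map({"Apple": 1, "Orange": -1})

# Summing the signed amounts per date yields Apple total minus Orange total.
diff = (df["amount"] * sign).groupby(df["date"]).sum().rename("diff")
print(diff)
```

This needs a single groupby pass and no reshaping, but any fruit missing from the mapping becomes NaN and is silently dropped from the sum, so it only fits when the set of fruits is known in advance.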