How do I work with a dictionary of DataFrames? Or is there a better way to get an overview of my data? Suppose I have, for example:
Fruit Qty Year
Apple 2 2016
Orange 1 2017
Mango 2 2016
Apple 9 2016
Orange 8 2015
Mango 7 2016
Apple 6 2016
Orange 5 2017
Mango 4 2015
I am then trying to find the total quantity of each fruit per year, for example:
Fruit 2015 2016 2017
Apple 0 17 0
Orange 8 0 6
Mango 4 9 0
I have written some code, but it is probably not very useful:
import pandas as pd
# Fruit Data
df_1 = pd.DataFrame({'Fruit':['Apple','Orange','Mango','Apple','Orange','Mango','Apple','Orange','Mango'], 'Qty': [2,1,2,9,8,7,6,5,4], 'Year': [2016,2017,2016,2016,2015,2016,2016,2017,2015]})
# Create a list of Fruits
Fruits = df_1.Fruit.unique()
# Break down the dataframe by Year
df_2015 = df_1[df_1['Year'] == 2015]
df_2016 = df_1[df_1['Year'] == 2016]
df_2017 = df_1[df_1['Year'] == 2017]
# Create a dataframe dictionary of Fruits
Dict_2015 = {elem : pd.DataFrame for elem in Fruits}
Dict_2016 = {elem : pd.DataFrame for elem in Fruits}
Dict_2017 = {elem : pd.DataFrame for elem in Fruits}
# Store the Qty for each Fruit x each Year
for Fruit in Dict_2015.keys():
    Dict_2015[Fruit] = df_2015[df_2015.Fruit == Fruit]
for Fruit in Dict_2016.keys():
    Dict_2016[Fruit] = df_2016[df_2016.Fruit == Fruit]
for Fruit in Dict_2017.keys():
    Dict_2017[Fruit] = df_2017[df_2017.Fruit == Fruit]
Answer 0 (score: 3)
You can use pandas.pivot_table.
# df_1 is the DataFrame from the question; the string 'sum' avoids needing a NumPy import
res = df_1.pivot_table(index='Fruit', columns='Year', values='Qty',
                       aggfunc='sum', fill_value=0)
print(res)
Year    2015  2016  2017
Fruit
Apple      0    17     0
Mango      4     9     0
Orange     8     0     6
For guidance on usage, see How to pivot a dataframe.
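For reference, here is a self-contained sketch of the pivot approach on the question's data. The final reindex step is an addition rather than part of the answer; it only restores the row order shown in the question. The names res and fruit_order are illustrative.

```python
import pandas as pd

# Data from the question (the fruit column repeats Apple, Orange, Mango three times)
df_1 = pd.DataFrame({
    'Fruit': ['Apple', 'Orange', 'Mango'] * 3,
    'Qty': [2, 1, 2, 9, 8, 7, 6, 5, 4],
    'Year': [2016, 2017, 2016, 2016, 2015, 2016, 2016, 2017, 2015],
})

# Sum Qty for every Fruit/Year pair; pairs with no rows become 0
res = df_1.pivot_table(index='Fruit', columns='Year', values='Qty',
                       aggfunc='sum', fill_value=0)

# Optional: put the rows back in first-appearance order (Apple, Orange, Mango)
fruit_order = df_1['Fruit'].unique()
res = res.reindex(fruit_order)

print(res)
```

A single value can then be looked up with res.loc['Apple', 2016], so the separate per-year dictionaries are probably not needed.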
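Answer 1 (score: 2)
jpp has already posted an answer in the format you asked for. However, since your question seems open to other approaches, here is another way. It is not exactly the format you posted, but it is how I usually do it.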
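# df_1 is the DataFrame from the question
df = df_1.groupby(['Fruit', 'Year']).agg({'Qty': 'sum'}).reset_index()
This looks like the following (only Fruit/Year combinations that occur in the data are listed, and the summed column keeps the name Qty):
Fruit Year Qty
Apple 2016 17
Mango 2015 4
Mango 2016 9
Orange 2015 8
Orange 2017 6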
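If every Fruit/Year combination should be listed with an explicit 0, one possible sketch is to unstack the grouped sums with fill_value=0 and then stack them back into long form. This is an addition rather than part of the answer above, and long_df is an illustrative name.

```python
import pandas as pd

# Data from the question
df_1 = pd.DataFrame({
    'Fruit': ['Apple', 'Orange', 'Mango'] * 3,
    'Qty': [2, 1, 2, 9, 8, 7, 6, 5, 4],
    'Year': [2016, 2017, 2016, 2016, 2015, 2016, 2016, 2017, 2015],
})

long_df = (
    df_1.groupby(['Fruit', 'Year'])['Qty'].sum()  # one row per observed pair
        .unstack(fill_value=0)                    # wide table, missing pairs -> 0
        .stack()                                  # back to long, now with every pair
        .rename('Qty')
        .reset_index()
)

print(long_df)
```

The intermediate wide table after unstack is the same shape the pivot_table answer produces, so the two approaches differ mainly in which layout you end up with.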