Question

I'm trying to figure out how to calculate the mean of each row in a Python pandas pivot table I've created.

I'd also like to add the sum for each year to the bottom of the pivot table.

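The last step I want to do is take the per-month average calculated above and divide it by the overall average to get the average distribution per year.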

import pandas as pd 
import pandas_datareader.data as web
import datetime

start = datetime.datetime(2011, 1, 1)
end = datetime.datetime(2017, 12, 31)

libor = web.DataReader('USD1MTD156N', 'fred', start, end)  # Read the data from FRED
libor = libor.dropna(axis=0, how='any')  # Drop rows with NaN values
libor = libor.resample('M').mean()  # Mean value per calendar month
libor['Month'] = pd.DatetimeIndex(libor.index).month  # Add a Month column
libor['Year'] = pd.DatetimeIndex(libor.index).year  # Add a Year column

pivot = libor.pivot(index='Month',columns='Year',values='USD1MTD156N')
print pivot

Any suggestions on how to proceed? Thanks in advance.

Answer 1

I think this is what you want (this is on Python 3; I think the print command is the only thing that differs in this script):

# Mean of each row (one value per month, averaged across years)
ave_month = pivot.mean(1)
# Sum of each year, for the bottom of the pivot table
sum_year = pivot.sum(0)
# Each year's sum relative to the mean of the yearly sums
ave_year = sum_year/sum_year.mean()
print(ave_month, '\n', sum_year, '\n', ave_year)
Month
1     0.324729
2     0.321348
3     0.342014
4     0.345907
5     0.345993
6     0.369418
7     0.382524
8     0.389976
9     0.392838
10    0.392425
11    0.406292
12    0.482017
dtype: float64 
 Year
2011     2.792864
2012     2.835645
2013     2.261839
2014     1.860015
2015     2.407864
2016     5.953718
2017    13.356432
dtype: float64 
 Year
2011    0.621260
2012    0.630777
2013    0.503136
2014    0.413752
2015    0.535619
2016    1.324378
2017    2.971079
dtype: float64
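
If the goal is to attach these results to the table itself, the following is a minimal sketch rather than a definitive answer. It assumes a small made-up pivot in place of the LIBOR data (so it runs without a network connection), and it reads "divide by the overall average" as dividing each month's average by the mean of all the monthly averages. The names report, month_share, "Average", "Share" and "Total" are illustrative and do not come from the original post.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for the LIBOR pivot: 12 months x 3 years of made-up rates.
rng = np.random.default_rng(0)
pivot = pd.DataFrame(
    rng.uniform(0.1, 1.5, size=(12, 3)),
    index=pd.Index(range(1, 13), name="Month"),
    columns=pd.Index([2015, 2016, 2017], name="Year"),
)

# Row-wise mean: average rate for each month across all years.
month_avg = pivot.mean(axis=1)

# Each month's average relative to the mean of all monthly averages
# (an assumed reading of "divide by the overall average").
month_share = month_avg / month_avg.mean()

# Column-wise sum: total for each year.
year_total = pivot.sum(axis=0)

# Build the report on a copy so the original pivot stays untouched.
report = pivot.copy()
report["Average"] = month_avg
report["Share"] = month_share

# Append the yearly sums as a bottom row; the "Average" and "Share"
# cells of that row have no yearly sum and are left as NaN.
report.loc["Total"] = year_total.reindex(report.columns)

print(report.round(3))
```

Computing the row means before appending the "Total" row keeps the totals out of the averages; doing it in the other order would silently include the sums in each column's statistics.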

Answer 2

I would use pivot_table instead of pivot, and then use the aggfunc parameter.

pivot = libor.pivot(index='Month',columns='Year',values='USD1MTD156N')

would become

import numpy as np
pivot = libor.pivot_table(index='Month',columns='Year',values='USD1MTD156N', aggfunc=np.mean)

If I'm not mistaken, you should also be able to remove the resample statement.
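
To illustrate why the resample step may be unnecessary, here is a hedged sketch on made-up daily data (the column name rate and the random values are assumptions, not the FRED series). It checks that pivot_table with a mean aggregation gives the same monthly averages as an explicit group-by over Month and Year.

```python
import numpy as np
import pandas as pd

# Hypothetical daily series standing in for the FRED download.
dates = pd.date_range("2016-01-01", "2017-12-31", freq="B")
rng = np.random.default_rng(1)
daily = pd.DataFrame({"rate": rng.uniform(0.4, 1.6, size=len(dates))}, index=dates)
daily["Month"] = daily.index.month
daily["Year"] = daily.index.year

# pivot_table averages all daily rows that fall in each (Month, Year) cell,
# so no prior monthly resampling is needed.
direct = daily.pivot_table(index="Month", columns="Year", values="rate", aggfunc="mean")

# The same aggregation written as an explicit group-by, for comparison.
grouped = daily.groupby(["Month", "Year"])["rate"].mean().unstack("Year")

print(np.allclose(direct, grouped))
```

Unlike pivot, which raises an error when an index/column pair appears more than once, pivot_table aggregates the duplicates, which is what makes it usable directly on the daily rows.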

Documentation link:

https://pandas.pydata.org/pandas-docs/stable/generated/pandas.pivot_table.html

Python Pandas pivot table calculations
