Creating a well-structured pandas DataFrame from an existing DataFrame

Time: 2020-08-11 16:19:11

Tags: python pandas

I have pandas DataFrame data covering 2018 to 2020. I would like to restructure it into the following layout.

Month | 2018 | 2019
Jan     115    73
Feb     112    63
....

and so on through December.

How can I do this with pandas DataFrame syntax? My data looks like this:

Date
2018-01-01    115.0
2018-02-01    112.0
2018-03-01    104.5
2018-04-01     91.1
2018-05-01     85.5
2018-06-01     76.5
2018-07-01     86.5
2018-08-01     77.9
2018-09-01     65.0
2018-10-01     71.0
2018-11-01     76.0
2018-12-01     72.5
2019-01-01     73.0
2019-02-01     63.0
2019-03-01     63.0
2019-04-01     61.0
2019-05-01     58.3
2019-06-01     59.0
2019-07-01     67.0
2019-08-01     64.0
2019-09-01     59.9
2019-10-01     70.4
2019-11-01     78.9
2019-12-01     75.0
2020-01-01     73.9
Name: Close, dtype: float64

2 answers:

Answer 0: (score: 2)

This is really a pivot operation, but it can be done with crosstab:

# df is the Series from the question (DatetimeIndex, values named Close)
s = pd.crosstab(df.index.strftime('%b'), df.index.year, df.values, aggfunc='sum')
Out[87]: 
col_0   2018  2019  2020
row_0                   
Apr     91.1  61.0   NaN
Aug     77.9  64.0   NaN
Dec     72.5  75.0   NaN
Feb    112.0  63.0   NaN
Jan    115.0  73.0  73.9
Jul     86.5  67.0   NaN
Jun     76.5  59.0   NaN
Mar    104.5  63.0   NaN
May     85.5  58.3   NaN
Nov     76.0  78.9   NaN
Oct     71.0  70.4   NaN
Sep     65.0  59.9   NaN
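
Note that the rows in the output above come back sorted alphabetically by month abbreviation rather than in calendar order. Below is a minimal sketch of one way to restore calendar order, assuming the data is a Series with a DatetimeIndex as in the question. The names `table` and `month_order` are illustrative only, and the sample values are copied from the question. Both `strftime('%b')` and `calendar.month_abbr` follow the current locale, so the labels should line up.

```python
import calendar

import pandas as pd

# Sample data copied from the question: one "Close" value per month.
values = [115.0, 112.0, 104.5, 91.1, 85.5, 76.5, 86.5, 77.9, 65.0, 71.0, 76.0, 72.5,
          73.0, 63.0, 63.0, 61.0, 58.3, 59.0, 67.0, 64.0, 59.9, 70.4, 78.9, 75.0,
          73.9]
df = pd.Series(values,
               index=pd.date_range("2018-01-01", periods=25, freq="MS"),
               name="Close")

# Same crosstab call as in the answer: month abbreviation x year.
table = pd.crosstab(df.index.strftime("%b"), df.index.year, df.values, aggfunc="sum")

# calendar.month_abbr[0] is an empty string, so skip it to get Jan..Dec.
month_order = list(calendar.month_abbr)[1:]

# Reindex the rows so they follow calendar order instead of alphabetical order.
table = table.reindex(month_order)
print(table)
```

If only the 2018 and 2019 columns are wanted, as in the layout shown in the question, `table[[2018, 2019]]` should select them.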

Answer 1: (score: 2)

You can use groupby and unstack:

(s.groupby([s.index.month, s.index.year]).first().unstack()
  .rename_axis(columns='Year',index='Month')
)

Output:

Year    2018  2019  2020
Month                   
1      115.0  73.0  73.9
2      112.0  63.0   NaN
3      104.5  63.0   NaN
4       91.1  61.0   NaN
5       85.5  58.3   NaN
6       76.5  59.0   NaN
7       86.5  67.0   NaN
8       77.9  64.0   NaN
9       65.0  59.9   NaN
10      71.0  70.4   NaN
11      76.0  78.9   NaN
12      72.5  75.0   NaN
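
The output above uses month numbers and includes a 2020 column, while the layout in the question uses month abbreviations and only 2018 and 2019. Below is a rough sketch of how the result might be adjusted to match, assuming `s` is the Series from the question. The names `wide` and `result` are illustrative only, and the sample values are copied from the question.

```python
import calendar

import pandas as pd

# Sample data copied from the question: one "Close" value per month.
values = [115.0, 112.0, 104.5, 91.1, 85.5, 76.5, 86.5, 77.9, 65.0, 71.0, 76.0, 72.5,
          73.0, 63.0, 63.0, 61.0, 58.3, 59.0, 67.0, 64.0, 59.9, 70.4, 78.9, 75.0,
          73.9]
s = pd.Series(values,
              index=pd.date_range("2018-01-01", periods=25, freq="MS"),
              name="Close")

# Same groupby/unstack as in the answer: months as rows, years as columns.
wide = (
    s.groupby([s.index.month, s.index.year]).first().unstack()
     .rename_axis(columns="Year", index="Month")
)

# Replace month numbers 1..12 with abbreviations; row order stays chronological.
wide = wide.rename(index=lambda m: calendar.month_abbr[int(m)])

# Keep only the years shown in the question's target layout.
result = wide[[2018, 2019]]
print(result)
```

Because the grouping is done on month numbers, the rows are already in calendar order before the labels are swapped, so no separate sorting step is needed here.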