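I am trying to divide the columns of one DataFrame, which has a daily time index spanning several years, by the columns of a second DataFrame that is indexed by day of year. For example, build one DataFrame with a daily index, then build a second DataFrame holding the median for each day of year: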
import pandas as pd
import numpy.random as npr
rng = pd.date_range('1/1/2010', periods=365*5, freq='D')
df1 = pd.DataFrame(npr.randn(len(rng)), index=rng)
df_med = df1.groupby(lambda x: x.dayofyear).median()
I want to divide df1 by df_med to get a DataFrame in which each value is normalized by the median for its day of year:
df_norm = df1.div(df_med, axis=1)
This doesn't work, and I'm afraid I don't know what would. Any ideas?
Answer 0 (score: 0)
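The division doesn't work because the grouped DataFrame has a different shape and index: df_med is indexed by day of year, whereas df1 is indexed by date, so the two don't align. One approach is to call transform,
which returns a result whose index is aligned with the original df, and then perform the division: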
In [165]:
import pandas as pd
import numpy.random as npr
rng = pd.date_range('1/1/2010', periods=365*5, freq='D')
df1 = pd.DataFrame(npr.randn(len(rng)), index=rng)
print(df1.info())
df_med = df1.groupby(lambda x: x.dayofyear).median()
df1.div(df1.groupby(lambda x: x.dayofyear).transform(pd.Series.median))
<class 'pandas.core.frame.DataFrame'>
DatetimeIndex: 1825 entries, 2010-01-01 to 2014-12-30
Freq: D
Data columns (total 1 columns):
0 1825 non-null float64
dtypes: float64(1)
memory usage: 28.5 KB
None
Out[165]:
0
2010-01-01 -0.323354
2010-01-02 -1.148487
2010-01-03 -0.206003
2010-01-04 1.768663
2010-01-05 -25.856032
2010-01-06 10.113401
2010-01-07 -0.754476
2010-01-08 1.271442
2010-01-09 -0.845800
2010-01-10 2.037104
2010-01-11 -7.730482
2010-01-12 10.873351
2010-01-13 0.924056
2010-01-14 1.000000
2010-01-15 -2.764203
2010-01-16 1.205966
2010-01-17 1.775265
2010-01-18 4.983361
2010-01-19 -17.537263
2010-01-20 1.000000
2010-01-21 1.000000
2010-01-22 2.176172
2010-01-23 -2.442958
2010-01-24 -3.126872
2010-01-25 -1.612845
2010-01-26 13.099342
2010-01-27 -1.683263
2010-01-28 1.000000
2010-01-29 0.225677
2010-01-30 7.862236
... ...
2014-12-01 -71.974731
2014-12-02 1.000000
2014-12-03 -0.975790
2014-12-04 -4.715373
2014-12-05 1.000000
2014-12-06 -1.111680
2014-12-07 0.522016
2014-12-08 3.233062
2014-12-09 -228.056902
2014-12-10 1.342591
2014-12-11 -11.872743
2014-12-12 1.000000
2014-12-13 -75.493044
2014-12-14 0.354384
2014-12-15 0.248133
2014-12-16 -2.483432
2014-12-17 1.000000
2014-12-18 -2.942194
2014-12-19 0.561869
2014-12-20 2.421608
2014-12-21 1.629229
2014-12-22 4.050602
2014-12-23 -1.040709
2014-12-24 1.000000
2014-12-25 -7.681764
2014-12-26 1.032772
2014-12-27 13.222927
2014-12-28 -8.698441
2014-12-29 1.658290
2014-12-30 -0.951775
[1825 rows x 1 columns]
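As a cross-check on the transform approach, here is a minimal sketch that computes the normalized frame two ways and compares them. The seeded generator and the names `gen`, `doy`, `med_aligned` and `df_norm_lookup` are assumptions added for illustration and are not part of the original post; `transform("median")` is used here as a string-alias equivalent of passing `pd.Series.median`.

```python
import numpy as np
import pandas as pd

# Seeded generator so the example is reproducible (the seed is arbitrary).
gen = np.random.default_rng(0)
rng = pd.date_range("2010-01-01", periods=365 * 5, freq="D")
df1 = pd.DataFrame(gen.standard_normal(len(rng)), index=rng)

# Group key: the day of year (1..366) for every row of df1.
doy = df1.index.dayofyear

# Option 1: transform broadcasts each group's median back onto df1's index,
# so the divisor has the same shape and index as df1.
df_norm = df1.div(df1.groupby(doy).transform("median"))

# Option 2: look up each row's median in df_med by day of year,
# then relabel the rows with df1's date index so the division aligns.
df_med = df1.groupby(doy).median()
med_aligned = df_med.loc[doy]
med_aligned.index = df1.index
df_norm_lookup = df1.div(med_aligned)
```

Both options should give the same values. Note that with `dayofyear` as the key, dates after February 29 in a leap year are shifted by one relative to other years; if that matters for your data, grouping by month and day instead may be a better fit.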