year roe
33 1987 0.114370
34 1988 0.141086
35 1989 0.144621
36 1990 0.135348
37 1991 0.076381
I am trying to winsorize roe within each year, but all the values turn into NaN, like this:
from scipy.stats.mstats import winsorize
grouped=t.groupby('year')
t['roe_w']=grouped['roe'].apply(winsorize,limits=[0.01,0.01])
t.roe_w.head()
33 NaN
34 NaN
35 NaN
36 NaN
37 NaN
Name: roe_w, dtype: object
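I tried defining a new function to pass to apply, but got the same result. I cannot winsorize the groupby object directly.
If I call apply on the grouped column as shown here, I don't know how to add the result back to the original DataFrame...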
t.groupby('year').roe.apply(winsorize,limits=[0.01,0.01])
Out[57]:
year
1965 [0.14415390338, 0.117462079171, 0.128658847171...
1966 [0.150060493701, nan, 0.12508992087, 0.1676523...
1967 [0.169752998571, nan, 0.128173648284, 0.117999...
1968 [0.120201520849, nan, 0.162525114893, 0.137850...
1969 [0.112350791976, 0.114198765054, 0.18420285951...
1970 [0.0939597998646, 0.181250926338, 0.0782947478...
1971 [0.0790746582545, 0.184407887041, 0.0254160825...
1972 [0.0930312074128, 0.201296434986, 0.0800409603...
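The output above suggests why the assignment produces NaN: apply returns one array per year, indexed by year, so it probably does not align with the row index of t (33, 34, ...). One possible way around this is groupby(...).transform, which returns a result aligned to the original rows. The following is only a minimal sketch under stated assumptions: the helper name winsorize_by_group, the toy data, and the limits of 0.2 are all made up for illustration, and missing values (which appear in the real data) are not handled here.

```python
import numpy as np
import pandas as pd
from scipy.stats.mstats import winsorize


def winsorize_by_group(df, value_col, group_col, limits=(0.01, 0.01)):
    """Hypothetical helper: winsorize value_col within each group_col group.

    Returns a Series aligned to df's original index.
    """
    def _winsorize(s):
        # winsorize returns a masked array; convert it to a plain ndarray
        # and re-attach the group's index so pandas can align the rows.
        clipped = winsorize(s.to_numpy(), limits=limits)
        return pd.Series(np.asarray(clipped), index=s.index)

    return df.groupby(group_col)[value_col].transform(_winsorize)


# Toy data (made up for illustration): five observations per year.
t = pd.DataFrame({
    "year": [1987] * 5 + [1988] * 5,
    "roe": [0.01, 0.11, 0.12, 0.13, 0.90, 0.02, 0.14, 0.15, 0.16, 0.80],
})

# With limits of 0.2 and five rows per year, the single lowest and single
# highest value in each year are replaced by their nearest neighbours.
t["roe_w"] = winsorize_by_group(t, "roe", "year", limits=(0.2, 0.2))
print(t)
```

With the real data the call would presumably use limits=(0.01, 0.01). Because scipy's winsorize does the clipping inside each group, the values should match what it returns when called on one year at a time.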
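I also tried the quantile-based function below (winsorize_series). It works fine, but the results differ slightly from Stata's, so I still need to use scipy's winsorize to get the same results as Stata.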
def winsorize_series(se):
    q = se.quantile([0.01, 0.99])
    if isinstance(q, pd.Series) and len(q) == 2:
        se[se < q.iloc[0]] = q.iloc[0]
        se[se > q.iloc[1]] = q.iloc[1]
    return se
Help! QAQ