I have been trying to compute a moving average with pandas, but when I use DataFrame.rolling().mean() it just copies the original values.
stock_info['stock'].head()
                 Fecha      Open      High       Low     Close   Volume
0  04-05-2007 00:00:00  234,4593  255,5703  234,3532  246,8906  6044574
1  07-05-2007 00:00:00  246,8906  254,7023   247,855  252,1563  2953869
2  08-05-2007 00:00:00  252,1562  250,7482  244,9617  250,1695  2007217
3  09-05-2007 00:00:00  250,1695  249,7838  245,9261  248,3757  2329078
4  10-05-2007 00:00:00  248,8194  248,9158  244,9617  245,6368  2138002
stock_info['stock']['MA'] = stock_info['stock']['Close'].rolling(window=2).mean()
                 Fecha      Open      High       Low     Close   Volume        MA
0  04-05-2007 00:00:00  234,4593  255,5703  234,3532  246,8906  6044574  246,8906
1  07-05-2007 00:00:00  246,8906  254,7023   247,855  252,1563  2953869  252,1563
2  08-05-2007 00:00:00  252,1562  250,7482  244,9617  250,1695  2007217  250,1695
3  09-05-2007 00:00:00  250,1695  249,7838  245,9261  248,3757  2329078  248,3757
4  10-05-2007 00:00:00  248,8194  248,9158  244,9617  245,6368  2138002  245,6368
Answer 0 (score: 4)
My first guess is that the values in stock_info['stock']['Close'] are stored as strings rather than as a numeric type. Running
df['MA'] = df['Close'].rolling(window=2).mean()
on
df = pd.DataFrame({'Close': ['246,8906', '252,1563', '250,1695']})
gives
df
Out[38]:
Close MA
0 246,8906 246,8906
1 252,1563 252,1563
2 250,1695 250,1695
which is what happened to you.
Convert the column to numeric values first, for example
df['MA'] = df['Close'].str.replace(',', '.').astype(float).rolling(window=2).mean()
gives
df
Out[40]:
Close MA
0 246,8906 NaN
1 252,1563 249.52345
2 250,1695 251.16290
as required.
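A slightly more defensive version of the same idea, as a minimal sketch: it first checks whether the column is numeric at all, then parses it with pd.to_numeric so that any cell that still fails to parse raises instead of passing through silently. The three sample values are copied from the example above; the variable names are only illustrative.

```python
import pandas as pd

# Sample data copied from the example above (comma used as the decimal mark).
df = pd.DataFrame({"Close": ["246,8906", "252,1563", "250,1695"]})

# A column of text is not numeric; this is the quickest way to spot the problem.
was_numeric = pd.api.types.is_numeric_dtype(df["Close"])
print("numeric before conversion:", was_numeric)

# Replace the decimal comma with a dot, then parse.
# errors="raise" makes any unparseable cell fail loudly.
close = pd.to_numeric(
    df["Close"].str.replace(",", ".", regex=False), errors="raise"
)

# Two-row moving average on the parsed values.
df["MA"] = close.rolling(window=2).mean()
print(df)
```

If the data comes from a CSV file (an assumption, the question does not say), passing decimal=',' to pd.read_csv should parse such columns as floats at load time and make the string replacement unnecessary.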
Answer 1 (score: 0)
According to the latest pandas documentation, http://pandas.pydata.org/pandas-docs/stable/generated/pandas.DataFrame.rolling.html, you can use the on parameter of the rolling function.
df1 = pd.DataFrame({'val': range(10,30)})
df1['avg'] = df1.val.mean()
df1['rolling'] = df1.rolling(window=2, on='avg').mean()
instead of using df1['avg'].rolling()
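For reference, the documentation describes on as the column that positions the rolling window in place of the index, which matters mostly for time-based windows. A minimal sketch under some assumptions: the Fecha and Close values are taken from the question, Close is already converted to float, and the dates are read as day-month-year.

```python
import pandas as pd

# Dates and closing prices from the question; Close is assumed already numeric.
df = pd.DataFrame({
    "Fecha": pd.to_datetime(
        ["04-05-2007", "07-05-2007", "08-05-2007", "09-05-2007", "10-05-2007"],
        format="%d-%m-%Y",  # assumption: day-month-year
    ),
    "Close": [246.8906, 252.1563, 250.1695, 248.3757, 245.6368],
})

# With window="2D" each row averages the rows whose Fecha lies in
# (Fecha - 2 days, Fecha]; on="Fecha" tells rolling to use that column
# instead of the integer index to place the window.
rolled = df.rolling(window="2D", on="Fecha").mean()
df["MA_2D"] = rolled["Close"]
print(df)
```

Because the window is defined in calendar time, the row for 07-05-2007 averages only itself: the previous row is three days earlier and falls outside the window.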
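Answer 2 (score: 0)
You can use pd.rolling_mean to compute it (this function was deprecated in pandas 0.18 and removed in 0.23, so it only applies to older versions).
Example: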
df1 = pd.DataFrame([ np.random.randint(-10,10) for _ in xrange(100) ],columns =['val'])
val
0 4
1 -3
2 -7
3 3
4 -10
df1['MA'] = pd.rolling_mean(df1.val,2)
val MA
0 4 NaN
1 -3 0.5
2 -7 -5.0
3 3 -2.0
4 -10 -3.5
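On current pandas versions the same result can be obtained with the Series.rolling API. A minimal sketch, using the five values printed above in place of the random data so the output is reproducible:

```python
import pandas as pd

# The five values shown in the sample output above.
df1 = pd.DataFrame({"val": [4, -3, -7, 3, -10]})

# Equivalent of pd.rolling_mean(df1.val, 2) with the rolling API.
df1["MA"] = df1["val"].rolling(window=2).mean()
print(df1)
```

The first row is NaN because a window of 2 has only one observation there; passing min_periods=1 to rolling would fill it with the value itself.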