I have this DataFrame:
     avg                 date     high   low       qty
0  16.92  2013-05-27 00:00:00    19.00  1.22  71151.00
1  14.84  2013-05-30 00:00:00    19.00  1.22  42939.00
2   9.19  2013-06-02 00:00:00    17.20  1.23   5607.00
3  23.63  2013-06-05 00:00:00  5000.00  1.22   5850.00
4  13.82  2013-06-10 00:00:00    19.36  1.22   5644.00
5  17.76  2013-06-15 00:00:00    24.00  2.02  16969.00
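Each row is an observation of the average, high, low and quantity recorded on the given date.
I am trying to compute an exponentially weighted moving average with a span of 60 days: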
df["emwa"] = pandas.ewma(df["avg"],span=60,freq="D")
But I get
TypeError: Only valid with DatetimeIndex or PeriodIndex
OK, maybe I need to add a DatetimeIndex to my DataFrame when I construct it. Let me change the constructor call from
df = pandas.DataFrame(records) #records is just a list of dictionaries
to
rng = pandas.date_range(firstDate,lastDate, freq='D')
df = pandas.DataFrame(records,index=rng)
But now I get
ValueError: Shape of passed values is (5,), indices imply (5, 1641601)
Any suggestions on how to compute my EWMA?
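Answer 0 (score: 10)
You need two things: make sure the date column holds dates (rather than strings), and set the index to those dates.
You can do both with to_datetime: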
In [11]: df.index = pd.to_datetime(df.pop('date'))
In [12]: df
Out[12]:
              avg     high   low    qty
date
2013-05-27  16.92    19.00  1.22  71151
2013-05-30  14.84    19.00  1.22  42939
2013-06-02   9.19    17.20  1.23   5607
2013-06-05  23.63  5000.00  1.22   5850
2013-06-10  13.82    19.36  1.22   5644
2013-06-15  17.76    24.00  2.02  16969
Then you can call ewma as expected:
In [13]: pd.ewma(df["avg"], span=60, freq="D")
Out[13]:
date
2013-05-27 16.920000
2013-05-28 16.920000
2013-05-29 16.920000
2013-05-30 15.862667
2013-05-31 15.862667
2013-06-01 15.862667
2013-06-02 13.563899
2013-06-03 13.563899
2013-06-04 13.563899
2013-06-05 16.207625
2013-06-06 16.207625
2013-06-07 16.207625
2013-06-08 16.207625
2013-06-09 16.207625
2013-06-10 15.697743
2013-06-11 15.697743
2013-06-12 15.697743
2013-06-13 15.697743
2013-06-14 15.697743
2013-06-15 16.070721
Freq: D, dtype: float64
And if you set it as a column, only the dates present in the index are kept:
In [14]: df['ewma'] = pd.ewma(df["avg"], span=60, freq="D")
In [15]: df
Out[15]:
              avg     high   low    qty       ewma
date
2013-05-27  16.92    19.00  1.22  71151  16.920000
2013-05-30  14.84    19.00  1.22  42939  15.862667
2013-06-02   9.19    17.20  1.23   5607  13.563899
2013-06-05  23.63  5000.00  1.22   5850  16.207625
2013-06-10  13.82    19.36  1.22   5644  15.697743
2013-06-15  17.76    24.00  2.02  16969  16.070721
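For readers on a recent pandas release, where the top-level pd.ewma function is no longer available, the following is a minimal sketch of the same calculation using the ewm() accessor. The `records` list is a hypothetical reconstruction of the question's data (only the date and avg fields), and the use of `ignore_na=True` on the resampled series is an assumption chosen because it reproduces the numbers shown above; it is not taken from the original answer.

```python
import pandas as pd

# Hypothetical `records` list rebuilt from the question's table (date and avg only).
records = [
    {"date": "2013-05-27", "avg": 16.92},
    {"date": "2013-05-30", "avg": 14.84},
    {"date": "2013-06-02", "avg": 9.19},
    {"date": "2013-06-05", "avg": 23.63},
    {"date": "2013-06-10", "avg": 13.82},
    {"date": "2013-06-15", "avg": 17.76},
]

df = pd.DataFrame(records)
# Convert the date strings to timestamps and use them as the index.
df.index = pd.to_datetime(df.pop("date"))

# EWMA over the observed rows only (one value per existing date).
df["ewma"] = df["avg"].ewm(span=60).mean()

# Daily series: resample to calendar days (missing days become NaN), then
# skip the NaN gaps when weighting so the last value is carried forward.
daily = df["avg"].resample("D").mean().ewm(span=60, ignore_na=True).mean()

print(df)
print(daily)
```

Note that this treats consecutive observations as equally spaced regardless of how many days separate them, which is what the values in the answer above imply.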
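Answer 1 (score: 1)
In pandas > 0.17, ewma has been deprecated (and it was removed entirely in later releases). The replacement is to combine ewm() and mean(), like this: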
# Calculating a few means (averages) with exponential components (com = center of mass)
# on the closing price of the Deutsche Bank stock.
import requests
import zipfile
import io # Python 2, use StringIO
import pandas as pd
import matplotlib
# Set the number of columns to be displayed when printing DataFrames
pd.set_option('display.max_columns', 7)  # full option name; the short 'max_columns' can be ambiguous in newer pandas
# Download file from ipfs
ipfs_file_url = "https://ipfs.io/ipfs/QmW7aSLjePW7S8uE5zbAneGAPdrzdA3MpFkTiFPrRsKS8t"
response = requests.get(ipfs_file_url, stream=True)
# The file is a zipfile, so let's read it and parse the csv inside
zf = zipfile.ZipFile(io.BytesIO(response.content)) # Python 2, use StringIO.StringIO
df = pd.read_csv(zf.open('DB_20170627_to_20180627.csv'))
# Oookay, let's begin!
print(df)
# New DataFrame to keep it clean
output = pd.DataFrame()
output['Date'] = df['Date']
output['ewma_com10'] = df['Close'].ewm(com=10).mean()
output['ewma_com50'] = df['Close'].ewm(com=50).mean()
output['ewma_com100'] = df['Close'].ewm(com=100).mean()
print(output)
output.index = pd.to_datetime(output['Date'], format='%Y-%m-%d')
output.plot()
The Jupyter notebook can be found here: pandas_exponential_average.ipynb
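Since the example above depends on a remote download, here is a small offline sketch of how the com parameter relates to the other ways of specifying the decay in ewm(). The `close` values are made up for illustration and are not the Deutsche Bank data. It relies on the relations documented for ewm(): alpha = 1 / (1 + com) and alpha = 2 / (span + 1), so com = 10 corresponds to span = 21.

```python
import numpy as np
import pandas as pd

# Hypothetical closing prices, invented for this illustration only.
close = pd.Series([15.2, 15.6, 15.1, 14.9, 15.4, 15.8, 16.0, 15.7])

com = 10

# Three equivalent ways to request the same smoothing factor.
by_com = close.ewm(com=com).mean()
by_alpha = close.ewm(alpha=1 / (1 + com)).mean()
by_span = close.ewm(span=2 * com + 1).mean()

print(pd.DataFrame({"com": by_com, "alpha": by_alpha, "span": by_span}))
print(np.allclose(by_com, by_alpha), np.allclose(by_com, by_span))
```

A larger com (or span) gives older observations more weight, which is why the com=100 curve in the answer's plot would be the smoothest of the three.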