Question

I have this DataFrame:

    avg                date    high  low      qty
0 16.92 2013-05-27 00:00:00   19.00 1.22 71151.00
1 14.84 2013-05-30 00:00:00   19.00 1.22 42939.00
2  9.19 2013-06-02 00:00:00   17.20 1.23  5607.00
3 23.63 2013-06-05 00:00:00 5000.00 1.22  5850.00
4 13.82 2013-06-10 00:00:00   19.36 1.22  5644.00
5 17.76 2013-06-15 00:00:00   24.00 2.02 16969.00

Each row is an observation of the average, high, low, and quantity recorded on the given date.

I'm trying to compute an exponentially weighted moving average with a span of 60 days:

df["emwa"] = pandas.ewma(df["avg"],span=60,freq="D")

But I get

TypeError: Only valid with DatetimeIndex or PeriodIndex

OK, maybe I need to add a DatetimeIndex to my DataFrame when I construct it. Let me change the constructor call from

df = pandas.DataFrame(records) #records is just a list of dictionaries

to

rng = pandas.date_range(firstDate,lastDate, freq='D')
df = pandas.DataFrame(records,index=rng)

But now I get

ValueError: Shape of passed values is (5,), indices imply (5, 1641601)

Any suggestions on how to compute my EWMA?

Answer 1

You need two things: make sure the date column holds dates (rather than strings), and set the index to those dates.
You can do both in one go using to_datetime:

In [11]: df.index = pd.to_datetime(df.pop('date'))

In [12]: df
Out[12]:
              avg     high   low    qty
date
2013-05-27  16.92    19.00  1.22  71151
2013-05-30  14.84    19.00  1.22  42939
2013-06-02   9.19    17.20  1.23   5607
2013-06-05  23.63  5000.00  1.22   5850
2013-06-10  13.82    19.36  1.22   5644
2013-06-15  17.76    24.00  2.02  16969
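
For reference, here is a minimal sketch of the same step starting from a list of dictionaries like the `records` in the question. The dictionary layout and the string dates are assumptions based on the table shown there. The index is built from the records' own dates rather than from a separate date_range, so its length always matches the number of rows:

```python
import pandas as pd

# Assumed shape of `records`: one dict per observation, dates stored as strings.
records = [
    {"avg": 16.92, "date": "2013-05-27 00:00:00", "high": 19.00, "low": 1.22, "qty": 71151.0},
    {"avg": 14.84, "date": "2013-05-30 00:00:00", "high": 19.00, "low": 1.22, "qty": 42939.0},
    {"avg": 9.19, "date": "2013-06-02 00:00:00", "high": 17.20, "low": 1.23, "qty": 5607.0},
    {"avg": 23.63, "date": "2013-06-05 00:00:00", "high": 5000.00, "low": 1.22, "qty": 5850.0},
    {"avg": 13.82, "date": "2013-06-10 00:00:00", "high": 19.36, "low": 1.22, "qty": 5644.0},
    {"avg": 17.76, "date": "2013-06-15 00:00:00", "high": 24.00, "low": 2.02, "qty": 16969.0},
]

df = pd.DataFrame(records)

# pop() removes the 'date' column and returns it; to_datetime() parses the
# strings, and assigning the result to df.index turns it into a DatetimeIndex.
df.index = pd.to_datetime(df.pop("date"))

print(type(df.index).__name__)  # DatetimeIndex
```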

Then you can call ewma as expected:

In [13]: pd.ewma(df["avg"], span=60, freq="D")
Out[13]:
date
2013-05-27    16.920000
2013-05-28    16.920000
2013-05-29    16.920000
2013-05-30    15.862667
2013-05-31    15.862667
2013-06-01    15.862667
2013-06-02    13.563899
2013-06-03    13.563899
2013-06-04    13.563899
2013-06-05    16.207625
2013-06-06    16.207625
2013-06-07    16.207625
2013-06-08    16.207625
2013-06-09    16.207625
2013-06-10    15.697743
2013-06-11    15.697743
2013-06-12    15.697743
2013-06-13    15.697743
2013-06-14    15.697743
2013-06-15    16.070721
Freq: D, dtype: float64

If you assign it as a column, it is aligned on the DataFrame's existing index:

In [14]: df['ewma'] = pd.ewma(df["avg"], span=60, freq="D")

In [15]: df
Out[15]:
              avg     high   low    qty       ewma
date
2013-05-27  16.92    19.00  1.22  71151  16.920000
2013-05-30  14.84    19.00  1.22  42939  15.862667
2013-06-02   9.19    17.20  1.23   5607  13.563899
2013-06-05  23.63  5000.00  1.22   5850  16.207625
2013-06-10  13.82    19.36  1.22   5644  15.697743
2013-06-15  17.76    24.00  2.02  16969  16.070721
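
The module-level pd.ewma is not available in recent pandas releases, so here is a hedged sketch of how the same calculation might look with resample() and ewm(). The `ignore_na=True` argument is my assumption: I picked it because it reproduces the figures in the output above, whereas the default (`ignore_na=False`) weights observations by their absolute position and gives different numbers once the gaps are filled with NaN.

```python
import pandas as pd

# The 'avg' column from the question, indexed by its observation dates.
avg = pd.Series(
    [16.92, 14.84, 9.19, 23.63, 13.82, 17.76],
    index=pd.to_datetime(
        ["2013-05-27", "2013-05-30", "2013-06-02",
         "2013-06-05", "2013-06-10", "2013-06-15"]
    ),
    name="avg",
)

# Step 1: put the series on a daily grid; days without an observation become NaN.
daily = avg.resample("D").mean()

# Step 2: exponentially weighted mean with span=60.
# ignore_na=True (an assumption, see above) skips the NaN days when computing
# weights; the NaN days themselves carry the previous average forward.
ewma_daily = daily.ewm(span=60, ignore_na=True).mean()

# Step 3: keep only the original observation dates, which is what assigning
# the daily result to a DataFrame column does through index alignment.
at_observations = ewma_daily.reindex(avg.index)

print(ewma_daily)
print(at_observations)
```

If you only need the values on the observation dates, steps 1 and 3 are optional with this setting, since `avg.ewm(span=60).mean()` weights the same six observations in the same way.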

Answer 2

In pandas > 0.17, ewma is deprecated (and it has since been removed). The same functionality can be obtained by combining ewm() and mean().

Like so:

# Calculating a few means (averages) with exponential components (com = center of mass) 
# on the closing price of the Deutsche Bank stock.

import requests
import zipfile
import io # Python 2, use StringIO
import pandas as pd
import matplotlib

# Set the number of columns to be displayed when printing DataFrames
# (the full option name avoids an ambiguous-key error in newer pandas)
pd.set_option('display.max_columns', 7)

# Download file from ipfs
ipfs_file_url = "https://ipfs.io/ipfs/QmW7aSLjePW7S8uE5zbAneGAPdrzdA3MpFkTiFPrRsKS8t"
response = requests.get(ipfs_file_url, stream=True)

# The file is a zipfile, so let's read it and parse the csv inside
zf = zipfile.ZipFile(io.BytesIO(response.content)) # Python 2, use StringIO.StringIO
df = pd.read_csv(zf.open('DB_20170627_to_20180627.csv'))

# Oookay, let's begin!
print(df)

# New DataFrame to keep it clean
output = pd.DataFrame()
output['Date'] = df['Date']
output['ewma_com10'] = df['Close'].ewm(com=10).mean()
output['ewma_com50'] = df['Close'].ewm(com=50).mean()
output['ewma_com100'] = df['Close'].ewm(com=100).mean()
print(output)

output.index = pd.to_datetime(output['Date'], format='%Y-%m-%d')
output.plot()

The Jupyter notebook can be found here: pandas_exponential_average.ipynb
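
Since the example above needs a network download, here is a small offline sketch of how the com, span and alpha parameters of ewm() relate to each other. The prices are made-up illustration values, not Deutsche Bank data. It also recomputes the last value by hand from the adjusted weights (adjust=True is the pandas default), as a sanity check rather than a definitive reference:

```python
import numpy as np
import pandas as pd

# Made-up closing prices, for illustration only.
close = pd.Series([10.0, 10.5, 10.2, 11.0, 10.8])

com = 10
alpha = 1.0 / (1.0 + com)   # smoothing factor implied by the center of mass
span = 2 * com + 1          # equivalent span, because alpha = 2 / (span + 1)

by_com = close.ewm(com=com).mean()
by_span = close.ewm(span=span).mean()
by_alpha = close.ewm(alpha=alpha).mean()

# With adjust=True the value at the last position t is a weighted mean in
# which observation i gets weight (1 - alpha) ** (t - i).
weights = (1 - alpha) ** np.arange(len(close))[::-1]
manual_last = (weights * close.to_numpy()).sum() / weights.sum()

print(by_com.iloc[-1], by_span.iloc[-1], by_alpha.iloc[-1], manual_last)
```

All four printed numbers should agree, so ewm(com=10) in the answer could equally be written as ewm(span=21).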

Calculating the EWMA of a DataFrame by time
