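Question
I'm working with pandas and found that when I add the list ATR_l to my DataFrame, it lands one row later than I want.
Specifically, in the ATR column of the output, the result 0.457500 should be at index row 13 rather than 14, and so on.
Apart from that, the calculated values are correct!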
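Troubleshooting
At first I thought it might be an indexing issue between my DataFrame and the ATR_l list, but print(i, ATR_l) shows the correct ATR_l value at i = 13.
I also noticed that the first value in the ATR_l list is zero, which I didn't expect. As far as I can tell, this comes from defining ATR_l = [0], and it is what causes the one-row lag in the ATR_l output.
When I define an empty list instead, ATR_l = [], the line df['ATR'] = ATR_l throws an error:
ValueError: Length of values does not match length of index
How do I remove this zero, or avoid adding it as the first item of the list?
FYI: using Python 3.6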
Code
import pandas as pd

def ATRpd():
    data = pd.read_csv('data.txt', sep=",", header=0)
    df = data
    n = 14
    i = 0
    TR_l = [0]
    ATR_l = [0]

    # True range of row i + 1, using the close of row i
    while i < df.index[-1]:
        TR = max(df.at[i + 1, 'High'], df.at[i, 'Close']) - min(df.at[i + 1, 'Low'], df.at[i, 'Close'])
        TR_l.append(round(TR,3))
        i = i + 1
    df['TR'] = TR_l
    df['MA'] = round(df.TR.rolling(n).mean(),4)

    # ATR: simple moving average up to row n - 1, smoothed afterwards
    i = 0
    while i < df.index[-1]:
        if i <= n - 1:
            ATR = df.at[i, 'MA']
        elif i > n - 1:
            ATR = (ATR * 13 + df.at[i, 'TR']) / 14
        ATR_l.append(round(ATR,6))
        if i < 20:
            # print(i, ATR)
            print(i, ATR_l)
        i = i + 1
    df['ATR'] = ATR_l
    print(df.head(20))
Output
ASXCode DateValue Open High ... Close TR MA ATR
0 BHP 26/09/2016 21.47 21.670 ... 21.55 0.000 NaN 0.000000
1 BHP 27/09/2016 21.35 21.520 ... 21.50 0.380 NaN NaN
2 BHP 28/09/2016 21.21 21.460 ... 21.39 0.295 NaN NaN
3 BHP 29/09/2016 22.22 22.540 ... 22.40 1.150 NaN NaN
4 BHP 30/09/2016 22.45 22.550 ... 22.38 0.440 NaN NaN
5 BHP 3/10/2016 22.61 22.870 ... 22.75 0.490 NaN NaN
6 BHP 4/10/2016 22.75 22.900 ... 22.90 0.200 NaN NaN
7 BHP 5/10/2016 22.74 22.950 ... 22.85 0.280 NaN NaN
8 BHP 6/10/2016 23.15 23.260 ... 23.12 0.410 NaN NaN
9 BHP 7/10/2016 23.20 23.400 ... 23.30 0.400 NaN NaN
10 BHP 10/10/2016 23.40 23.630 ... 23.40 0.330 NaN NaN
11 BHP 11/10/2016 23.73 23.870 ... 23.80 0.470 NaN NaN
12 BHP 12/10/2016 23.18 23.440 ... 23.44 0.790 NaN NaN
13 BHP 13/10/2016 23.11 23.220 ... 22.75 0.770 0.4575 NaN
14 BHP 14/10/2016 22.34 22.590 ... 22.54 0.460 0.4904 0.457500
15 BHP 17/10/2016 22.35 22.620 ... 22.39 0.330 0.4868 0.457679
16 BHP 18/10/2016 22.30 22.660 ... 22.64 0.420 0.4957 0.448559
17 BHP 19/10/2016 22.50 22.530 ... 22.47 0.600 0.4564 0.446519
18 BHP 20/10/2016 22.58 23.025 ... 22.85 0.555 0.4646 0.457482
19 BHP 21/10/2016 22.96 23.260 ... 23.04 0.410 0.4589 0.464447
Output of ATR_l
0 [0, nan]
1 [0, nan, nan]
2 [0, nan, nan, nan]
3 [0, nan, nan, nan, nan]
4 [0, nan, nan, nan, nan, nan]
5 [0, nan, nan, nan, nan, nan, nan]
6 [0, nan, nan, nan, nan, nan, nan, nan]
7 [0, nan, nan, nan, nan, nan, nan, nan, nan]
8 [0, nan, nan, nan, nan, nan, nan, nan, nan, nan]
9 [0, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan]
10 [0, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan]
11 [0, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan]
12 [0, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan]
13 [0, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, 0.4575]
14 [0, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, 0.4575, 0.457679]
15 [0, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, 0.4575, 0.457679, 0.448559]
16 [0, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, 0.4575, 0.457679, 0.448559, 0.446519]
17 [0, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, 0.4575, 0.457679, 0.448559, 0.446519, 0.457482]
18 [0, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, 0.4575, 0.457679, 0.448559, 0.446519, 0.457482,
Solution 1
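Based on some tips from YPadawan below, I concluded that the original code can be fixed by adding +1 to df.index[-1] in the condition of the ATR loop, giving while i < df.index[-1]+1: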
ATR_l = []  # start empty: the loop now appends one value for every row
i = 0
while i < df.index[-1]+1:
    if i <= n - 1:
        ATR = df.at[i, 'MA']
    elif i > n - 1:
        ATR = (ATR * 13 + df.at[i, 'TR']) / 14
    ATR_l.append(round(ATR,6))
    i = i + 1
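The effect of the seed value and of the loop bound can be checked on a toy frame. This is only a minimal sketch: the MA values are made up, and the names lagged and aligned are hypothetical and do not appear in the original code. It assumes the default integer index starting at 0.

```python
import pandas as pd

# Toy stand-in for the 'MA' column; the values are made up for illustration.
df = pd.DataFrame({"MA": [1.0, 2.0, 3.0, 4.0]})

# Original pattern: seed the list with 0 and stop before the last index.
lagged = [0]
i = 0
while i < df.index[-1]:
    lagged.append(df.at[i, "MA"])
    i = i + 1

# Adjusted pattern: start with an empty list and include the last index.
aligned = []
i = 0
while i < df.index[-1] + 1:
    aligned.append(df.at[i, "MA"])
    i = i + 1

print(lagged)   # same length as df, but each value sits one row too late
print(aligned)  # same length as df, each value on its own row
```

Both lists have as many items as the frame has rows, which is why the seeded version assigns without an error even though its values are shifted down by one row.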
Solution 2
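Following the advice to avoid iteration and to use iterrows() where iteration is necessary, I ended up with the working solution below, which is more concise and easier to understand.
The only thing I'm still wondering is whether the ATR iteration can be avoided.
I believe I have no choice, because I need to refer to the previous row's ATR to calculate the next ATR value. Am I right about this?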
def ATRpd2():
    data = pd.read_csv('data.txt', sep=",", header=0)
    df = data
    n = 14
    df['Close_prev'] = df['Close'].shift(1)
    df['TR'] = df[['High', 'Close_prev']].max(axis=1) - df[['Low', 'Close_prev']].min(axis=1)
    df['MA'] = round(df.TR.rolling(n).mean(),6)
    ATR_l = []
    for idx, row in df.iterrows():
        if idx <= n - 1:
            ATR = row['MA']
        else:
            ATR = (ATR * (n - 1) + row['TR']) / n
        ATR_l.append(round(ATR,6))
    df['ATR'] = ATR_l
    print(df.head(20))
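On the question of whether the ATR iteration can be avoided: the recursion ATR = (ATR * (n - 1) + TR) / n has the same form as an exponentially weighted mean with alpha = 1 / n, which pandas exposes through ewm(..., adjust=False). The sketch below is one possible way to use that, not a tested replacement for ATRpd2: the TR values are randomly generated, the names seeded and atr_loop are hypothetical, and rounding is left out so the two results can be compared directly.

```python
import numpy as np
import pandas as pd

n = 14
# Made-up TR values purely for illustration (not the BHP data above).
rng = np.random.default_rng(0)
df = pd.DataFrame({"TR": rng.uniform(0.2, 1.2, size=40).round(3)})
df["MA"] = df["TR"].rolling(n).mean()

# Reference: the explicit recursion from ATRpd2, without rounding.
atr_loop = []
atr = np.nan
for idx, row in df.iterrows():
    if idx <= n - 1:
        atr = row["MA"]
    else:
        atr = (atr * (n - 1) + row["TR"]) / n
    atr_loop.append(atr)

# Vectorized: seed row n - 1 with the simple moving average, then let ewm
# apply y[t] = (1 - alpha) * y[t-1] + alpha * x[t] with alpha = 1 / n.
seeded = df["TR"].copy()
seeded.iloc[n - 1] = df["MA"].iloc[n - 1]
df["ATR"] = seeded.iloc[n - 1:].ewm(alpha=1 / n, adjust=False).mean()

# Rows before n - 1 stay NaN because the assignment aligns on the index.
print(np.allclose(df["ATR"], atr_loop, equal_nan=True))
```

With adjust=False the first output of ewm equals its first input, which is why the seed is written into row n - 1 before the smoothing starts.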
Answer 0 (score: 1)
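Well, I think this will help you. I believe the ValueError you ran into comes from the while loop.
The condition while i < df.index[-1]: stops before reaching df.index[-1] because it uses a strict less-than comparison. If your DataFrame has 10 rows, the last index is 9 and the loop body runs only for i = 0 through 8. The resulting list therefore has length 9 rather than 10, and in pandas, adding a column whose length differs from the number of rows in df raises ValueError: Length of values does not match length of index.
Try running the following code to see why the while loop doesn't work as expected:
l = list(range(11))
print(len(l))
i = 0
l2 = []
while i < l[-1]:
    l2.append(l[i])
    i+=1
print(len(l), len(l2))
You should see that len(l) is greater than len(l2)...
In fact, I think it would be preferable to use pandas tools instead of regular loops.
Starting from your DataFrame, if you want to get the TR variable, you should first create columns corresponding to the "i + 1" values of "High" and "Low". You can use the pandas shift method.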
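# shift(-1) moves values up one row, so row i holds the value from row i + 1
df['High_plus_one'] = df['High'].shift(-1)
df['Low_plus_one'] = df['Low'].shift(-1)
To create the "TR" column:
# true range for the step from row i to row i + 1, stored on row i;
# apply .shift(1) to the result to store it on row i + 1 as in the question
df['TR'] = df[['High_plus_one', 'Close']].max(axis=1) - df[['Low_plus_one', 'Close']].min(axis=1)
For the last part, if you want to create the "ATR" column, you do need to iterate over the rows of the DataFrame. You can use the df.iterrows() method.
Finally, either avoid iterating over a pandas DataFrame, or use the iterrows (or iteritems) method when you need to.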
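Because the direction of shift is easy to mix up, a quick check on a small Series can help before applying it to the price columns. This is just an illustrative sketch with made-up numbers; the names prev_row and next_row are hypothetical.

```python
import pandas as pd

s = pd.Series([10, 20, 30, 40])

prev_row = s.shift(1)    # row i receives the value from row i - 1
next_row = s.shift(-1)   # row i receives the value from row i + 1

# Side-by-side view; the row with no neighbour in that direction becomes NaN.
print(pd.DataFrame({"value": s, "shift(1)": prev_row, "shift(-1)": next_row}))
```

So a "previous close" column uses shift(1), while a column holding the next row's High or Low uses shift(-1).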