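Adding a list to a DataFrame lags by one row

Time: 2018-09-28 05:08:13

Tags: python-3.x pandas list dataframe

Question

I am working with a Pandas function and have found that when I add the list ATR_l to my DataFrame, it ends up one row later than I want.

Specifically, in the ATR column of the Output, the result 0.457500 should be in index row 13 rather than 14, and so on.

Apart from that, the calculation gives the correct results!

Troubleshooting

At first I thought it might be an index problem between my DataFrame and the ATR_l list, but print(i, ATR_l) shows the correct ATR_l value at i (13).

I also noticed that the first value in the ATR_l list is zero, which I did not expect. As far as I can tell, it comes from defining ATR_l = [0], and it is what causes the lagged row in the ATR_l output.

When I define an empty list ATR_l = [] instead, df['ATR'] = ATR_l throws ValueError: Length of values does not match length of index.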
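
A minimal sketch of both symptoms described above, using a made-up five-row frame (the column name `x` and its values are illustrative assumptions, not the poster's data). A loop bounded by `i < df.index[-1]` runs one time fewer than there are rows, so a list seeded with `[0]` has the right length but every computed value sits one row late, while an empty list ends up one element short:

```python
import pandas as pd

# Hypothetical stand-in data: five rows with a default RangeIndex 0..4.
df = pd.DataFrame({"x": [10, 11, 12, 13, 14]})

seeded = [0]   # same pattern as ATR_l = [0]
empty = []     # same pattern as ATR_l = []

i = 0
while i < df.index[-1]:          # last pass is i = 3, so row 4 is never visited
    seeded.append(df.at[i, "x"])
    empty.append(df.at[i, "x"])
    i = i + 1

# Lengths match, but the value computed for row i lands in row i + 1.
df["seeded"] = seeded

# One element short: pandas refuses the assignment.
try:
    df["empty"] = empty
    length_error = None
except ValueError as exc:
    length_error = str(exc)

print(df)
print(length_error)
```

Under these assumptions, the placeholder zero hides the length problem rather than solving it; the loop bound is what needs to change.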

How can I remove this zero, or avoid adding it as the first item of the list?

FYI - using Python 3.6

Code

import pandas as pd

def ATRpd():
    data = pd.read_csv('data.txt', sep=",", header=0)
    df = data

    n = 14
    i = 0  
    TR_l = [0]  
    ATR_l = [0]

    while i < df.index[-1]:
        TR = max(df.at[i + 1, 'High'], df.at[i, 'Close']) - min(df.at[i + 1, 'Low'], df.at[i, 'Close']) 
        TR_l.append(round(TR,3))  
        i = i + 1 
    df['TR'] = TR_l
    df['MA'] = round(df.TR.rolling(n).mean(),4)

    i = 0  
    while i < df.index[-1]:        
        if i <= n - 1:
            ATR = df.at[i, 'MA']
        elif i > n - 1:               
            ATR = (ATR * 13 + df.at[i, 'TR']) / 14        
        ATR_l.append(round(ATR,6))

        if i < 20:
      #      print(i, ATR)
            print(i, ATR_l)                
        i = i + 1

    df['ATR'] = ATR_l

    print(df.head(20))

Output

   ASXCode   DateValue   Open    High    ...     Close     TR      MA       ATR
0      BHP  26/09/2016  21.47  21.670    ...     21.55  0.000     NaN  0.000000
1      BHP  27/09/2016  21.35  21.520    ...     21.50  0.380     NaN       NaN
2      BHP  28/09/2016  21.21  21.460    ...     21.39  0.295     NaN       NaN
3      BHP  29/09/2016  22.22  22.540    ...     22.40  1.150     NaN       NaN
4      BHP  30/09/2016  22.45  22.550    ...     22.38  0.440     NaN       NaN
5      BHP   3/10/2016  22.61  22.870    ...     22.75  0.490     NaN       NaN
6      BHP   4/10/2016  22.75  22.900    ...     22.90  0.200     NaN       NaN
7      BHP   5/10/2016  22.74  22.950    ...     22.85  0.280     NaN       NaN
8      BHP   6/10/2016  23.15  23.260    ...     23.12  0.410     NaN       NaN
9      BHP   7/10/2016  23.20  23.400    ...     23.30  0.400     NaN       NaN
10     BHP  10/10/2016  23.40  23.630    ...     23.40  0.330     NaN       NaN
11     BHP  11/10/2016  23.73  23.870    ...     23.80  0.470     NaN       NaN
12     BHP  12/10/2016  23.18  23.440    ...     23.44  0.790     NaN       NaN
13     BHP  13/10/2016  23.11  23.220    ...     22.75  0.770  0.4575       NaN
14     BHP  14/10/2016  22.34  22.590    ...     22.54  0.460  0.4904  0.457500
15     BHP  17/10/2016  22.35  22.620    ...     22.39  0.330  0.4868  0.457679
16     BHP  18/10/2016  22.30  22.660    ...     22.64  0.420  0.4957  0.448559
17     BHP  19/10/2016  22.50  22.530    ...     22.47  0.600  0.4564  0.446519
18     BHP  20/10/2016  22.58  23.025    ...     22.85  0.555  0.4646  0.457482
19     BHP  21/10/2016  22.96  23.260    ...     23.04  0.410  0.4589  0.464447

Output of ATR_l

0 [0, nan]
1 [0, nan, nan]
2 [0, nan, nan, nan]
3 [0, nan, nan, nan, nan]
4 [0, nan, nan, nan, nan, nan]
5 [0, nan, nan, nan, nan, nan, nan]
6 [0, nan, nan, nan, nan, nan, nan, nan]
7 [0, nan, nan, nan, nan, nan, nan, nan, nan]
8 [0, nan, nan, nan, nan, nan, nan, nan, nan, nan]
9 [0, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan]
10 [0, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan]
11 [0, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan]
12 [0, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan]
13 [0, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, 0.4575]
14 [0, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, 0.4575, 0.457679]
15 [0, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, 0.4575, 0.457679, 0.448559]
16 [0, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, 0.4575, 0.457679, 0.448559, 0.446519]
17 [0, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, 0.4575, 0.457679, 0.448559, 0.446519, 0.457482]
18 [0, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, 0.4575, 0.457679, 0.448559, 0.446519, 0.457482,

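Solution 1

Based on some of the tips from YPadawan below, I concluded that the problem in the original code can be resolved by adding +1 to df.index[-1] in the ATR calculation, so that the loop condition becomes i < df.index[-1]+1: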

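    # Assumes ATR_l = [] (no placeholder zero), so that exactly one value is appended per row.
    i = 0
    while i < df.index[-1]+1:
        if i <= n - 1:
            ATR = df.at[i, 'MA']
        elif i > n - 1:
            ATR = (ATR * 13 + df.at[i, 'TR']) / 14
        ATR_l.append(round(ATR,6))
        i = i + 1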
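
Put together, the corrected loop might look like the sketch below. It assumes the list starts empty so that one value is appended per row, and that the frame has a default integer index. The helper name `atr_loop` and the synthetic `TR` values are illustrative, not the poster's data, and the rounding is left out.

```python
import pandas as pd

def atr_loop(df, n=14):
    """Hypothetical helper: one ATR value per row, seeded by the n-row mean of TR."""
    ma = df["TR"].rolling(n).mean()
    atr_values = []                    # start empty: no placeholder zero
    atr = float("nan")
    i = 0
    while i < df.index[-1] + 1:        # visits every row of a default RangeIndex
        if i <= n - 1:
            atr = ma.at[i]             # NaN until the first full window at row n - 1
        else:
            atr = (atr * (n - 1) + df.at[i, "TR"]) / n
        atr_values.append(atr)
        i = i + 1
    return atr_values

# Synthetic TR values, for illustration only.
df = pd.DataFrame({"TR": [0.1 * (k % 5 + 1) for k in range(20)]})
df["ATR"] = atr_loop(df)
print(df.iloc[12:16])
```

With these assumptions the first non-NaN ATR appears in row n - 1 (row 13 for n = 14), which is the alignment the question asks for.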

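Solution 2

Following the advice to avoid iteration and, where it is unavoidable, to use iterrows(), I will adopt the following working solution, which is more concise and easier to understand.

The only thing I am still wondering about is whether the ATR iteration can be avoided.

I believe I have no choice, because I need to refer to the previous row's ATR to calculate the next ATR value. Am I correct in this view?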

def ATRpd2():         
    data = pd.read_csv('data.txt', sep=",", header=0)
    df = data

    n = 14

    df['Close_prev'] = df['Close'].shift(1)
    df['TR'] = df[['High', 'Close_prev']].max(axis=1) - df[['Low', 'Close_prev']].min(axis=1)         
    df['MA'] = round(df.TR.rolling(n).mean(),6)

    ATR_l = []
    for idx, row in df.iterrows():
        if idx <= n - 1:
            ATR = row['MA']          
        else:  
            ATR = (ATR * (n - 1) + row['TR']) / n        
        ATR_l.append(round(ATR,6))      
    df['ATR'] = ATR_l

    print(df.head(20))
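
On the open question of whether the ATR loop can be avoided: the recurrence `ATR = (ATR * (n - 1) + TR) / n` has the same form as an exponentially weighted mean with `alpha = 1 / n`, so one possible sketch is to seed the series with the first `MA` value and let pandas' `ewm(..., adjust=False)` carry out the recursion. The helper names and the synthetic `TR` values below are assumptions for illustration. The sketch also assumes a default integer index and no missing `TR` values after the seed row, and it omits the rounding used in the loop version.

```python
import numpy as np
import pandas as pd

def atr_iterrows(df, n=14):
    """Loop version, mirroring ATRpd2 without the rounding."""
    ma = df["TR"].rolling(n).mean()
    out, atr = [], np.nan
    for idx, row in df.iterrows():
        atr = ma.at[idx] if idx <= n - 1 else (atr * (n - 1) + row["TR"]) / n
        out.append(atr)
    return pd.Series(out, index=df.index)

def atr_ewm(df, n=14):
    """Hypothetical loop-free version using an exponentially weighted mean."""
    ma = df["TR"].rolling(n).mean()
    seeded = df["TR"].iloc[n - 1:].copy()
    seeded.iloc[0] = ma.iloc[n - 1]          # seed with the first full-window mean
    # adjust=False gives y[t] = (1 - alpha) * y[t-1] + alpha * x[t], with y[0] = x[0]
    atr = seeded.ewm(alpha=1 / n, adjust=False).mean()
    return atr.reindex(df.index)             # rows before n - 1 stay NaN

# Synthetic TR values, for illustration only.
df = pd.DataFrame({"TR": [0.2 + 0.05 * (k % 7) for k in range(40)]})
same = np.allclose(atr_iterrows(df), atr_ewm(df), equal_nan=True)
print(same)
```

Whether this is preferable to the explicit loop is a judgment call: it is shorter, but the seeding step is less obvious to a reader than the `if`/`else` in ATRpd2.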

1 Answer:

Answer 0: (score: 1)

Well, I think this will help you. I believe the ValueError you are getting comes from the while loop.

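'while i < df.index[-1]:' stops before reaching df.index[-1], because you are using a strict less-than comparison. If your DataFrame has length 10, the last index is 9, so the loop body runs for i = 0 to 8 and exits once i reaches 9. The resulting list therefore has length 9 rather than 10, and in pandas, adding a column whose length differs from the number of rows in df raises:

ValueError: Length of values does not match length of index

Try running the following code to see why the while loop does not work as expected:

l = list(range(11))
print(len(l))
i = 0
l2 = []
while i < l[-1]:
    l2.append(l[i])
    i+=1
print(len(l), len(l2))

You should see that len(l) is greater than len(l2)...

Actually, I think it would be preferable to use pandas tools instead of regular loops.

Starting from the DataFrame, if you want to obtain the variable TR, you should first create the columns corresponding to the "i + 1" values of "High" and "Low". You can use the pandas shift method:

# Note: shift(1) aligns each row with the previous row's value;
# shift(-1) would give the next row's ("i + 1") value.
df['High_plus_one'] = df['High'].shift(1)
df['Low_plus_one'] = df['Low'].shift(1)

To create the "TR" column:

# Note: as written this is only the row-wise maximum; the TR in the question
# also subtracts the corresponding row-wise minimum.
df['TR'] = df[['High_plus_one', 'Close', 'Low_plus_one']].max(axis=1)

For the last part, if you want to create the "ATR" column, you do need to iterate over the rows of the DataFrame. You can use the df.iterrows() method.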

Finally, either avoid iterating over a pandas DataFrame altogether, or use the iterrows (or iteritems) method when you really need to.
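
To illustrate the `shift` method mentioned in the answer: `shift(1)` moves values down one row, so each row sees the previous row's value, whereas `shift(-1)` brings in the next row's value. The sketch below uses made-up prices (not the poster's data) and builds a true range per row from the previous close, along the lines of the question's loop. The column names `Close_prev` and `High_next` are illustrative.

```python
import pandas as pd

# Made-up prices, for illustration only.
df = pd.DataFrame({
    "High":  [10.0, 11.0, 12.0, 11.5],
    "Low":   [9.0, 10.0, 10.5, 10.0],
    "Close": [9.5, 10.8, 11.0, 10.2],
})

df["Close_prev"] = df["Close"].shift(1)    # previous row's close; NaN in row 0
df["High_next"] = df["High"].shift(-1)     # next row's high; NaN in the last row

# Row-wise max/min skip NaN by default, so row 0 falls back to High - Low.
df["TR"] = (df[["High", "Close_prev"]].max(axis=1)
            - df[["Low", "Close_prev"]].min(axis=1))
print(df)
```

Because the shifted column keeps the original index, the resulting `TR` column always has one value per row, so the length mismatch from the hand-written loop cannot occur.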