Pandas: select the highest and lowest values between two specific values from another column

Time: 2019-02-21 08:26:41

Tags: python pandas

My original dataframe looks like this:

 macd_histogram  direct    event
1.675475e-07    up  crossing up
2.299171e-07    up  0
2.246809e-07    up  0
1.760860e-07    up  0
1.899371e-07    up  0
1.543226e-07    up  0
1.394901e-07    up  0
-3.461691e-08  down crossing down
1.212740e-06    up  0
6.448285e-07    up  0
2.227792e-07    up  0
-8.738289e-08  down crossing up
-3.109205e-07  down 0

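The event column contains crossing up and crossing down markers. For each stretch between a crossing up and the following crossing down (between those same indices), I need to take the highest value in the macd_histogram column, subtract the lowest value from it, and put the result in a new column next to the crossing up!

I tried to do this with a for loop, but I'm a bit lost on how to select the range between each crossing up and crossing down... Any help? Thanks!

What I actually expect is (following the dataframe above):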

 macd_histogram  direct    event magnitude
1.675475e-07    up  crossing up (0.851908-07)
2.299171e-07    up  0
2.246809e-07    up  0
1.760860e-07    up  0
1.899371e-07    up  0
1.543226e-07    up  0
1.394901e-07    up  0
-3.461691e-08  down crossing down (2.651908-06)
1.212740e-06    up  0
6.448285e-07    up  0
2.227792e-07    up  0
-8.738289e-08  down crossing up etc..
-3.109205e-07  down 0

This is what I have tried so far:

index_up = df[df.event == 'crossing up'].index.values
index_down = df[df.event == 'crossing down'].index.values


df['magnitude'] = 0
array = np.array([])
for i in index_up:
    for idx in index_down:
        values = df.loc[i:idx, 'macd_histogram'].tolist()
        max = np.max(values)
        min = np.min(values)
        magnitude = max-min
        print(magnitude)
        df.at[i,'magnitude'] = magnitude

But I get the following error message: ValueError: zero-size array to reduction operation maximum which has no identity

1 Answer:

Answer 0: (score: 2)

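I think I understand what you are asking for, but my resulting numbers don't match your example, so maybe I don't fully understand. Hopefully this answer helps anyway.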
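As an aside on the ValueError in the question: one likely cause is that the nested loops pair every crossing up with every crossing down, including a crossing down that comes before the crossing up. A minimal sketch of that situation, using a made-up toy frame (the values are not the question's data) and assuming the default integer index:

```python
import numpy as np
import pandas as pd

# Toy frame with the same kind of event layout as the question (values invented).
df = pd.DataFrame({
    "macd_histogram": [1.0, 2.0, -1.0, 3.0, -2.0],
    "event": ["crossing up", "0", "crossing down", "0", "crossing up"],
})

index_up = df[df.event == "crossing up"].index.values      # labels 0 and 4
index_down = df[df.event == "crossing down"].index.values  # label 2

# Pairing the second 'crossing up' (label 4) with the earlier
# 'crossing down' (label 2) asks for a backwards slice, which is empty.
backwards = df.loc[4:2, "macd_histogram"].tolist()
print(backwards)

# np.max on an empty sequence raises the ValueError from the question.
try:
    np.max(backwards)
    raised = False
except ValueError:
    raised = True
print(raised)
```

Pairing each event only with the next one, as in the loop in this answer, avoids the empty slice.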

First, create a column to hold the results.

df['result'] = np.nan

Create a variable holding only the index values of the rows with a crossing up or crossing down.

event_range = df[df['event'] != '0'].index

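Loop over that index array. For each section, take a start and an end index, get the maximum and minimum of macd_histogram over that start/end range, subtract them, and write the difference into the result column on the starting row.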

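for x in range(len(event_range) - 1):
    start = event_range[x]
    end = event_range[x + 1] + 1  # I'm not sure if this is the range you want

    # Column 0 is macd_histogram; avoid shadowing the built-in max/min.
    seg_max = df.iloc[start:end, 0].max()
    seg_min = df.iloc[start:end, 0].min()

    diff = seg_max - seg_min
    df.iloc[start, 3] = diff  # column 3 is the new 'result' column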

df


    macd_histogram  direct  event             result
0   1.675480e-07    up      crossing up       2.645339e-07
1   2.299170e-07    up      0                 NaN
2   2.246810e-07    up      0                 NaN
3   1.760860e-07    up      0                 NaN
4   1.899370e-07    up      0                 NaN
5   1.543230e-07    up      0                 NaN
6   1.394900e-07    up      0                 NaN
7  -3.461690e-08    down    crossing down     1.300123e-06
8   1.212740e-06    up      0                 NaN
9   6.448290e-07    up      0                 NaN
10  2.227790e-07    up      0                 NaN
11 -8.738290e-08    down    crossing up       NaN
12 -3.109210e-07    down    0                 NaN
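
For anyone who wants to run this end to end, here is a self-contained sketch of the same idea. The function name add_magnitude and its parameters are made up for illustration, and it assumes that non-event rows hold 0 (as a string or a number) and that each window should include the closing event row, as in the loop above. It looks up column positions by name instead of hard-coding 0 and 3.

```python
import numpy as np
import pandas as pd


def add_magnitude(df, value_col="macd_histogram", event_col="event"):
    """Return a copy of df with a 'result' column holding max - min of
    value_col from each event row through the next event row (inclusive)."""
    out = df.copy()
    out["result"] = np.nan
    # Positions (not labels) of the rows that carry an event marker.
    event_pos = np.flatnonzero(out[event_col].astype(str) != "0")
    values = out[value_col].to_numpy()
    result_col = out.columns.get_loc("result")
    for start, end in zip(event_pos[:-1], event_pos[1:]):
        window = values[start:end + 1]  # include the closing event row
        out.iloc[start, result_col] = window.max() - window.min()
    return out


# The sample data from the question.
df = pd.DataFrame({
    "macd_histogram": [
        1.675475e-07, 2.299171e-07, 2.246809e-07, 1.760860e-07,
        1.899371e-07, 1.543226e-07, 1.394901e-07, -3.461691e-08,
        1.212740e-06, 6.448285e-07, 2.227792e-07, -8.738289e-08,
        -3.109205e-07,
    ],
    "direct": ["up"] * 7 + ["down"] + ["up"] * 3 + ["down"] * 2,
    "event": ["crossing up"] + ["0"] * 6 + ["crossing down"]
             + ["0"] * 3 + ["crossing up", "0"],
})

result = add_magnitude(df)
print(result)
```

The last event row has no following event, so its result stays NaN, matching the table above.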