Computing the crossing (intercept) points of a Series or DataFrame

Date: 2012-05-07 00:38:47

Tags: python pandas

I have periodic data whose index is a floating-point number, like so:

from pandas import DataFrame

time =    [0, 0.1, 0.21, 0.31, 0.40, 0.49, 0.51, 0.6, 0.71, 0.82, 0.93]
voltage = [1,  -1,  1.1, -0.9,    1,   -1,  0.9,-1.2, 0.95, -1.1, 1.11]
df = DataFrame(data=voltage, index=time, columns=['voltage'])
df.plot(marker='o')

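I would like to create a function cross(df, y_val, direction='rise' | 'fall' | 'cross') that returns an array of all the times (index values), found by interpolation, at which the voltage equals y_val. For 'rise', only crossings with a positive slope are returned; for 'fall', only crossings with a negative slope; for 'cross', both are returned. So with y_val=0 and direction='cross', the result would be an array of 10 values holding the X values of the crossing points (the first at about 0.05).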

I was thinking this could be done with an iterator, but I wondered whether there is a better way to do it.
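For reference, the iterator idea could look roughly like the sketch below. It is only a baseline under assumptions: the name cross_loop and its parameters are hypothetical, it takes plain sequences rather than a DataFrame, and a sample exactly equal to y_val is treated as lying below it.

```python
def cross_loop(times, values, y_val=0.0, direction="cross"):
    """Loop-based sketch: interpolated times at which values cross y_val."""
    crossings = []
    pairs = list(zip(times, values))
    # Walk over consecutive sample pairs (x1, y1) -> (x2, y2).
    for (x1, y1), (x2, y2) in zip(pairs, pairs[1:]):
        rising = y1 <= y_val < y2    # goes from at/below y_val to above it
        falling = y1 > y_val >= y2   # goes from above y_val to at/below it
        wanted = (rising and direction in ("rise", "cross")) or (
            falling and direction in ("fall", "cross")
        )
        if wanted:
            # Linear interpolation between the two bracketing samples.
            crossings.append(x1 + (y_val - y1) * (x2 - x1) / (y2 - y1))
    return crossings


time = [0, 0.1, 0.21, 0.31, 0.40, 0.49, 0.51, 0.6, 0.71, 0.82, 0.93]
voltage = [1, -1, 1.1, -0.9, 1, -1, 0.9, -1.2, 0.95, -1.1, 1.11]
all_crossings = cross_loop(time, voltage, 0.0, "cross")
rising_only = cross_loop(time, voltage, 0.0, "rise")
```

A pure-Python loop like this is easy to read but visits every sample pair one at a time, which is why a vectorized approach is attractive for long signals.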

Thanks. I love Pandas and the Pandas community.

1 Answer:

Answer 0: (score: 16)

I ended up with the following for this. It is a vectorized version that is 150 times faster than a version that uses a loop.

import matplotlib.pyplot as plt
import numpy as np
from pandas import DataFrame

def cross(series, cross=0, direction='cross'):
    """
    Given a Series, return all the index values where the data values equal
    the 'cross' value.

    Direction can be 'rising' (for rising edges only), 'falling' (for falling
    edges only), or 'cross' for both edges.
    """
    # Find whether values are above or below the crossing value:
    above=series.values > cross
    below=np.logical_not(above)
    left_shifted_above = above[1:]
    left_shifted_below = below[1:]
    # Find indexes on left side of crossing point
    if direction == 'rising':
        idxs = (left_shifted_above & below[0:-1]).nonzero()[0]
    elif direction == 'falling':
        idxs = (left_shifted_below & above[0:-1]).nonzero()[0]
    else:
        rising = left_shifted_above & below[0:-1]
        falling = left_shifted_below & above[0:-1]
        idxs = (rising | falling).nonzero()[0]

    # Calculate x crossings with interpolation using formula for a line:
    x1 = series.index.values[idxs]
    x2 = series.index.values[idxs+1]
    y1 = series.values[idxs]
    y2 = series.values[idxs+1]
    x_crossings = (cross-y1)*(x2-x1)/(y2-y1) + x1

    return x_crossings

# Test it out:
time = [0, 0.1, 0.21, 0.31, 0.40, 0.49, 0.51, 0.6, 0.71, 0.82, 0.93]
voltage = [1,  -1,  1.1, -0.9,    1,   -1,  0.9,-1.2, 0.95, -1.1, 1.11]
df = DataFrame(data=voltage, index=time, columns=['voltage'])
x_crossings = cross(df['voltage'])
y_crossings = np.zeros(x_crossings.shape)
plt.plot(time, voltage, '-ob', x_crossings, y_crossings, 'or')
plt.grid(True)
plt.show()
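The interpolation step in cross() follows from the equation of the straight line through the two samples that bracket a crossing. As a short worked derivation, writing c for the cross value and (x_1, y_1), (x_2, y_2) for the bracketing samples:

```latex
\[
\frac{y - y_1}{x - x_1} = \frac{y_2 - y_1}{x_2 - x_1}
\quad\Longrightarrow\quad
x = x_1 + \frac{(c - y_1)(x_2 - x_1)}{y_2 - y_1}
\quad\text{at } y = c .
\]
% First segment of the test data: (x_1, y_1) = (0, 1), (x_2, y_2) = (0.1, -1), c = 0
\[
x = 0 + \frac{(0 - 1)(0.1 - 0)}{-1 - 1} = 0.05 .
\]
```

The denominator y_2 - y_1 cannot be zero here, because one sample of each selected pair is above c and the other is not.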

This works well. What improvements could be made?
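One possible refinement, offered only as a sketch: detect the edges with a single np.diff over the boolean "above" mask, and raise an error for an unrecognized direction instead of silently treating it as 'cross'. The name cross_compact and the parameter name level are hypothetical, not part of the code above, and a sample exactly equal to the level still counts as below it.

```python
import numpy as np
import pandas as pd


def cross_compact(series, level=0.0, direction="cross"):
    """Sketch: interpolated index values where `series` crosses `level`."""
    x = np.asarray(series.index, dtype=float)
    y = np.asarray(series.values, dtype=float)
    # +1 marks a rising edge, -1 a falling edge, 0 no crossing.
    edges = np.diff((y > level).astype(np.int8))
    if direction == "rising":
        idxs = np.flatnonzero(edges == 1)
    elif direction == "falling":
        idxs = np.flatnonzero(edges == -1)
    elif direction == "cross":
        idxs = np.flatnonzero(edges != 0)
    else:
        raise ValueError(f"unknown direction: {direction!r}")
    # Linear interpolation between the samples on either side of each edge.
    x1, x2 = x[idxs], x[idxs + 1]
    y1, y2 = y[idxs], y[idxs + 1]
    return x1 + (level - y1) * (x2 - x1) / (y2 - y1)


time = [0, 0.1, 0.21, 0.31, 0.40, 0.49, 0.51, 0.6, 0.71, 0.82, 0.93]
voltage = [1, -1, 1.1, -0.9, 1, -1, 0.9, -1.2, 0.95, -1.1, 1.11]
s = pd.Series(voltage, index=time)
crossings = cross_compact(s)
# Sanity check: the piecewise-linear curve should equal the level at each crossing.
residual = np.interp(crossings, time, voltage)
```

The np.interp check gives a quick way to confirm the crossings numerically without looking at the plot.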