Automatic resampling of data in Python

Time: 2019-04-05 00:10:54

Tags: python python-3.x numpy

Suppose I have the following data (measurements):

[Figure: plot of the measured data]

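As you can see, there are many sharp points (i.e. places where the slope changes a lot). It would therefore be good to take a few more measurements around those points. To do that, I wrote a script: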

  1. I compute the curvature of 3 consecutive points, using the Menger curvature: https://en.wikipedia.org/wiki/Menger_curvature#Definition

  2. Then, based on the curvature, I decide which values should be resampled.

...and I iterate until the average curvature goes down... but it does not work, because it goes up instead. Do you know why?
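For reference, the quantity computed in step 1 can be written out as follows. This is the standard definition from the linked Wikipedia page; the symbols $R$ (radius of the circle through the three points) and $\mathcal{A}$ (triangle area) are notation chosen here, not taken from the post.

```latex
c(A, B, C) = \frac{1}{R}
           = \frac{4\,\mathcal{A}(A, B, C)}
                  {\lVert A - B \rVert \, \lVert B - C \rVert \, \lVert C - A \rVert},
\qquad
\mathcal{A}(A, B, C) = \frac{1}{2}
\left|\,\det\begin{pmatrix}
x_A & x_B & x_C \\
y_A & y_B & y_C \\
1   & 1   & 1
\end{pmatrix}\right|
```

Without the absolute value the determinant gives a signed area, which is why the script below takes `abs(...)` of the result.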

Here is the complete code (it stops once the length of the x values reaches 60):

import numpy as np
import matplotlib.pyplot as plt

def curvature(A,B,C):
    """Calculates the Menger curvature for three points, given as numpy arrays.
    Sources:
    Menger curvature: https://en.wikipedia.org/wiki/Menger_curvature#Definition
    Area of a triangle given 3 points: https://math.stackexchange.com/questions/516219/finding-out-the-area-of-a-triangle-if-the-coordinates-of-the-three-vertices-are
    """

    # Pre-check: make sure that the input points are all numpy arrays
    if not all(isinstance(p, np.ndarray) for p in (A, B, C)):
        raise TypeError("The input points need to be numpy arrays, got %s, %s, %s" % (type(A), type(B), type(C)))

    # Augment Columns
    A_aug = np.append(A,1)
    B_aug = np.append(B,1)
    C_aug = np.append(C,1)

    # Calculate the (signed) area of the triangle
    matrix = np.column_stack((A_aug,B_aug,C_aug))
    area = 1/2*np.linalg.det(matrix)

    # Special case: two or more points coincide, so no circle is defined
    if np.all(A == B) or np.all(B == C) or np.all(A == C):
        return 0

    # Return the Menger curvature (signed, because the determinant-based area is signed)
    return 4*area/(np.linalg.norm(A-B)*np.linalg.norm(B-C)*np.linalg.norm(C-A))


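def values_to_calulate(x,curvature_list, max_curvature):
    """Calculates the new x values which need to be calculated.
    For each triple whose curvature exceeds max_curvature, adds the point one third
    of the way from x[i] to x[i+2] (on a uniform grid the exact midpoint would
    coincide with the existing point x[i+1])."""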
    i = 0
    new_x = np.empty(0)
    for curvature in curvature_list:
        if curvature > max_curvature:
            new_x = np.append(new_x, x[i]+(x[i+2]-x[i])/3 )
        i = i+1
    return new_x


def plot(x,y, title, xLabel, yLabel):
    """Just to visualize"""

    # Plot
    plt.scatter(x,y)
    plt.plot(x, y, '-o')

    # Give a title for the sine wave plot
    plt.title(title)

    # Give x axis label for the sine wave plot
    plt.xlabel(xLabel)

    # Give y axis label for the sine wave plot
    plt.ylabel(yLabel)
    plt.grid(True, which='both')
    plt.axhline(y=0, color='k')


    # Display the plot without blocking, then pause briefly so it gets drawn
    plt.show(block=False)
    plt.pause(0.05)

### STARTS HERE


# Get x values of the sine wave
x = np.arange(0, 10, 1)

# Amplitude of the sine wave is sine of a variable like time
def function(x):
    return 1+np.sin(x)*np.cos(x)**2
y = function(x)

# Plot it
plot(x,y, title='Data', xLabel='Time', yLabel='Amplitude')


continue_Loop = True

while continue_Loop:
    curvature_list = np.empty(0)
    for i in range(len(x)-2):
        # Get the three points
        A = np.array([x[i],y[i]])
        B = np.array([x[i+1],y[i+1]])
        C = np.array([x[i+2],y[i+2]])

        # Calculate the curvature
        curvature_value = abs(curvature(A,B,C))
        curvature_list = np.append(curvature_list, curvature_value)



    print("len: ", len(x) )
    print("average curvature: ", np.average(curvature_list))

    # Calculate the points that need to be added 
    x_new = values_to_calulate(x,curvature_list, max_curvature=0.3)

    # Add those values to the current x list:
    x = np.sort(np.append(x, x_new))

    # Stop once len(x) >= 60
    if len(x) >= 60:
        continue_Loop = False

    # Amplitude of the sine wave is sine of a variable like time
    y = function(x)

    # Plot it
    plot(x,y, title='Data', xLabel='Time', yLabel='Amplitude')

This is what it looks like:

[Figure: plot of the resampled data]

Edit:

If I let it run further...:

[Figure: plot of the resampled data after more iterations]

1 answer:

Answer 0: (score: 2)

So, to summarize my comments above:

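  • You are computing the average curvature of the curve, and there is no reason for it to go to 0. At each point, no matter how close together your samples are, the radius of the circle through them converges to the radius of curvature at that point, so the Menger curvature converges to the curve's local curvature, not to 0.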

  • An alternative is to use the absolute change in the derivative between two points: keep sampling until abs(d(df/dx)) < some_threshold, where d(df/dx) = (df/dx)[n] - (df/dx)[n-1].
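The second suggestion could be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the answerer's own code: the helper names `slope_jumps` and `refine_by_slope_change`, the threshold value 0.2, and the choice to bisect both segments next to a flagged point are all hypothetical. Only the test function is taken from the question.

```python
import numpy as np


def f(x):
    # Same test function as in the question.
    return 1 + np.sin(x) * np.cos(x) ** 2


def slope_jumps(f, x):
    """Return |(df/dx)[n] - (df/dx)[n-1]| at every interior sample point."""
    y = f(x)
    slopes = np.diff(y) / np.diff(x)   # finite-difference slope of each segment
    return np.abs(np.diff(slopes))     # slope change where two segments meet


def refine_by_slope_change(f, x, threshold=0.2, max_iter=200):
    """Add samples until the slope change at every interior point is below threshold.

    Assumption: both segments adjacent to a flagged point are bisected.
    max_iter is only a safety cap for functions with genuine kinks.
    """
    x = np.asarray(x, dtype=float)
    for _ in range(max_iter):
        jumps = slope_jumps(f, x)
        bad = np.flatnonzero(jumps >= threshold)  # jumps[k] belongs to point x[k+1]
        if bad.size == 0:
            break
        left_mid = 0.5 * (x[bad] + x[bad + 1])       # midpoint of the segment on the left
        right_mid = 0.5 * (x[bad + 1] + x[bad + 2])  # midpoint of the segment on the right
        # np.unique sorts and removes midpoints that were requested twice.
        x = np.unique(np.concatenate([x, left_mid, right_mid]))
    return x


x0 = np.arange(0, 10, 1)
x_fine = refine_by_slope_change(f, x0, threshold=0.2)
print(len(x0), "->", len(x_fine), "points")
print("largest remaining slope change:", slope_jumps(f, x_fine).max())
```

Unlike the average curvature, the slope change between neighbouring segments shrinks as the local spacing shrinks (for a smooth function it is roughly the second derivative times the spacing), so this stopping criterion can actually be reached.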