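I am trying to get the correlation between two time series with DTW, but I have found that the amplitude of the time series affects the result. Here is my code: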
import numpy as np
import rpy2.robjects.numpy2ri
rpy2.robjects.numpy2ri.activate()
from rpy2.robjects.packages import importr
# Set up our R namespaces
R = rpy2.robjects.r
DTW = importr('dtw')
L1 = [np.sin(i*.01) for i in range(350)]
L2 = [np.sin(i*.01) + 1 for i in range(350)]
# Note: 100*[...] repeats the list 100 times (35000 values); it does not scale the values
L3 = 100*[np.sin(i*.01) + 1 for i in range(350)]
alignment = R.dtw(L1, L2, keep=True)
correlation1 = alignment.rx('normalizedDistance')[0][0]
alignment = R.dtw(L1, L3, keep=True)
correlation2 = alignment.rx('normalizedDistance')[0][0]
The value of correlation1 is 0.4427365468841718, and the value of correlation2 is 0.5861839240861364.
Is there a simple way to normalise the result? Thanks in advance.
Answer 0 (score: 1)
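One option is to normalise the time series yourself before running DTW. This makes the procedure invariant to shifting and scaling of the input.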
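import numpy as np
import rpy2.robjects.numpy2ri
from rpy2.robjects.packages import importr

# Same rpy2 setup as in the question
rpy2.robjects.numpy2ri.activate()
R = rpy2.robjects.r
DTW = importr('dtw')

def normalise(series):
    # Min-max scaling: map the series onto the range [0, 1]
    series = np.asarray(series, dtype=float)
    max_value = series.max()
    min_value = series.min()
    return (series - min_value) / (max_value - min_value)

if __name__ == "__main__":
    L1 = np.array([np.sin(i*.01) for i in range(350)])
    L2 = np.array([np.sin(i*.01) + 1 for i in range(350)])
    # Scale the array itself; 100*[...] on a plain list would repeat it 100 times
    L3 = 100 * np.array([np.sin(i*.01) + 1 for i in range(350)])
    norm_L1 = normalise(L1)
    norm_L2 = normalise(L2)
    norm_L3 = normalise(L3)
    # Correlate the normalised signals
    alignment = R.dtw(norm_L1, norm_L2, keep=True)
    correlation1 = alignment.rx('normalizedDistance')[0][0]
    alignment = R.dtw(norm_L1, norm_L3, keep=True)
    correlation2 = alignment.rx('normalizedDistance')[0][0]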
Now norm_L1, norm_L2 and norm_L3 all vary between 0 and 1. norm_L2 and norm_L3 are effectively identical, so their correlations with norm_L1 are guaranteed to be the same.
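
If you want to check the invariance without the R dependency, here is a minimal sketch that uses only NumPy. The dtw_distance helper is hypothetical and written purely for illustration: it is a plain dynamic-programming DTW with an absolute-difference cost, not the algorithm or the normalisation used by the R dtw package, so its numbers will not match normalizedDistance. It also assumes L3 is meant to be a scaled copy of L2, built as an array.

```python
import numpy as np

def normalise(series):
    """Min-max scale a series onto the range [0, 1]."""
    series = np.asarray(series, dtype=float)
    return (series - series.min()) / (series.max() - series.min())

def dtw_distance(a, b):
    """Illustrative DTW (assumption, not the R package's method):
    total |a_i - b_j| cost along the cheapest warping path."""
    n, m = len(a), len(b)
    cost = np.full((n + 1, m + 1), np.inf)
    cost[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = abs(a[i - 1] - b[j - 1])
            # Extend the cheapest of the three allowed predecessor cells
            cost[i, j] = d + min(cost[i - 1, j], cost[i, j - 1], cost[i - 1, j - 1])
    return cost[n, m]

t = np.arange(350) * 0.01
L1 = np.sin(t)
L2 = np.sin(t) + 1           # shifted copy of L1
L3 = 100 * (np.sin(t) + 1)   # shifted and scaled copy of L1

# Raw distances depend on the amplitude of the second series
raw_12 = dtw_distance(L1, L2)
raw_13 = dtw_distance(L1, L3)

# After min-max normalisation the shift and the scale are removed
norm_12 = dtw_distance(normalise(L1), normalise(L2))
norm_13 = dtw_distance(normalise(L1), normalise(L3))
same_after_scaling = np.allclose(normalise(L2), normalise(L3))

print(raw_12, raw_13)
print(norm_12, norm_13, same_after_scaling)
```

On the raw series, raw_13 comes out far larger than raw_12. After normalisation the two distances agree, and both are close to zero here because L2 and L3 are just shifted and scaled copies of L1.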