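I have a matrix M1 in which each row is a time-dependent signal.
I have another matrix M2 of the same dimensions, in which each row is also a time-dependent signal, used as a 'template' for recognizing signal shapes in the first matrix.
As a result I want a column vector v, where v [i] is the correlation between the i-th row of M1 and the i-th row of M2.
I looked at numpy's corrcoef function and tried the following code: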
import numpy as np
M1 = np.array ([
[1, 2, 3, 4],
[2, 3, 1, 4]
])
M2 = np.array ([
[10, 20, 30, 40],
[20, 30, 10, 40]
])
print (np.corrcoef (M1, M2))
This prints:
[[ 1. 0.4 1. 0.4]
[ 0.4 1. 0.4 1. ]
[ 1. 0.4 1. 0.4]
[ 0.4 1. 0.4 1. ]]
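I have been reading the docs, but I'm still confused as to which entries of this matrix I have to pick as the entries of my vector v.
Can anyone help?
(I've studied several S.O. answers to similar questions, but haven't seen the light yet...)
Code context:
There are 256 rows (signals), and I run a sliding window of 200 samples over the 'main signal', which is 10k samples long. So M1 and M2 are both 256 rows x 200 columns. Sorry for the erroneous 10k samples; that is the total signal length. By correlating with a sliding template, I try to find the offsets where the template matches best. I'm actually looking for QRS complexes in a 256-channel invasive cardiogram (or rather, electrogram, as doctors call it).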
lg.info ('Processor: {}, time: {}, markers: {}'.format (self.key, dt.datetime.now ().time (), len (self.data.markers)))
# Compute the average signal shape over preexisting markers and use it as a template to find the others.
# All generated markers will have the width of the widest preexisting one.
template = np.zeros ((self.data.samples.shape [0], self.bufferWidthSteps))
# Add intervals that were marked in advance
nrOfTerms = 0
maxWidthSteps = 0
newMarkers = []
for marker in self.data.markers:
    if marker.key == self.markerKey:
        # Find start and stop sample index
        startIndex = marker.tSteps - marker.stampWidthSteps // 2
        stopIndex = marker.tSteps + marker.stampWidthSteps // 2
        # Extract relevant slice from samples and add it to template
        template += np.hstack ((self.data.samples [ : , startIndex : stopIndex], np.zeros ((self.data.samples.shape [0], self.bufferWidthSteps - marker.stampWidthSteps))))
        # Adapt nr of added terms to facilitate averaging
        nrOfTerms += 1
        # Remember maximum width of previously marked QRS complexes
        maxWidthSteps = max (maxWidthSteps, marker.stampWidthSteps)
    else:
        # Preexisting markers with non-matching keys are just copied to the new marker list
        # Preexisting markers with a matching key are omitted from the new marker list
        newMarkers.append (marker)
# Compute average of intervals that were marked in advance
template = template [ : , 0 : maxWidthSteps] / nrOfTerms
halfWidthSteps = maxWidthSteps // 2
# Append markers of intervals that yield an above threshold correlation with the averaged marked intervals
firstIndex = 0
stopIndex = self.data.samples.shape [1] - maxWidthSteps
while firstIndex < stopIndex:
    corr = np.corrcoef (
        template,
        self.data.samples [ : , firstIndex : firstIndex + maxWidthSteps]
    )
    diag = np.diagonal (
        corr,
        template.shape [0]
    )
    meanCorr = np.mean (diag)
    if meanCorr > self.correlationThreshold:
        newMarkers.append ([self.markerFactories [self.markerKey] .make (firstIndex + halfWidthSteps, maxWidthSteps)])
        # Prevent overlapping markers
        firstIndex += maxWidthSteps
    else:
        firstIndex += 5
self.data.markers = newMarkers
lg.info ('Processor: {}, time: {}, markers: {}'.format (self.key, dt.datetime.now ().time (), len (self.data.markers)))
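Answer 0 (score: 2)
Based on this solution for finding the correlation matrix between two 2D arrays, we can write a similar one that computes the correlation vector, i.e. the correlation between corresponding rows of the two arrays. The implementation would look something like this -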
def corr2_coeff_rowwise(A,B):
    # Rowwise mean of input arrays & subtract from input arrays themselves
    A_mA = A - A.mean(1)[:,None]
    B_mB = B - B.mean(1)[:,None]
    # Sum of squares across rows
    ssA = (A_mA**2).sum(1)
    ssB = (B_mB**2).sum(1)
    # Finally get corr coeff
    return np.einsum('ij,ij->i',A_mA,B_mB)/np.sqrt(ssA*ssB)
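For reference, and as my own reading of the code rather than something stated in the linked solution, the quantity returned for each row i is the usual Pearson coefficient, with j running over the columns (time samples) and the bars denoting row-wise means:

```latex
r_i = \frac{\sum_j (A_{ij} - \bar{A}_i)\,(B_{ij} - \bar{B}_i)}
           {\sqrt{\sum_j (A_{ij} - \bar{A}_i)^2 \; \sum_j (B_{ij} - \bar{B}_i)^2}}
```

The numerator corresponds to the einsum call, and the two sums under the square root correspond to ssA and ssB.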
We can further optimize the part that computes ssA and ssB by introducing einsum magic there too!
def corr2_coeff_rowwise2(A,B):
    A_mA = A - A.mean(1)[:,None]
    B_mB = B - B.mean(1)[:,None]
    ssA = np.einsum('ij,ij->i',A_mA,A_mA)
    ssB = np.einsum('ij,ij->i',B_mB,B_mB)
    return np.einsum('ij,ij->i',A_mA,B_mB)/np.sqrt(ssA*ssB)
Sample run -
In [164]: M1 = np.array ([
...: [1, 2, 3, 4],
...: [2, 3, 1, 4.5]
...: ])
...:
...: M2 = np.array ([
...: [10, 20, 33, 40],
...: [20, 35, 15, 40]
...: ])
...:
In [165]: corr2_coeff_rowwise(M1, M2)
Out[165]: array([ 0.99411402, 0.96131896])
In [166]: corr2_coeff_rowwise2(M1, M2)
Out[166]: array([ 0.99411402, 0.96131896])
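One caveat that may matter for real recordings: if a row is constant within the window (a flat channel, say), its sum of squares is zero and the division produces NaN along with a RuntimeWarning. Below is a minimal sketch of a guarded variant. The name corr2_coeff_rowwise_safe and the choice to return NaN for such rows are my own assumptions, not part of the functions above.

```python
import numpy as np

def corr2_coeff_rowwise_safe(A, B):
    """Row-wise Pearson correlation; rows with zero variance yield NaN."""
    A_mA = A - A.mean(1)[:, None]
    B_mB = B - B.mean(1)[:, None]
    num = np.einsum('ij,ij->i', A_mA, B_mB)
    den = np.sqrt(np.einsum('ij,ij->i', A_mA, A_mA) *
                  np.einsum('ij,ij->i', B_mB, B_mB))
    out = np.full(num.shape, np.nan)
    # Divide only where the denominator is non-zero; other entries stay NaN
    np.divide(num, den, out=out, where=den > 0)
    return out

A = np.array([[1.0, 2.0, 3.0, 4.0], [5.0, 5.0, 5.0, 5.0]])  # second row is flat
B = np.array([[10.0, 20.0, 33.0, 40.0], [1.0, 2.0, 3.0, 4.0]])
r = corr2_coeff_rowwise_safe(A, B)
print(r)
```

If the per-row values are averaged afterwards, np.nanmean would skip the undefined rows. Whether skipping them is appropriate depends on the data.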
Runtime test -
In [97]: M1 = np.random.rand(256,200)
...: M2 = np.random.rand(256,200)
...:
In [98]: out1 = np.diagonal (np.corrcoef (M1, M2), M1.shape [0])
...: out2 = corr2_coeff_rowwise(M1, M2)
...: out3 = corr2_coeff_rowwise2(M1, M2)
...:
In [99]: np.allclose(out1, out2)
Out[99]: True
In [100]: np.allclose(out1, out3)
Out[100]: True
In [101]: %timeit np.diagonal (np.corrcoef (M1, M2), M1.shape [0])
...: %timeit corr2_coeff_rowwise(M1, M2)
...: %timeit corr2_coeff_rowwise2(M1, M2)
...:
100 loops, best of 3: 9.5 ms per loop
1000 loops, best of 3: 554 µs per loop
1000 loops, best of 3: 430 µs per loop
20x+ speedup there with einsum over the built-in np.corrcoef!
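Applied to the sliding-window search from the question, the row-wise function could stand in for the corrcoef + diagonal call inside the loop. The following is only a sketch under assumptions: find_matches, its parameters and the synthetic data are hypothetical and replace the marker objects of the original code; the step of 5 and the jump by one template width after a hit mirror the question's loop.

```python
import numpy as np

def corr2_coeff_rowwise(A, B):
    A_mA = A - A.mean(1)[:, None]
    B_mB = B - B.mean(1)[:, None]
    ssA = np.einsum('ij,ij->i', A_mA, A_mA)
    ssB = np.einsum('ij,ij->i', B_mB, B_mB)
    return np.einsum('ij,ij->i', A_mA, B_mB) / np.sqrt(ssA * ssB)

def find_matches(samples, template, threshold, step=5):
    """Return start offsets where the mean row-wise correlation exceeds threshold."""
    width = template.shape[1]
    hits = []
    first = 0
    stop = samples.shape[1] - width
    while first < stop:
        r = corr2_coeff_rowwise(template, samples[:, first:first + width])
        if r.mean() > threshold:
            hits.append(first)
            first += width      # skip ahead to prevent overlapping matches
        else:
            first += step
    return hits

# Synthetic data (hypothetical): 4 channels, template planted at offset 300
rng = np.random.default_rng(0)
template = rng.standard_normal((4, 50))
samples = 0.1 * rng.standard_normal((4, 1000))
samples[:, 300:350] += template
hits = find_matches(samples, template, threshold=0.9)
print(hits)
```

The template's mean and sum of squares do not change between iterations, so they could be computed once outside the loop for a further saving.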
Answer 1 (score: 0)
I think it is this: (please correct me if I'm wrong!)
import numpy as np
M1 = np.array ([
[1, 2, 3, 4],
[2, 3, 1, 4.5]
])
M2 = np.array ([
[10, 20, 33, 40],
[20, 35, 15, 40]
])
v = np.diagonal (np.corrcoef (M1, M2), M1.shape [0])
print (v)
Which prints:
[ 0.99411402 0.96131896]
Since it has only one dimension, I can treat it as a column vector...
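Why the offset diagonal is the right pick: np.corrcoef (M1, M2) stacks the rows of M2 below those of M1 and correlates every row with every other, giving a 2n x 2n matrix. The top-right n x n block holds the correlations between row i of M1 and row j of M2, and its main diagonal is the diagonal of the full matrix at offset n. A small check of this, where the variable names cross_block and v_col are mine:

```python
import numpy as np

M1 = np.array([[1, 2, 3, 4], [2, 3, 1, 4.5]])
M2 = np.array([[10, 20, 33, 40], [20, 35, 15, 40]])

n = M1.shape[0]
C = np.corrcoef(M1, M2)        # shape (2n, 2n): rows of M1 stacked on rows of M2

# Top-right block: entry [i, j] is corr(M1[i], M2[j])
cross_block = C[:n, n:]
v = np.diagonal(C, n)          # entries C[i, i + n]

assert np.allclose(v, np.diagonal(cross_block))
assert np.allclose(v, [np.corrcoef(a, b)[0, 1] for a, b in zip(M1, M2)])

v_col = v.reshape(-1, 1)       # explicit (n, 1) column vector, if one is needed
print(v_col)
```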
Answer 2 (score: 0)
Not knowing much numpy array magic, I'd just pick out the rows and feed each pair separately to corrcoef
[np.corrcoef(i,j)[0][1] for i,j in zip(a,b)]
For np.array column output
c, c.shape = np.array([np.corrcoef(i,j)[0][1] for i,j in zip(a,b)]), (a.shape[0], 1)
I'm sure this could be done better using numpy's broadcasting/indexing features