Given 5 vectors, for example:
X1 X2
---------
A = [51, 134]
B = [40, 110]
C = [41, 191]
D = [35, 198]
E = [30, 140]
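I am trying to find similar vectors, such that if A[X1] > B[X1] and A[X2] > B[X2], we drop B and keep A as the "good" vector. If A[X1] > B[X1] and A[X2] < B[X2], we keep both. I tried using the cosine similarity between the vectors, but the results were incorrect. For example, the vectors above should leave only 3 "good" vectors: A, C, D. Comparing each attribute and sorting by column (a partial ordering) is one approach I am considering. But what if I have d = 10 attributes? How can I solve this?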
Answer 0 (score: 0)
If I understand correctly, I think that by A[Xi] > B[Xi] you actually mean row[Xi] > next_row[Xi].
>>> import numpy as np
>>> A = [51, 134]
>>> B = [40, 110]
>>> C = [41, 191]
>>> D = [35, 198]
>>> E = [30, 140]
>>> arr = np.vstack([A, B, C, D, E])
>>> arr
array([[ 51, 134],
       [ 40, 110],
       [ 41, 191],
       [ 35, 198],
       [ 30, 140]])
>>> # (row_i[X1] > row_i+1[X1]) and (row_i[X2] > row_i+1[X2])
>>> # Compare each row with the next one, column by column
>>> cond1 = (arr[:-1] > arr[1:]).all(axis=1)
>>> cond1
array([ True, False, False,  True])
>>> # (row_i[X1] > row_i+1[X1]) and (row_i[X2] < row_i+1[X2])
>>> cond2 = (arr[:-1, 0] > arr[1:, 0]) & (arr[:-1, 1] < arr[1:, 1])
>>> cond2
array([False, False,  True, False])
>>> cond1 | cond2
array([ True, False,  True,  True])
>>> # The last row (E) has no next row, so it is never selected
>>> arr[:-1][cond1 | cond2]
array([[ 51, 134],   # A
       [ 41, 191],   # C
       [ 35, 198]])  # D
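
The comparison above only looks at consecutive rows, so its result depends on row order, and the second condition is written for exactly two columns. If what you want is to drop every vector that some other vector beats in all attributes, a minimal sketch of an all-pairs check could look like the following. The helper name `non_dominated` is hypothetical, and I am assuming that ties in any attribute mean "keep both", since the question does not say how ties should be handled.

```python
import numpy as np

def non_dominated(points):
    """Boolean mask of the rows that no other row strictly beats in every column.

    Hypothetical helper; assumes a tie in any column means neither row is dropped.
    """
    pts = np.asarray(points)
    # greater[i, j] is True when row i is strictly greater than row j in every column
    greater = (pts[:, None, :] > pts[None, :, :]).all(axis=2)
    # Keep row j when no row i satisfies greater[i, j]
    return ~greater.any(axis=0)

arr = np.array([[51, 134],   # A
                [40, 110],   # B
                [41, 191],   # C
                [35, 198],   # D
                [30, 140]])  # E
mask = non_dominated(arr)
print(arr[mask])  # rows A, C and D
```

Nothing here depends on the number of columns, so the same call should work for d = 10 attributes. The broadcasted comparison builds an n x n x d boolean array, so for a large number of vectors you would probably want to loop over rows instead.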