I have the following data array m:
import numpy as np
a = [[1],[0],[1],[0],[0]]
b = [[0],[0],[1],[0],[1]]
c = d = [[1],[0],[1],[0],[0]]
m = np.hstack((a,b,c,d))
m
array([[1, 0, 1, 1],
       [0, 0, 0, 0],
       [1, 1, 1, 1],
       [0, 0, 0, 0],
       [0, 1, 0, 0]])
I have the following vector prior:
prior = [0.1,0.2,0.3,0.4]
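I now want to create a new vector of length 5 in which each row of m is summed according to this scheme:
if the entry is 1, add 1/prior
if the entry is 0, add 0.1 * 1/prior
So for the first row of m we would get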
(1/0.1)+(0.1*1/0.2)+(1/0.3)+(1/0.4) = 16.33
The second row is
(0.1*1/0.1)+(0.1*1/0.2)+(0.1*1/0.3)+(0.1*1/0.4) = 2.083
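Spelled out as a plain loop, the scheme looks roughly like the sketch below. It is only a reference for checking the two hand-computed rows; the function name row_scores_loop is made up for illustration.

```python
import numpy as np

m = np.array([[1, 0, 1, 1],
              [0, 0, 0, 0],
              [1, 1, 1, 1],
              [0, 0, 0, 0],
              [0, 1, 0, 0]])
prior = [0.1, 0.2, 0.3, 0.4]

def row_scores_loop(m, prior):
    """Reference version: add 1/prior for a 1 and 0.1/prior for a 0."""
    out = []
    for row in m:
        total = 0.0
        for value, p in zip(row, prior):
            factor = 1.0 if value == 1 else 0.1
            total += factor / p
        out.append(total)
    return out

expected = row_scores_loop(m, prior)
print(expected)
```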
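m should be the starting point, and numpy can be used (maybe .sum(axis=1))?
Update:
I am also interested in a solution where m may contain more than two distinct integers. For example, I would like a third rule for m==2 that adds 0.2 * 1/prior for those entries.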
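Answer 0: (score: 4)
Since you are already using numpy, I suggest numpy.where together with numpy.sum. Note that this only works if prior is a numpy.array, which is why it is converted with np.asarray first.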
p = np.asarray(prior)
np.sum(np.where(m,1./p,0.1/p),axis=1)
# array([ 16.33333333, 2.08333333, 20.83333333, 2.08333333, 6.58333333])
Note:
np.where normally expects an array of bools. When it is given integers instead, 0 is interpreted as False and every other value as True.
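A small sketch of what that truthiness rule implies, using a hypothetical array that also contains a 2: passing m directly treats the 2 like a 1, while an explicit comparison keeps the rule tied to the value 1.

```python
import numpy as np

# Hypothetical array containing a value other than 0 and 1
m_demo = np.array([[1, 0, 2],
                   [0, 1, 0]])

# m_demo used as the condition: every non-zero entry counts as True
as_condition = np.where(m_demo, 1.0, 0.1)

# Explicit comparison: only entries equal to 1 get the first value
explicit = np.where(m_demo == 1, 1.0, 0.1)

print(as_condition)
print(explicit)
```

For a strictly binary m both forms give the same result, so the shorter np.where(m, ...) is fine there.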
Update:
If you want to add a third rule for occurrences of 2 in m, I would use np.choose instead of np.where. If you want 0.2/p for each occurrence of 2, you can do
p = np.asarray(prior)
p_vec = np.vstack((0.1/p,1./p,0.2/p))
np.choose(m,p_vec).sum(axis=1)
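The idea is to first build p_vec, a stack of the rows 0.1/p, 1./p and 0.2/p. np.choose then uses each entry of m as an index into that stack: 0 picks from the first row, 1 from the second and 2 from the third, always in the entry's own column.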
This extends easily to the integers 3, 4, ...: just append the corresponding rows to p_vec.
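As a rough sketch of that extension, the snippet below adds rules for 3 and 4 (the multipliers 0.3 and 0.4 and the sample array are illustrative assumptions). It also shows a lookup-table variant based on integer indexing, which is a different technique from np.choose but should give the same sums when m only holds valid indices.

```python
import numpy as np

# Illustrative array with values 0..4
m = np.array([[1, 0, 1, 1],
              [0, 0, 0, 0],
              [4, 2, 3, 1],
              [0, 0, 0, 0],
              [0, 4, 2, 0]])
p = np.asarray([0.1, 0.2, 0.3, 0.4])

# factors[k] is the multiplier used when an entry of m equals k
factors = np.array([0.1, 1.0, 0.2, 0.3, 0.4])

# np.choose: one row of choices per integer value, as described above
p_vec = factors[:, None] / p          # shape (5, 4)
via_choose = np.choose(m, p_vec).sum(axis=1)

# Alternative: look up the multiplier per entry, then divide by p
via_lookup = (factors[m] / p).sum(axis=1)

print(via_choose)
print(via_lookup)
```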
Answer 1: (score: 3)
Approach #1: using boolean indexing -
# Calculate the reciprocal of prior as a numpy array
prior_reci = 1/np.asarray(prior)
# Mask of ones (1s) in array, m
mask = m==1
# Use the mask for m==1 and otherwise with proper scales: prior_reci
# and 0.1*prior_reci respectively and sum them up along the rows
out = (mask*prior_reci + ~mask*(0.1*prior_reci)).sum(1)
Sample run -
In [58]: m
Out[58]:
array([[1, 0, 1, 1],
       [0, 0, 0, 0],
       [1, 1, 1, 1],
       [0, 0, 0, 0],
       [0, 1, 0, 0]])
In [59]: prior
Out[59]: [0.1, 0.2, 0.3, 0.4]
In [60]: prior_reci = 1/np.asarray(prior)
...: mask = m==1
...:
In [61]: (mask*prior_reci + ~mask*(0.1*prior_reci)).sum(1)
Out[61]: array([ 16.33333333, 2.08333333, 20.83333333, 2.08333333, 6.58333333])
Approach #2: using matrix-multiplication with np.dot -
# Calculate the reciprocal of prior as a numpy array
prior_reci = 1/np.asarray(prior)
# Sum along rows for m==1 with scaling of prior_reci per row
# would be equivalent to np.dot(m,prior_reci).
# Similarly for m!=1, it would be np.dot(1-m,0.1*prior_reci)
# i.e. with the new scaling 0.1*prior_reci.
# Finally we need to combine them up with summation.
out = np.dot(m,prior_reci) + np.dot(1-m,0.1*prior_reci)
Sample run -
In [77]: m
Out[77]:
array([[1, 0, 1, 1],
       [0, 0, 0, 0],
       [1, 1, 1, 1],
       [0, 0, 0, 0],
       [0, 1, 0, 0]])
In [78]: prior
Out[78]: [0.1, 0.2, 0.3, 0.4]
In [79]: prior_reci = 1/np.asarray(prior)
In [80]: np.dot(m,prior_reci) + np.dot(1-m,0.1*prior_reci)
Out[80]: array([ 16.33333333, 2.08333333, 20.83333333, 2.08333333, 6.58333333])
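For a strictly binary m, the two dot products can be folded into one, because m*r + 0.1*(1-m)*r equals 0.1*r + 0.9*m*r for each entry. The sketch below checks that identity on the sample data; the names two_dots and one_dot are only for illustration, and the shortcut does not apply once m holds values other than 0 and 1.

```python
import numpy as np

m = np.array([[1, 0, 1, 1],
              [0, 0, 0, 0],
              [1, 1, 1, 1],
              [0, 0, 0, 0],
              [0, 1, 0, 0]])
prior = [0.1, 0.2, 0.3, 0.4]
prior_reci = 1 / np.asarray(prior)

# Approach #2 as written: one dot for the ones, one for the zeros
two_dots = np.dot(m, prior_reci) + np.dot(1 - m, 0.1 * prior_reci)

# Same result with a single dot: every column contributes 0.1/prior,
# and each 1 in m adds the remaining 0.9/prior
one_dot = 0.1 * prior_reci.sum() + 0.9 * np.dot(m, prior_reci)

print(two_dots)
print(one_dot)
```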
Runtime test comparing the two approaches listed above -
In [102]: # Parameters
...: H = 1000
...: W = 1000
...:
...: # Create inputs
...: m = np.random.randint(0,2,(H,W))
...: prior = np.random.rand(W).tolist()
...:
In [103]: %%timeit
...: prior_reci1 = 1/np.asarray(prior)
...: mask = m==1
...: out1 = (mask*prior_reci1 + ~mask*(0.1*prior_reci1)).sum(1)
...:
100 loops, best of 3: 11.1 ms per loop
In [104]: %%timeit
...: prior_reci2 = 1/np.asarray(prior)
...: out2 = np.dot(m,prior_reci2) + np.dot(1-m,0.1*prior_reci2)
...:
100 loops, best of 3: 6 ms per loop
A generic solution that handles multiple condition checks can use np.einsum -
# Define scalars that are to be matched against input 2D array, m
matches = [0,1,2,3,4] # Edit this to accommodate more matching conditions
# Define multiplying factors for the reciprocal version of prior
prior_multfactors = [0.1,1,0.2,0.3,0.4] # Edit this corresponding to matches
# for different multiplying factors
# Thus, for the given matches and prior_multfactors, it means:
# when m==0, then do: 0.1/prior
# when m==1, then do: 1/prior
# when m==2, then do: 0.2/prior
# when m==3, then do: 0.3/prior
# when m==4, then do: 0.4/prior
# Define prior list
prior = [0.1,0.2,0.3,0.4]
# Calculate the reciprocal of prior as a numpy array
prior_reci = 1/np.asarray(prior)
# Mask for every element of m satisfying or not
# all the matches to produce a 3D array mask
mask = m==np.asarray(matches)[:,None,None]
# Get scaling factors for each matches across each prior_reci value
scales = np.asarray(prior_multfactors)[:,None]*prior_reci
# Einsum-mation to give sum across rows corresponding to all matches
out = np.einsum('ijk,ik->j',mask,scales)
Sample run -
In [203]: m
Out[203]:
array([[1, 0, 1, 1],
       [0, 0, 0, 0],
       [4, 2, 3, 1],
       [0, 0, 0, 0],
       [0, 4, 2, 0]])
In [204]: matches, prior_multfactors
Out[204]: ([0, 1, 2, 3, 4], [0.1, 1, 0.2, 0.3, 0.4])
In [205]: prior
Out[205]: [0.1, 0.2, 0.3, 0.4]
In [206]: prior_reci = 1/np.asarray(prior)
...: mask = m==np.asarray(matches)[:,None,None]
...: scales = np.asarray(prior_multfactors)[:,None]*prior_reci
...:
In [207]: np.einsum('ijk,ik->j',mask,scales)
Out[207]: array([ 16.33333333, 2.08333333, 8.5 , 2.08333333, 3.91666667])
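To make the shapes in the einsum call easier to follow, here is a sketch that reproduces the same result with plain broadcasting. The axis labels in the comments (match, row, column) are my reading of the subscripts 'ijk,ik->j', not wording from the answer itself.

```python
import numpy as np

m = np.array([[1, 0, 1, 1],
              [0, 0, 0, 0],
              [4, 2, 3, 1],
              [0, 0, 0, 0],
              [0, 4, 2, 0]])
matches = np.array([0, 1, 2, 3, 4])
prior_multfactors = np.array([0.1, 1, 0.2, 0.3, 0.4])
prior_reci = 1 / np.asarray([0.1, 0.2, 0.3, 0.4])

# mask[i, j, k] is True where m[j, k] == matches[i]
mask = m == matches[:, None, None]                # shape (5, 5, 4)
# scales[i, k] is the value added when m[j, k] == matches[i]
scales = prior_multfactors[:, None] * prior_reci  # shape (5, 4)

# Sum over matches (i) and columns (k), keep rows (j)
out_einsum = np.einsum('ijk,ik->j', mask, scales)

# Equivalent broadcasting form: builds the full 3D product first
out_broadcast = (mask * scales[:, None, :]).sum(axis=(0, 2))

print(out_einsum)
print(out_broadcast)
```

Both forms build a 3D mask with one layer per entry in matches, so memory use grows with the number of rules.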
Answer 2: (score: 1)
Not claiming this is better than Divikar's, just an alternative:
prior_reci = 1/np.asarray(prior)
(m * prior_reci + (1 - m)*prior_reci/10).sum(axis=1)