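I have a 2D array
a = np.array([[1,2],[3,4],[5,6]])
I also have a "domain", or set of bounds, which is an array as well:
b = np.array([[0, 4], [3,7]])
Basically, I want to check whether a[:,0] lies within the first row of b and whether a[:,1] lies within the second row of b. In this example, a[:,0] = [1,3,5], and all of these values are valid except 5, which is greater than 4. Similarly, a[:,1] = [2,4,6], and here 2 fails because 2 < 3.
So I want 0 <= a[:,0] <= 4 and 3 <= a[:,1] <= 7. When a number falls outside these bounds, I want to replace it with a random number inside the bounds.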
My attempt:
a[:,0][~np.logical_and(b[0][0] <= a[:,0], a[:,0] <= b[0][1])] = np.random.uniform(b[0][0], b[0][1])
a[:,1][~np.logical_and(b[1][0] <= a[:,1], a[:,1] <= b[1][1])] = np.random.uniform(b[1][0], b[1][1])
Is there a faster or better way?
Answer 0 (score: 1)
Approach #1: Here is one way -
# Invalid mask where new values are to be put
mask = (a < b[:,0]) | (a > b[:,1])
# Number of invalid ones per column of a
count = mask.sum(0)
# Get lengths for range limits set by b
lens = b[:,1] - b[:,0]
# Scale for uniform random number generation
scale = np.repeat(lens, count)
# Generate random numbers in [0,1)
rand_num = np.random.rand(count.sum())
# Offset for each group of random numbers. Scaling and adding the offset
# maps [0,1) onto each column's range, as np.random.uniform(low, high) would
offset = np.repeat(b[:,0], count)
put_num = rand_num*scale + offset
# Finally make a copy as a float array and assign using invalid mask
out = a.copy().astype(float)
out.T[mask.T] = put_num
Sample run -
In [1004]: a
Out[1004]:
array([[1, 2],
       [7, 4],
       [5, 6]])
In [1005]: b
Out[1005]:
array([[ 2,  6],
       [ 5, 12]])
In [1006]: out
Out[1006]:
array([[ 2.9488404 ,  8.97938277],
       [ 4.51508777,  5.69467752],
       [ 5.        ,  6.        ]])
# limits: [2, 6] [5, 12]
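The single boolean mask lines up with b because a has shape (3, 2) while b[:,0] and b[:,1] have shape (2,), so broadcasting compares column j of a against row j of b. A minimal sketch that wraps the steps above in a function and checks the result; the name replace_out_of_bounds and the seeded np.random.default_rng generator are assumptions added for illustration, not part of the original answer:

```python
import numpy as np

def replace_out_of_bounds(a, b, rng):
    """Replace entries of a outside [b[j,0], b[j,1]] (per column j) with uniform draws."""
    low, high = b[:, 0], b[:, 1]
    mask = (a < low) | (a > high)          # broadcasts (rows, cols) against (cols,)
    count = mask.sum(0)                    # number of invalid entries per column
    # One draw per invalid entry, scaled and shifted to that column's range
    draws = rng.random(count.sum()) * np.repeat(high - low, count) + np.repeat(low, count)
    out = a.astype(float)                  # astype returns a new float array
    out.T[mask.T] = draws                  # fill invalid slots column by column
    return out, mask

a = np.array([[1, 2], [7, 4], [5, 6]])
b = np.array([[2, 6], [5, 12]])
out, mask = replace_out_of_bounds(a, b, np.random.default_rng(0))
print(mask)
print(out)
```

Only as many random numbers are drawn as there are invalid entries, which is the main difference from an approach that draws one per element.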
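Approach #2: Another approach is to generate scaled and offset random numbers with the same shape as a, then use np.where with the invalid mask to choose between the generated random numbers and a. The implementation is simpler, like so -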
rand_nums = np.random.rand(*a.shape)*(b[:,1] - b[:,0]) + b[:,0]
mask = (a < b[:,0]) | (a > b[:,1])
out = np.where(mask, rand_nums, a)
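One trade-off with this version is that it draws a random number for every element of a, including the valid ones, in exchange for shorter code. A self-contained sketch on the question's data; the seeded np.random.default_rng generator is an assumption added for reproducibility, and rng.uniform stands in for the manual scale and offset:

```python
import numpy as np

rng = np.random.default_rng(0)
a = np.array([[1, 2], [3, 4], [5, 6]])
b = np.array([[0, 4], [3, 7]])
low, high = b[:, 0], b[:, 1]

# Candidate replacement for every element, drawn from that column's range
candidates = rng.uniform(low, high, size=a.shape)
# True where an element of a falls outside its column's bounds
mask = (a < low) | (a > high)
# Keep a where valid, take the candidate where invalid
out = np.where(mask, candidates, a)
print(out)
```

Because the candidates are floats, out is a float array even though a holds integers.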
Answer 1 (score: 0)
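import numpy as np
a = np.array([[1,2],[3,4],[5,6]])
b = np.array([[0,4], [3,7]])
for col in range(a.shape[1]):
    # Row col of b holds the [low, high] bounds for column col of a
    low, high = b[col]
    # Row indices of the entries in this column that fall outside the bounds
    index = np.where((a[:, col] < low) | (a[:, col] > high))[0]
    if index.size:
        # randint excludes its upper bound, so add 1 to keep high inclusive
        a[index, col] = np.random.randint(low, high + 1, size=index.size)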
This should do what you need :)
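If the replacements should stay integers, as in this answer, the loop can probably be avoided as well. A hedged sketch, assuming NumPy 1.17 or later for np.random.default_rng; Generator.integers accepts array-valued bounds, and endpoint=True makes the upper bound inclusive. The variable names are illustrative only:

```python
import numpy as np

rng = np.random.default_rng(0)
a = np.array([[1, 2], [3, 4], [5, 6]])
b = np.array([[0, 4], [3, 7]])
low, high = b[:, 0], b[:, 1]

# True where an element of a falls outside its column's bounds
mask = (a < low) | (a > high)
# Integer candidates for every element, with both bounds inclusive
candidates = rng.integers(low, high, size=a.shape, endpoint=True)
# Keep a where valid, take the candidate where invalid
result = np.where(mask, candidates, a)
print(result)
```

Unlike the float-based approaches, this keeps an integer dtype, at the cost of restricting replacements to whole numbers.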