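Let's say I have an array of 100 random numbers called random_array. I need to create an array that averages every x numbers in random_array and stores the results.
So if x = 7, my code finds the average of the first 7 numbers and stores it in the new array, then the next 7, then the next 7, and so on.
I currently have this, but I would like to know how to vectorize it or use some more Pythonic approach: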
import numpy as np

random_array = np.random.randint(100, size=(100, 1))
count = 0
total = 0
new_array = []
for item in random_array:
    # accumulate first, so no element is skipped when a chunk completes
    total = total + item
    count = count + 1
    if count == 7:
        new_array.append(total / 7)
        count = 0
        total = 0
print(new_array)
Answer 0 (score: 2)
Here's an approach using np.bincount -
ids = np.arange(len(random_array))//7
out = np.bincount(ids,random_array)/np.bincount(ids)
Sample run -
In [140]: random_array
Out[140]:
array([89, 66, 29, 25, 36, 25, 30, 58, 64, 19, 25, 63, 76, 74, 44, 73, 94,
88, 83, 88, 17, 91, 69, 65, 32, 73, 91, 20, 20, 14, 52, 65, 21, 58,
14, 30, 26, 82, 61, 87, 24, 67, 83, 93, 57, 30, 81, 48, 84, 83, 59,
19, 95, 55, 86, 57, 59, 77, 92, 44, 40, 29, 37, 42, 33, 89, 37, 57,
18, 17, 85, 47, 19, 95, 96, 40, 13, 64, 18, 79, 95, 26, 31, 70, 35,
65, 52, 93, 46, 63, 86, 77, 87, 48, 88, 62, 68, 82, 49, 86])
In [141]: ids = np.arange(len(random_array))//7
In [142]: np.bincount(ids,random_array)/np.bincount(ids)
Out[142]:
array([ 42.85714286, 54.14285714, 69.57142857, 63. ,
34.85714286, 53.85714286, 68. , 64.85714286,
54. , 41.85714286, 56.42857143, 54.71428571,
62.85714286, 73.14285714, 67.5 ])
In [143]: random_array[:7].mean() # Verify output[0]
Out[143]: 42.857142857142854
In [144]: random_array[7:14].mean() # Verify output[1]
Out[144]: 54.142857142857146
In [145]: random_array[98:].mean() # Verify output[-1]
Out[145]: 67.5
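The same idea can be wrapped in a small helper. This is only a sketch: the name `chunk_means_bincount` is made up for illustration, and it flattens the input first on the assumption that `np.bincount` expects 1-D weights (the question's array has shape (100, 1), while the sample run above uses a 1-D array).

```python
import numpy as np

def chunk_means_bincount(values, chunk=7):
    """Mean of each consecutive run of `chunk` values; the last run may be shorter."""
    # Flatten so that a (100, 1) column is treated like a plain 1-D array
    flat = np.asarray(values, dtype=float).ravel()
    # Elements 0..chunk-1 get id 0, the next `chunk` elements get id 1, and so on
    ids = np.arange(flat.size) // chunk
    # Weighted bincount gives per-group sums; unweighted gives per-group counts
    return np.bincount(ids, weights=flat) / np.bincount(ids)

data = np.arange(10)
print(chunk_means_bincount(data, chunk=4))  # groups: [0..3], [4..7], [8, 9]
```

Dividing by the per-group counts, rather than by a fixed 7, is what keeps the final, shorter group correct.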
For better performance, we can replace np.bincount(ids,random_array) with np.add.reduceat -
np.add.reduceat(random_array,range(0,len(random_array),7))
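Note that `np.add.reduceat` on its own returns per-chunk sums, so a division by the chunk sizes is still needed to get means. A minimal sketch of the full computation follows; the helper name `chunk_means_reduceat` and the way the counts are derived with `np.diff` are illustrative choices, not part of the answer.

```python
import numpy as np

def chunk_means_reduceat(values, chunk=7):
    """Per-chunk means via np.add.reduceat; the last chunk may be shorter."""
    flat = np.asarray(values, dtype=float).ravel()
    # Start index of every chunk: 0, chunk, 2*chunk, ...
    starts = np.arange(0, flat.size, chunk)
    # Sum of flat[starts[i]:starts[i+1]]; the final segment runs to the end
    sums = np.add.reduceat(flat, starts)
    # Chunk lengths: distance between consecutive starts, with the array end appended
    counts = np.diff(np.append(starts, flat.size))
    return sums / counts

print(chunk_means_reduceat(np.arange(10), chunk=4))
```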
Answer 1 (score: 1)
Here is the standard trick:
import numpy as np

def down_sample(x, f=7):
    # pad to a multiple of f, so we can reshape
    # use nan for padding, so we needn't worry about denominator in
    # last chunk
    xp = np.r_[x, np.nan + np.zeros((-len(x) % f,))]
    # reshape, so each chunk gets its own row, and then take mean
    return np.nanmean(xp.reshape(-1, f), axis=-1)
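To make the padding step visible, here is a small variant of the same trick on a 10-element input. It is a sketch only: `chunk_means_padded` is a hypothetical name, and it uses `np.pad` with `constant_values=np.nan` in place of the `np.r_` construction, which should be equivalent for 1-D float data.

```python
import numpy as np

def chunk_means_padded(values, chunk=7):
    """Pad with NaN up to a multiple of `chunk`, reshape, then take NaN-aware row means."""
    # NaN only exists for floats, so convert before padding
    flat = np.asarray(values, dtype=float).ravel()
    # Number of filler elements needed to reach the next multiple of `chunk`
    pad = -flat.size % chunk
    padded = np.pad(flat, (0, pad), constant_values=np.nan)
    # One chunk per row; nanmean ignores the NaN filler in the last row
    return np.nanmean(padded.reshape(-1, chunk), axis=1)

print(chunk_means_padded(np.arange(10), chunk=4))
```

With `chunk=4`, the last row is `[8, 9, nan, nan]`, and `np.nanmean` averages only the two real values.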
Answer 2 (score: 0)
You can do it like this:
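import numpy as np

random_array = np.random.randint(100, size=(100, 1))
n = 7
new_vector = []
# step through the array n elements at a time; integer indices keep slicing valid
for start in range(0, len(random_array), n):
    # the final slice is simply shorter when fewer than n elements remain
    new_vector.append(random_array[start:start + n].mean())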
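It returns a vector of the averages; the last entry is the average of whatever is left over (when the final chunk does not have the full n elements).
Hope that helps.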
Answer 3 (score: 0)
You can do it like this:
res = np.average(np.reshape(random_array, (-1, 7)), axis=1)
...assuming the size of the input array is a multiple of 7. If that can't be guaranteed, you can chop off the remainder first:
random_array.resize(random_array.size // 7 * 7)
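Since `ndarray.resize` changes the array in place, slicing off the remainder before reshaping is an alternative when the original data should stay untouched. A minimal sketch; `chunk_means_trimmed` is a hypothetical name, and unlike the answers above it discards the leftover elements rather than averaging them.

```python
import numpy as np

def chunk_means_trimmed(values, chunk=7):
    """Means of full chunks only; trailing elements that don't fill a chunk are dropped."""
    flat = np.asarray(values).ravel()
    # Largest multiple of `chunk` that fits in the array
    usable = flat.size // chunk * chunk
    # Slicing leaves the input array unchanged
    return flat[:usable].reshape(-1, chunk).mean(axis=1)

data = np.arange(10)
print(chunk_means_trimmed(data, chunk=4))  # only [0..3] and [4..7] are averaged
print(data.size)                           # the input still has all 10 elements
```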