I'm having trouble vectorizing a multidimensional function. Consider the following example:
def _cost(u):
    return u[0] - u[1]

cost = np.vectorize(_cost)
>>> x = np.random.normal(0, 1,(10, 2))
>>> cost(x)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/Users/lucapuggini/MyApps/scientific_python_3_5/lib/python3.5/site-packages/numpy/lib/function_base.py", line 2218, in __call__
    return self._vectorize_call(func=func, args=vargs)
  File "/Users/lucapuggini/MyApps/scientific_python_3_5/lib/python3.5/site-packages/numpy/lib/function_base.py", line 2281, in _vectorize_call
    ufunc, otypes = self._get_ufunc_and_otypes(func=func, args=args)
  File "/Users/lucapuggini/MyApps/scientific_python_3_5/lib/python3.5/site-packages/numpy/lib/function_base.py", line 2243, in _get_ufunc_and_otypes
    outputs = func(*inputs)
TypeError: _cost() missing 1 required positional argument: 'v'
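Background: I ran into this problem while trying to generalize the following code (a particle swarm optimization algorithm) to multivariate data: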
import numpy as np
import matplotlib.pyplot as plt


def pso(cost, sim, space_dimension, n_particles, left_lim, right_lim, f1=1, f2=1, verbose=False):

    best_scores = np.array([np.inf]*n_particles)
    best_positions = np.zeros(shape=(n_particles, space_dimension))
    particles = np.random.uniform(left_lim, right_lim, (n_particles, space_dimension))
    velocities = np.zeros(shape=(n_particles, space_dimension))

    for i in range(sim):
        particles = particles + velocities
        print(particles)
        scores = cost(particles).ravel()
        better_positions = np.argwhere(scores < best_scores).ravel()
        best_scores[better_positions] = scores[better_positions]
        best_positions[better_positions, :] = particles[better_positions, :]
        g = best_positions[np.argmin(best_scores), :]

        u1 = np.random.uniform(0, f1, (n_particles, 1))
        u2 = np.random.uniform(0, f2, (n_particles, 1))
        velocities = velocities + u1 * (best_positions - particles) + u2 * (g - particles)

        if verbose and i % 50 == 0:
            print('it=', i, ' score=', cost(g))

            x = np.linspace(-5, 20, 1000)
            y = cost(x)

            plt.plot(x, y)
            plt.plot(particles, cost(particles), 'o')
            plt.vlines(g, y.min()-2, y.max())
            plt.show()

    return g, cost(g)


def test_pso_1_dim():

    def _cost(x):
        if 0 < x < 15:
            return np.sin(x)*x
        else:
            return 15 + np.min([np.abs(x-0), np.abs(x-15)])

    cost = np.vectorize(_cost)

    sim = 100
    space_dimension = 1
    n_particles = 5
    left_lim, right_lim = 0, 15
    f1, f2 = 1, 1

    x, cost_x = pso(cost, sim, space_dimension, n_particles,
                    left_lim, right_lim, f1, f2, verbose=False)

    x0 = 11.0841839
    assert np.abs(x - x0) < 0.01
    return
If vectorizing is not a good idea in this case, please let me know.
Answer 0 (score: 1)
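As stated in the notes for vectorize:
The vectorize function is provided primarily for convenience, not for performance. The implementation is essentially a for loop.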
So, while vectorizing your code with numpy types and functions may well be a good idea, you probably shouldn't use numpy.vectorize to do it.
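As a rough illustration of what "essentially a for loop" means, the sketch below compares np.vectorize with an explicit Python loop over the elements and with a plain array expression. The name scalar_cost is made up for this example and does not come from the question.

```python
import numpy as np

def scalar_cost(x):
    # Hypothetical scalar function; np.vectorize calls it once per element
    return x * 2.0

x = np.arange(6, dtype=float).reshape(2, 3)

# np.vectorize applies scalar_cost element by element ...
vectorized = np.vectorize(scalar_cost)(x)

# ... which gives the same values as an explicit Python loop
looped = np.array([scalar_cost(v) for v in x.ravel()]).reshape(x.shape)

# A real array expression does the same work in compiled code
direct = x * 2.0
```

All three arrays hold the same values; only the last one avoids calling a Python function for every element.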
For the example you provided, your cost can be computed simply and efficiently as a function that operates on numpy arrays:
def cost(x):
    # Create the empty output
    output = np.empty(x.shape)

    # Select the first group using a boolean array
    group1 = (0 < x) & (x < 15)
    output[group1] = np.sin(x[group1])*x[group1]

    # Select second group as inverse (logical not) of group1
    output[~group1] = 15 + np.min(
        [np.abs(x[~group1]-0), np.abs(x[~group1]-15)],
        axis=0)

    return output
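One way to gain confidence in a rewrite like this is to compare it against the original scalar function on a grid of inputs. The sketch below does that; the names _cost_scalar and cost_array are chosen here for illustration, and the grid simply reuses the plotting range from the question.

```python
import numpy as np

def _cost_scalar(x):
    # Scalar version, as in the question's test_pso_1_dim
    if 0 < x < 15:
        return np.sin(x) * x
    return 15 + np.min([np.abs(x - 0), np.abs(x - 15)])

def cost_array(x):
    # Array version, following the approach in this answer
    x = np.asarray(x, dtype=float)
    output = np.empty(x.shape)
    group1 = (0 < x) & (x < 15)
    output[group1] = np.sin(x[group1]) * x[group1]
    output[~group1] = 15 + np.min(
        [np.abs(x[~group1] - 0), np.abs(x[~group1] - 15)],
        axis=0)
    return output

x = np.linspace(-5, 20, 1000)
expected = np.vectorize(_cost_scalar)(x)
result = cost_array(x)
matches = np.allclose(result, expected)

# The array version also keeps the (n_particles, 1) shape used by pso
column = cost_array(x.reshape(-1, 1))
```

If matches is True, the array version can be passed to pso in place of the np.vectorize wrapper.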
Answer 1 (score: 0)
np.vectorize feeds scalars to your function. For example:
In [1090]: def _cost(u):
      ...:     return u*2

In [1092]: cost=np.vectorize(_cost)

In [1093]: cost(np.arange(10))
Out[1093]: array([ 0,  2,  4,  6,  8, 10, 12, 14, 16, 18])

In [1094]: cost(np.ones((3,4)))
Out[1094]:
array([[ 2.,  2.,  2.,  2.],
       [ 2.,  2.,  2.,  2.],
       [ 2.,  2.,  2.,  2.]])
But your function is written as though it receives a list or array of 2 values. What did you intend?
Here is a function that takes 2 scalars:
In [1095]: def _cost(u,v):
      ...:     return u+v
      ...:

In [1096]: cost=np.vectorize(_cost)

In [1098]: cost(np.arange(3),np.arange(3,6))
Out[1098]: array([3, 5, 7])

In [1099]: cost([[1],[2]],np.arange(3,6))
Out[1099]:
array([[4, 5, 6],
       [5, 6, 7]])
Or, using your 2-column x:
In [1103]: cost(x[:,0],x[:,1])
Out[1103]:
array([-1.7291913 , -0.46343403, 0.61574928, 0.9864683 , -1.22373097,
1.01970917, 0.22862683, -0.11653917, -1.18319723, -3.39580376])
which is the same as summing the array along axis 1:
In [1104]: x.sum(axis=1)
Out[1104]:
array([-1.7291913 , -0.46343403, 0.61574928, 0.9864683 , -1.22373097,
1.01970917, 0.22862683, -0.11653917, -1.18319723, -3.39580376])
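For the function in the question, which subtracts rather than adds, the same idea can be sketched in two ways: plain arithmetic on the last axis, or np.vectorize with its signature argument so that each call receives a whole row instead of a scalar. This is only a sketch under the assumption that every row of x is one input vector; the name cost_rows and the seeded random generator are introduced here for illustration.

```python
import numpy as np

def _cost(u):
    # Function from the question: expects a length-2 vector
    return u[0] - u[1]

def cost_rows(p):
    # Hypothetical array version: works on a single vector of shape (2,)
    # and on a batch of vectors of shape (n, 2)
    p = np.asarray(p)
    return p[..., 0] - p[..., 1]

rng = np.random.default_rng(0)
x = rng.normal(0, 1, (10, 2))

# Option 1: array arithmetic, no Python-level loop
direct = cost_rows(x)

# Option 2: declare that _cost consumes a 1-D vector and returns a scalar
row_cost = np.vectorize(_cost, signature='(n)->()')
by_rows = row_cost(x)

# A single vector gives a single score
single = cost_rows(x[0])
```

Both options return one score per row. Option 2 still loops in Python, so option 1 is likely the better fit for a cost function that is called on every iteration.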