Question

I am using scipy.optimize to find the minimum of the following function:

def randomForest_b(a,b,c,d,e):
 return abs(rf_diff.predict([[a,b,c,d,e]]))

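Ultimately I want to obtain the optimal values of (a) and (b) given the parameters (c, d, e). However, just to learn how to use the optimization functions, I am first trying to obtain the optimal value of (a) with all the other parameters given. I have the following code: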

res=optimize.minimize(randomForest_b, x0=45,args=(119.908500,65.517527,2.766103,29.509200), bounds=((45,65),))
print(res)

I even tried:

optimize.fmin_slsqp(randomForest_b, x0=45,args=(119.908500,65.517527,2.766103,29.509200), bounds=((45,65),))

However, both of these just return the x0 value.

Optimization terminated successfully.    (Exit mode 0)
        Current function value: 1.5458542752157667
        Iterations: 1
        Function evaluations: 3
        Gradient evaluations: 1
array([ 45.])

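The current function value is correct, but x0 is not the value that gives the minimum function value among all the numbers within the bounds. I set the bounds because the variable a can only be a number between 45 and 65. Am I missing something or doing something wrong? And if possible, how can I obtain the optimal values of both a and b?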

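Here is a sample of the full code I am using:

from numpy import array
import scipy.optimize as optimize
from scipy.optimize import minimize
import numpy as np   # needed for np.random below
import pandas as pd  # needed for pd.DataFrame below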

a=np.random.uniform(low=4.11, high=6.00, size=(50,))
b=np.random.uniform(low=50.11, high=55.99, size=(50,))
c=np.random.uniform(low=110.11, high=120.99, size=(50,))
d=np.random.uniform(low=50.11, high=60.00, size=(50,))
pv=np.random.uniform(low=50.11, high=60.00, size=(50,))

df=pd.DataFrame(a, columns=['a'])
df['b']=b
df['c']=c
df['d']=d
df['pv']=pv
df['difference']=df['pv']-df['d']

from sklearn.model_selection import train_test_split 
y=df.loc[:, 'difference']
x=df.iloc[:, [0,1,2,3]]
x_train, x_test, y_train, y_test = train_test_split(x, y, test_size=0.25)

from sklearn.ensemble import RandomForestRegressor
rf_difference = RandomForestRegressor(n_estimators=1000, oob_score=True,
                                      random_state=0)
rf_difference.fit(x_train, y_train) 

def randomForest_b(a,b,c,d):
    return abs(rf_difference.predict([[a,b,c,d]]))

res = optimize.minimize(randomForest_b, x0=0,
                        args=(51.714088, 110.253656, 54.582179),
                        bounds=((0, 6),))
print(res)

optimize.fmin_slsqp(randomForest_b, x0=0,
                    args=(51.714088, 110.253656, 54.582179),
                    bounds=((0, 6),))

Answer 1

The function you are trying to minimize is not smooth and also has several plateaus. You can see this by plotting randomForest_b as a function of a:

import matplotlib.pyplot as plt

a = np.linspace(0,6,500)
args = 51.714088,110.253656,54.582179
vrandomForest_b = np.vectorize(randomForest_b,excluded=[1,2,3])
y_values = vrandomForest_b(a, *args)

fig, ax = plt.subplots(figsize=(8,6))
ax.plot(a, y_values, label='randomForest_b')
ax.axvline(0, label='Your start value', color='g', ls='--')
ax.set(xlabel='a', ylabel='randomForest_b');
ax.legend()

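For non-smooth functions like yours, gradient-based optimization techniques will almost certainly fail. In this case, the start value of 0 lies on a plateau where the gradient vanishes, so the optimization finishes immediately after one iteration.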
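The stall on a plateau can be reproduced without a random forest. The following is a minimal sketch that assumes a toy piecewise-constant objective; step_objective and x_start are hypothetical names used only for illustration, not part of the code in the question.

```python
import numpy as np
from scipy import optimize

def step_objective(x):
    """Toy stand-in for a tree-based model: constant on each unit interval."""
    x0 = float(np.atleast_1d(x)[0])
    return abs(np.floor(x0) - 3.0)

x_start = 0.5  # lies on the flat step [0, 1)

# With bounds and no explicit method, minimize uses a gradient-based solver
# whose gradient is estimated by finite differences.
res = optimize.minimize(step_objective, x0=x_start, bounds=[(0, 6)])

# The same kind of finite-difference slope, computed by hand: a tiny step
# stays on the same plateau, so the estimated slope is exactly zero.
eps = 1e-8
slope = (step_objective(x_start + eps) - step_objective(x_start)) / eps

print("estimated slope at start:", slope)
print("returned x:", res.x, "function value:", res.fun)
```

Because the estimated slope is zero, the solver sees a stationary point and returns the start value, even though the toy function is 0 for any x in [3, 4).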

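One solution is to use a non-gradient-based optimization method, for example stochastic minimization with scipy.optimize.differential_evolution. A caveat of these methods is that they usually require more function evaluations and can take longer to finish.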
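To get a rough feel for the cost caveat, the sketch below compares the number of function evaluations used by the two approaches. It assumes a toy piecewise-constant objective (step_objective is a hypothetical stand-in, not the random forest), so the counts only illustrate the trend and say nothing about the actual model.

```python
import numpy as np
from scipy import optimize

def step_objective(x):
    """Toy stand-in for a tree-based model: constant on each unit interval."""
    x0 = float(np.atleast_1d(x)[0])
    return abs(np.floor(x0) - 3.0)

bounds = [(0, 6)]

# Gradient-based local search: stops almost immediately on the plateau.
local = optimize.minimize(step_objective, x0=0.5, bounds=bounds)

# Population-based stochastic search: evaluates many candidate points.
de = optimize.differential_evolution(step_objective, bounds=bounds, seed=0)

print("local search: nfev =", local.nfev, "fun =", local.fun)
print("diff. evol.:  nfev =", de.nfev, "fun =", de.fun)
```

With an expensive predict call, such as a forest with 1000 trees, the larger evaluation count is what makes the stochastic method slower.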

For the example case given in your question, this optimization method is able to find the global minimum:

rslt = optimize.differential_evolution(vrandomForest_b,
                                       args=(51.714088,110.253656,54.582179), 
                                       bounds=[(0,6)])
print(rslt)

fig, ax = plt.subplots()
ax.plot(a, y_values, label='randomForest_b')
ax.axvline(rslt.x, label='Minimum', color='red', ls='--')
ax.legend()

 fun: 0.054257768073620746 
 message: 'Optimization terminated successfully.'
 nfev: 152
 nit: 9
 success: True
 x: array([5.84335956])
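The question also asks how to optimize a and b together. One possible approach, sketched here under assumptions, is to pass both as a two-element vector with one bounds pair per variable and keep c and d fixed through args. In this sketch toy_model is a hypothetical stand-in for abs(rf_difference.predict([[a, b, c, d]])), and the bounds for b are assumed for illustration.

```python
import numpy as np
from scipy import optimize

def toy_model(a, b, c, d):
    """Hypothetical stand-in for abs(rf_difference.predict([[a, b, c, d]]))."""
    return float(abs(np.floor(a) - 3.0) + abs(np.floor(b) - 52.0)
                 + 0.01 * abs(c - d))

def objective(x, c, d):
    # The optimizer passes the free variables as one vector x = [a, b].
    a, b = x
    return toy_model(a, b, c, d)

fixed = (110.253656, 54.582179)   # c and d are held constant
bounds = [(0, 6), (50, 56)]       # (a range), (b range); b range is assumed

result = optimize.differential_evolution(objective, bounds=bounds,
                                         args=fixed, seed=0)
best_a, best_b = result.x
print("a =", best_a, "b =", best_b, "objective =", result.fun)
```

To use this with the real model, the body of toy_model would be replaced by the predict call; because the surface has plateaus, any point on the lowest plateau is an equally good answer.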

Scipy Optimize only returns x0 and completes only one iteration
