How do I add an aggregate error to a Keras model? I have a table:
   g  x  y
0  1  1  1
1  1  2  2
2  1  3  3
3  2  1  2
4  2  2  1
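I want to minimize the per-sample error sum((y - y_pred) ** 2) together with the per-group error sum((sum(y) - sum(y_pred)) ** 2).
I'm fine with larger individual sample errors, but it is crucial for me to get the totals right.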
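To make the two error terms concrete, here is a small NumPy sketch that evaluates them on the table above. The helper name `combined_error` and the candidate prediction `y_pred = x` are assumptions chosen for illustration, not part of the original question.

```python
import numpy as np

g = np.array([1, 1, 1, 2, 2])
x = np.array([1, 2, 3, 1, 2], dtype=float)
y = np.array([1, 2, 3, 2, 1], dtype=float)

def combined_error(y_true, y_pred, groups):
    """Return (per-sample squared error, per-group squared error of sums)."""
    sample_error = np.sum((y_true - y_pred) ** 2)
    # bincount with weights sums the values that share a group id
    group_true = np.bincount(groups, weights=y_true)
    group_pred = np.bincount(groups, weights=y_pred)
    group_error = np.sum((group_true - group_pred) ** 2)
    return sample_error, group_error

# Hypothetical candidate model: y_pred = x
sample_error, group_error = combined_error(y, x, g)
print(sample_error, group_error)
```

With this candidate the samples in group 2 are each off by one, yet both group totals are exact, which is the kind of trade-off the question is willing to accept.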
SciPy example:
import pandas as pd
from scipy.optimize import differential_evolution

df = pd.DataFrame({'g': [1, 1, 1, 2, 2], 'x': [1, 2, 3, 1, 2], 'y': [1, 2, 3, 2, 1]})
g = df.groupby('g')

def linear(pars, fit=False):
    a, b = pars
    df['y_pred'] = a + b * df['x']
    if fit:
        # per-sample, per-group and grand-total squared errors
        sample_errors = sum((df['y'] - df['y_pred']) ** 2)
        group_errors = sum((g['y'].sum() - g['y_pred'].sum()) ** 2)
        total_error = sum(df['y'] - df['y_pred']) ** 2
        return sample_errors + group_errors + total_error
    else:
        return df['y_pred']

# args is a tuple of extra positional arguments, so this passes fit=True
pars = differential_evolution(linear, [[0, 10]] * 2, args=(True,))['x']
linear(pars)  # recompute y_pred with the fitted parameters before printing
print('SAMPLES:\n', df, '\nGROUPS:\n', g.sum(), '\nTOTALS:\n', df.sum())
Output:
SAMPLES:
    g  x  y  y_pred
0   1  1  1   1.232
1   1  2  2   1.947
2   1  3  3   2.662
3   2  1  2   1.232
4   2  2  1   1.947
GROUPS:
    x  y  y_pred
g
1   6  6   5.841
2   3  3   3.179
TOTALS:
g         7.000
x         9.000
y         9.000
y_pred    9.020
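Because the model here is linear in a and b and all three error terms are sums of squares, the same objective can also be written as one ordinary least-squares problem on stacked rows (sample rows, group-sum rows, and a grand-total row). The following is a sketch of that reformulation, not something from the original post; the matrix names `X`, `S`, `A` and `t` are assumptions.

```python
import numpy as np

g = np.array([1, 1, 1, 2, 2])
x = np.array([1, 2, 3, 1, 2], dtype=float)
y = np.array([1, 2, 3, 2, 1], dtype=float)

# Per-sample rows of the model y_pred = a + b * x, i.e. [1, x] @ [a, b].
X = np.column_stack([np.ones_like(x), x])

# One indicator row per group, plus one all-ones row for the grand total.
# Multiplying by S sums rows of X (and entries of y) within each group.
S = np.vstack([(g == k).astype(float) for k in np.unique(g)] + [np.ones_like(x)])

A = np.vstack([X, S @ X])        # sample rows, group-sum rows, total row
t = np.concatenate([y, S @ y])   # matching targets

# Minimizing ||A @ [a, b] - t||^2 is the sample + group + total objective.
(a, b), *_ = np.linalg.lstsq(A, t, rcond=None)
y_pred = a + b * x
print(round(a, 3), round(b, 3))
print(np.round(y_pred, 3))
```

Under these assumptions the fitted predictions should agree with the y_pred column above. The trick only works for models that are linear in their parameters; for a neural network the loss has to be minimized iteratively.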
Answer 0 (score: 3)
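As for grouping, as long as you keep the same groups throughout training, your loss function will not run into differentiability problems.
A naive way to group is simply to split the data into batches, one batch per group.
I suggest using a generator for that.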
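# suppose you have these three NumPy arrays:
# gTrain (group ids), xTrain (inputs), yTrain (targets)

# create this generator
def grouper(g, x, y):
    while True:
        # one batch per group; group ids are assumed to run from 1 to g.max()
        for gr in range(1, g.max() + 1):
            indices = g == gr
            yield (x[indices], y[indices])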
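To see what such a generator yields, the sketch below runs it on the question's data and pulls the first three batches. The array shapes (one feature column, one target column) are an assumption about how the data would be fed to a model.

```python
import itertools
import numpy as np

def grouper(g, x, y):
    while True:
        for gr in range(1, g.max() + 1):
            indices = g == gr          # boolean mask selecting one group
            yield x[indices], y[indices]

gTrain = np.array([1, 1, 1, 2, 2])
xTrain = np.array([[1.0], [2.0], [3.0], [1.0], [2.0]])
yTrain = np.array([[1.0], [2.0], [3.0], [2.0], [1.0]])

# The generator is infinite, so take a finite slice of it.
batches = list(itertools.islice(grouper(gTrain, xTrain, yTrain), 3))
for xb, yb in batches:
    print(xb.ravel(), yb.ravel())
```

Each batch holds exactly one group, so batch sizes differ (three samples, then two), and after the last group the generator wraps around to the first one again.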
For the loss function, you can write your own:
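import keras.backend as K

def customLoss(yTrue, yPred):
    # per-sample squared error plus the squared difference of the batch (group) sums
    return K.sum(K.square(yTrue - yPred)) + K.square(K.sum(yTrue) - K.sum(yPred))

model.compile(loss=customLoss, ....)

Be careful with the second term: it is squared here, as in the question. Left as a plain difference of sums, it turns negative whenever the predictions overshoot the group total, which rewards overshooting instead of penalizing it.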
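To see why the sign of the second term matters, here is a NumPy mirror of the loss for a single group batch, comparing a plain difference of sums with a squared one. The function names `loss_plain` and `loss_squared` and the overshooting predictions are hypothetical, chosen only to illustrate the effect.

```python
import numpy as np

def loss_plain(y_true, y_pred):
    # second term is a signed difference of sums
    return np.sum((y_true - y_pred) ** 2) + (np.sum(y_true) - np.sum(y_pred))

def loss_squared(y_true, y_pred):
    # second term is squared, so it is never negative
    return np.sum((y_true - y_pred) ** 2) + (np.sum(y_true) - np.sum(y_pred)) ** 2

y_true = np.array([2.0, 1.0])   # targets of group 2 from the question
y_over = np.array([2.5, 1.5])   # hypothetical predictions, each too high by 0.5

plain = loss_plain(y_true, y_over)
squared = loss_squared(y_true, y_over)
print(plain, squared)
```

With the signed version, the overshooting predictions score lower than a perfect prediction (whose loss is zero), so the optimizer would be pulled away from the true totals. The squared version penalizes deviations in either direction.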
Now train with the fit_generator method (deprecated in recent Keras releases, where model.fit accepts a generator directly):
model.fit_generator(grouper(gTrain,xTrain, yTrain), steps_per_epoch=gTrain.max(), epochs=...)