Custom loss function: Length of values does not match length of index

Asked: 2019-03-11 21:24:29

Tags: python machine-learning scikit-learn decision-tree loss-function

For gradient boosted decision trees, I implemented a custom loss function that looks like this (and works):

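import numpy as np
from sklearn.preprocessing import OneHotEncoder

def softmax(mat):
    # Row-wise softmax over the class scores
    res = np.exp(mat)
    res = np.multiply(res, 1/np.sum(res, axis=1, keepdims=True))
    return res

def custom_asymmetric_objective(y_true, y_pred_encoded):
    # Flat predictions -> (n_samples, 3), one column per class
    pred = y_pred_encoded.reshape((-1, 3), order='F')
    pred = softmax(pred)
    # One-hot encode the true labels so they match the shape of pred
    y_true = OneHotEncoder(sparse=False, categories='auto').fit_transform(y_true.reshape(-1, 1))
    grad = (pred - y_true).astype("float")
    hess = 2.0 * pred * (1.0-pred)
    # Flatten back to the column-major layout of the input
    return grad.flatten('F'), hess.flatten('F')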
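
To make the layout in the objective above concrete, here is a minimal sketch of what the `order='F'` reshape and the matching `flatten('F')` do. It assumes the booster hands over the predictions as one flat vector grouped class by class, which is what the reshape implies; the sizes and the `flat` array are made up for illustration.

```python
import numpy as np

n_samples, n_classes = 4, 3  # illustrative sizes, not taken from the question

# Flat score vector grouped class by class (column-major), the layout
# that reshape(..., order='F') in the objective assumes.
flat = np.arange(n_samples * n_classes, dtype=float)

scores = flat.reshape((-1, n_classes), order='F')
# Row i now holds the three class scores of sample i.
print(scores.shape)  # (4, 3)
print(scores[0])     # [0. 4. 8.]

# flatten('F') inverts the reshape, so grad and hess are returned
# in the same order in which the predictions arrived.
round_trip = scores.flatten('F')
```

The point to take away is that the flat vector has `n_samples * n_classes` entries, while the reshaped matrix has only `n_samples` rows.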

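Now I want to add something to the objective function. It is computed by taking an existing DataFrame and adding a column to it, which is then included in the loss function: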

def custom_asymmetric_objective(y_true, y_pred_encoded):
    pred = y_pred_encoded.reshape((-1, 3), order='F')
    pred = softmax(pred)
    y_true = OneHotEncoder(sparse=False, categories='auto').fit_transform(y_true.reshape(-1, 1))
    # Calculate beta for each item in the test data
    df2 = df.drop(['h', 'b','Label','w'], axis=1)
    betadf = df2.join(y_test, how = "right")
    betadf['pred']=y_pred_encoded
    overallmu = betadf['mu'].sum()
    betadf['w'] = (betadf['mu']/overallmu)
    label2value = {1: 0.11722, 2: 0.0124}
    factors = betadf['pred'].map(lambda n: label2value.get(n, 0.003))
    betadf['beta'] = betadf['w'] * (1 - ((betadf['sdL'] * factors) / betadf['muL']))
    # Calculate the deviation between beta and the average beta for each item
    average = 0.95/153
    betadf['penalty'] = 0
    betadf['penalty'].where(betadf['beta']-average > 0, average-betadf['beta'], inplace=True)
    pen = betadf['penalty']
    # Get pen into the same shape as y_true
    pen = OneHotEncoder(sparse=False, categories='auto').fit_transform(pen.reshape(-1, 1))
    grad = (pred - y_true + pen).astype("float")
    hess = 2.0 * pred * (1.0-pred)
    return grad.flatten('F'), hess.flatten('F')

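If I run the function, I get the error "Length of values does not match length of index". I checked "pen" separately and it looks fine, so I don't know where this error comes from.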
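
One possible source of the message, sketched under assumptions: `y_pred_encoded` is a flat vector with three scores per sample, while `betadf` has one row per sample, so `betadf['pred'] = y_pred_encoded` would try to store three times as many values as there are rows. The DataFrame contents and sizes below are hypothetical stand-ins, and the `argmax` step is only one way to get a single value per row; how column indices map to the labels used in `label2value` is not stated in the question.

```python
import numpy as np
import pandas as pd

n_samples, n_classes = 4, 3  # illustrative sizes

# Hypothetical stand-in for betadf: one row per sample.
betadf = pd.DataFrame({"mu": [1.0, 2.0, 3.0, 4.0]})

# Flat raw scores with n_samples * n_classes entries, as the objective receives them.
y_pred_encoded = np.random.default_rng(0).normal(size=n_samples * n_classes)

try:
    betadf["pred"] = y_pred_encoded  # 12 values for a 4-row frame
except ValueError as exc:
    error_message = str(exc)
    print(error_message)

# One possible fix: reduce the scores to one predicted class index per row.
scores = y_pred_encoded.reshape((-1, n_classes), order="F")
betadf["pred"] = scores.argmax(axis=1)
```

If this is indeed the cause, the later `pen.reshape(-1, 1)` may need attention as well, since `pen` is a pandas Series and recent pandas versions do not offer `Series.reshape`; `pen.to_numpy().reshape(-1, 1)` would be the usual substitute.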

0 answers:

No answers