I have defined a loss function for TensorFlow as follows:
import numpy as np
import tensorflow as tf


def convert_sparse_matrix_to_sparse_tensor(X):
    # Convert a SciPy sparse matrix to a tf.SparseTensor via its COO form.
    coo = X.tocoo()
    indices = np.mat([coo.row, coo.col]).transpose()
    return tf.SparseTensor(indices, coo.data, coo.shape)


def loss(pred, comparison_matrix):
    """
    :param pred: Nx1 tensor of prediction scores sorted based on the real experimental values
    :param comparison_matrix: an MxN sparse matrix of 0, 1 and -1 for pairwise value comparison of predictions. E.g.
        array([[ 1., -1.,  0.,  0.,  0.,  0.],
               [ 1.,  0., -1.,  0.,  0.,  0.],
               [ 1.,  0.,  0., -1.,  0.,  0.],
               [ 1.,  0.,  0.,  0., -1.,  0.],
               [ 1.,  0.,  0.,  0.,  0., -1.],
               [ 0.,  1., -1.,  0.,  0.,  0.],
               [ 0.,  1.,  0., -1.,  0.,  0.],
               [ 0.,  1.,  0.,  0., -1.,  0.],
               [ 0.,  1.,  0.,  0.,  0., -1.],
               [ 0.,  0.,  1., -1.,  0.,  0.],
               [ 0.,  0.,  1.,  0., -1.,  0.],
               [ 0.,  0.,  1.,  0.,  0., -1.],
               [ 0.,  0.,  0.,  1., -1.,  0.],
               [ 0.,  0.,  0.,  1.,  0., -1.],
               [ 0.,  0.,  0.,  0.,  1., -1.]])
    :return: (1 - tau) / 2, where tau = (C - D) / (C + D)
    """
    N = tf.size(pred)  # number of observations
    M = comparison_matrix.shape[0]  # number of comparisons
    sparse_comparison_tensor = convert_sparse_matrix_to_sparse_tensor(comparison_matrix)
    # Pairwise differences, shifted slightly so that no entry is exactly zero.
    pairdiff_tensor = tf.sparse_tensor_dense_matmul(sparse_comparison_tensor, tf.transpose(pred)) - 0.000001
    # c is -1 where the difference is negative and 0 where it is positive.
    c = (pairdiff_tensor / tf.abs(pairdiff_tensor) - 1.0) / 2.0
    C = -1.0 * tf.reduce_sum(c)
    # d is 1 where the difference is positive and 0 where it is negative.
    d = (pairdiff_tensor / tf.abs(pairdiff_tensor) + 1.0) / 2.0
    D = tf.reduce_sum(d)
    tau = (C - D) / (C + D)
    tau = tf.truediv(tf.subtract(1.0, tau), 2.0)
    return tau
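I described the input parameters in the docstring, but in my actual problem they have much larger dimensions. For example: pred.shape = [774, 1] and comparison_matrix.shape = [2227, 774].
The sparse comparison matrix was created with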
scipy.sparse.csr_matrix(comparison_matrix, dtype=np.float32)
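For reference, a comparison matrix with the layout shown in the docstring could be built roughly as follows. This is only a sketch: `build_comparison_matrix` is a hypothetical helper, and it assumes every pair (i, j) with i < j is compared, which matches the 6-column example but not necessarily the real 2227x774 matrix, which has far fewer rows than all pairs would give.

```python
from itertools import combinations

import numpy as np


def build_comparison_matrix(n):
    """Hypothetical helper: one row per pair (i, j) with i < j, +1 at column i, -1 at column j."""
    pairs = list(combinations(range(n), 2))
    matrix = np.zeros((len(pairs), n), dtype=np.float32)
    for row, (i, j) in enumerate(pairs):
        matrix[row, i] = 1.0
        matrix[row, j] = -1.0
    return matrix


# Reproduces the 15x6 layout from the docstring example.
comparison_matrix = build_comparison_matrix(6)
```

The dense array produced here can then be passed to `scipy.sparse.csr_matrix` as in the line above.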
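With this definition, the op runs smoothly. However, when I add a weight vector as shown in the definition below, the op becomes about 100 times slower!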
weights.shape = [1,2227]
def weighted_loss(pred, comparison_matrix, weights):
    """
    :param pred: Nx1 tensor of prediction scores sorted based on the real experimental values
    :param comparison_matrix: an MxN sparse matrix of 0, 1 and -1 for pairwise value comparison of predictions. E.g.
        array([[ 1., -1.,  0.,  0.,  0.,  0.],
               [ 1.,  0., -1.,  0.,  0.,  0.],
               [ 1.,  0.,  0., -1.,  0.,  0.],
               [ 1.,  0.,  0.,  0., -1.,  0.],
               [ 1.,  0.,  0.,  0.,  0., -1.],
               [ 0.,  1., -1.,  0.,  0.,  0.],
               [ 0.,  1.,  0., -1.,  0.,  0.],
               [ 0.,  1.,  0.,  0., -1.,  0.],
               [ 0.,  1.,  0.,  0.,  0., -1.],
               [ 0.,  0.,  1., -1.,  0.,  0.],
               [ 0.,  0.,  1.,  0., -1.,  0.],
               [ 0.,  0.,  1.,  0.,  0., -1.],
               [ 0.,  0.,  0.,  1., -1.,  0.],
               [ 0.,  0.,  0.,  1.,  0., -1.],
               [ 0.,  0.,  0.,  0.,  1., -1.]])
    :param weights: 1xM array that has the coefficients for each comparison
    :return: (1 - tau) / 2, where tau = (C - D) / (C + D)
    """
    N = tf.size(pred)  # number of observations
    M = comparison_matrix.shape[0]  # number of comparisons
    weights /= weights.sum()  # weights must sum to 1.0 (modifies the caller's array in place)
    weights = tf.constant(weights, dtype=tf.float32)
    sparse_comparison_tensor = convert_sparse_matrix_to_sparse_tensor(comparison_matrix)
    # Pairwise differences, shifted slightly so that no entry is exactly zero.
    pairdiff_tensor = tf.sparse_tensor_dense_matmul(sparse_comparison_tensor, tf.transpose(pred)) - 0.000001
    c = (pairdiff_tensor / tf.abs(pairdiff_tensor) - 1.0) / 2.0
    C = -1.0 * tf.reduce_sum(weights * c)
    d = (pairdiff_tensor / tf.abs(pairdiff_tensor) + 1.0) / 2.0
    D = tf.reduce_sum(weights * d)
    tau = (C - D) / (C + D)
    tau = tf.truediv(tf.subtract(1.0, tau), 2.0)
    return tau
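As you can see, the weights are multiplied with c and d, which are Mx1 tensors resulting from the multiplication of a sparse (MxN) tensor with a dense (Nx1) tensor. I tried using integer weights (instead of float32, as they are now) and converting sparse_comparison_tensor to a dense_comparison_tensor, but the performance was still terrible! I have run out of ideas. Does the slowdown have to do with the derivative of the weighted function rather than with the data types (sparse or dense, float32 or int)? Maybe a more experienced programmer can put me out of my misery.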
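One thing that may be worth checking with the shapes stated above is what an elementwise product of a 1xM array and an Mx1 array actually produces. The NumPy sketch below uses stand-in values (uniform weights and a constant `c`, both assumptions for illustration); TensorFlow's elementwise multiply follows the same broadcasting rules. Whether this accounts for the observed slowdown is not established here.

```python
import numpy as np

M = 2227  # number of comparisons, as in the question

# Stand-in data: uniform weights with shape (1, M) and a constant c with shape (M, 1).
weights = np.full((1, M), 1.0 / M, dtype=np.float32)
c = -np.ones((M, 1), dtype=np.float32)

# (1, M) * (M, 1) broadcasts to (M, M): every weight is paired with every entry of c.
broadcast_product = weights * c

# Reshaping the weights to (M, 1) keeps the product at one value per comparison.
aligned_product = weights.reshape(M, 1) * c

print(broadcast_product.shape, aligned_product.shape)
```

With M = 2227 the broadcast result holds 2227 * 2227 = 4,959,529 elements instead of 2227, and its sum equals the sum of the weights times the sum of c rather than a per-comparison weighted sum.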
Thanks.