Computing the gradient of a custom loss function with tf.GradientTape

Date: 2021-03-23 06:38:17

Tags: tensorflow

I am trying to train a network with a custom loop using the tf.GradientTape method. The training is unsupervised. The details of the network and the cost function are as follows. My network is:

import numpy as np
from tensorflow import keras
from tensorflow.keras.layers import Dense
from tensorflow.keras import initializers

def CreateNetwork(inplayer, hidlayer, outlayer, seed):
    # tanh hidden layer + linear output; weights drawn from N(0, 1/sqrt(fan_in))
    model = keras.Sequential()
    model.add(Dense(hidlayer, input_dim=inplayer, kernel_initializer=initializers.RandomNormal(mean=0.0, stddev=1/np.sqrt(inplayer), seed=seed), bias_initializer=initializers.RandomNormal(mean=0.0, stddev=1/np.sqrt(inplayer), seed=seed), activation='tanh'))
    model.add(Dense(outlayer, kernel_initializer=initializers.RandomNormal(mean=0.0, stddev=1/np.sqrt(hidlayer), seed=seed), bias_initializer=initializers.Zeros(), activation='linear'))
    return model
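
For concreteness, a minimal sketch of how this builder might be called (the sizes and seed below are hypothetical, not values from the post):

# Hypothetical layer sizes and seed, for illustration only
model = CreateNetwork(inplayer=5, hidlayer=40, outlayer=2, seed=1234)
model.summary()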

My custom cost function is defined as:

def H_tilda(J, U, nsamples, nsites, configs, out_matrix):
    # Averages (sum over n' of H[config, n'] * phi(n')) / phi(config)
    # over all sampled configurations.
    EigenValue = 0.0
    for k in range(nsamples):
        config = configs[k,:]
        out_n = out_matrix[k,:]
        exp = 0.0
        for i in range(nsamples):
            n = configs[i,:]
            out_nprime = out_matrix[i,:]
            #------------------------------------------------------------------------------------------------
            #    Calculation of Hopping Term
            #------------------------------------------------------------------------------------------------
            hop = 0.0
            for j in range(nsites):
                # Left/right neighbours of site j, with periodic boundaries
                if j == 0:
                    nbr = [nsites - 1, j + 1]
                elif j == (nsites - 1):
                    nbr = [j - 1, 0]
                else:
                    nbr = [j - 1, j + 1]
                if n[nbr[0]] != 0:
                    annihilate1 = np.sqrt(n[nbr[0]])
                    n1 = np.copy(n)
                    n1[nbr[0]] = n1[nbr[0]] - 1
                    n1[j] = n1[j] + 1
                    delta1 = 1 if (config == n1).all() else 0
                else:
                    annihilate1 = 0
                    n1 = np.zeros(nsites)
                    delta1 = 0
                if n[nbr[1]] != 0:
                    annihilate2 = np.sqrt(n[nbr[1]])
                    n2 = np.copy(n)
                    n2[nbr[1]] = n2[nbr[1]] - 1
                    n2[j] = n2[j] + 1
                    delta2 = 1 if (config == n2).all() else 0
                else:
                    annihilate2 = 0
                    n2 = np.zeros(nsites)
                    delta2 = 0
                create = np.sqrt(n[j] + 1)
                hop = hop + create * (annihilate1 * delta1 + annihilate2 * delta2)
            #------------------------------------------------------------------------------------------------
            
            
            #------------------------------------------------------------------------------------------------
            #    Calculation of Onsite Term
            #------------------------------------------------------------------------------------------------
            if (config == n).all():
                ons = np.sum(np.dot(np.square(n),n - 1))
            else:
                ons = 0.0
            #------------------------------------------------------------------------------------------------
            phi_value = phi(out_nprime.numpy())
            exp = exp + ((hop + ons) * phi_value)
        Phi_value = phi(out_n.numpy())
        EigenValue = EigenValue + exp/Phi_value
    return np.real(EigenValue/nsamples)
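
Note that phi is not defined anywhere in the post. For the function above to run, phi must map one row of network outputs to a (possibly complex) amplitude; a purely hypothetical stand-in, assuming the two outputs parameterize a log-amplitude and a phase, might look like:

def phi(out_row):
    # Hypothetical placeholder: treat output 0 as a log-amplitude and
    # output 1 as a phase. The real definition is not given in the question.
    return np.exp(out_row[0] + 1j * out_row[1])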

I want to carry out custom training with the GradientTape method, for which I use the following lines:

import tensorflow as tf
from tensorflow.keras import optimizers

optimizer = optimizers.SGD(learning_rate=1e-3)
with tf.GradientTape(watch_accessed_variables=False) as tape:
    tape.watch(tf.convert_to_tensor(configs))   # watch the input configurations
    out_matrix = model(configs)                 # forward pass through the network
    print(out_matrix)
    eival = H_tilda(J, U, nsamples, nsites, configs, out_matrix)
    print(eival)
gradients = tape.gradient(tf.convert_to_tensor(eival), model.trainable_weights)
print(gradients)

But the gradients I get back are None:

output: [None, None, None, None]
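
For context, tf.GradientTape only differentiates through TensorFlow operations it has recorded; any value that takes a round trip through NumPy (as the .numpy() calls inside H_tilda do) is disconnected from the tape, and tape.gradient returns None for a target it cannot trace back to the watched variables. A minimal sketch (not code from the post) that reproduces the symptom:

import tensorflow as tf

w = tf.Variable(2.0)
with tf.GradientTape(persistent=True) as tape:
    y_tf = w * w              # recorded on the tape
    y_np = (w * w).numpy()    # leaves TensorFlow; the tape cannot track this
print(tape.gradient(y_tf, w))                        # tf.Tensor(4.0, shape=(), dtype=float32)
print(tape.gradient(tf.convert_to_tensor(y_np), w))  # None

Once the gradients come back non-None, they would be applied with the usual pattern optimizer.apply_gradients(zip(gradients, model.trainable_weights)).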

0 Answers

No answers yet.