Question

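I am trying to use logistic regression on a dataset with 15 numeric features and 4238 rows. The computed cost starts at 415.91 and converges after dropping only to 220.119. I think something must be wrong, but I am not sure what, so I am sharing my code in the hope that someone can point out what is incorrect and might be causing the problem. I would really appreciate your advice and experience!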

import tensorflow as tf
import pandas as pd
from sklearn.model_selection import train_test_split


# DataFrame.from_csv has been removed from pandas; read_csv is the replacement
dataset = pd.read_csv('framingham_heart_disease.csv')
print(dataset.shape)
dataX, dataY = dataset.iloc[:,:-1], dataset.iloc[:,-1:]
dataX = dataX.values/50
dataY = dataY.values

trainX, testX, trainY, testY = train_test_split(dataX, dataY, test_size=0.20, random_state=42)

numTrainData = trainX.shape[0]
numFeatures = trainX.shape[1]
numLabels = trainY.shape[1]

X = tf.placeholder(tf.float32, [numTrainData,numFeatures])
yExpected = tf.placeholder(tf.float32, [numTrainData, numLabels])

tf.set_random_seed(1)
weights = tf.Variable(tf.random_normal([numFeatures,numLabels],
                                       mean=0,
                                       stddev=0.01,
                                       name="weights"))
bias = tf.Variable(tf.random_normal([1,numLabels],
                                    mean=0,
                                    stddev=0.01,
                                    name="bias"))

apply_weights_OP = tf.matmul(X, weights, name="apply_weights")
# Replace any NaN produced by the matrix product with 0
weights_after_nan = tf.where(tf.is_nan(apply_weights_OP), tf.zeros_like(apply_weights_OP), apply_weights_OP)
add_bias_OP = tf.add(weights_after_nan, bias, name="add_bias")
activation_OP = tf.nn.sigmoid(add_bias_OP, name="activation")

learningRate = tf.train.exponential_decay(learning_rate=0.0001,
                                          global_step= 1,
                                          decay_steps=trainX.shape[0],
                                          decay_rate= 0.95,
                                          staircase=True)
cost_OP = tf.nn.l2_loss(activation_OP-yExpected, name="squared_error_cost")
training_OP = tf.train.GradientDescentOptimizer(learningRate).minimize(cost_OP)


sess = tf.Session()
init_OP = tf.global_variables_initializer()
sess.run(init_OP)

numEpochs = 3000
cost = 0.0
diff = 1
epoch_values = []
accuracy_values = []
cost_values = []

for i in range(numEpochs):
    if i > 1 and diff < .0001:
        print("change in cost %g; convergence."%diff)
        break
    else:
        step = sess.run(training_OP, feed_dict={X: trainX, yExpected: trainY})
        # Report occasional stats
        if i % 100 == 0:
            # Add epoch to epoch_values
            epoch_values.append(i)
            # Evaluate the cost on the training data
            newCost = sess.run(cost_OP, feed_dict={X: trainX, yExpected: trainY})
            # Add cost to live graphing variable
            cost_values.append(newCost)
            # Re-assign values for variables
            diff = abs(newCost - cost)
            cost = newCost

            # Print progress
            print("step %d, cost %g, change in cost %g"%(i, newCost, diff))
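One thing that may be worth checking: the tf.where(tf.is_nan(...)) line suggests the feature matrix could contain missing values, and dividing every column by 50 leaves the features on different scales. Below is a minimal sketch of an alternative, assuming rows with missing values can simply be dropped and that per-column standardization is acceptable. `clean_and_standardize` is a hypothetical helper that is not part of the code above, and the toy array is made-up data standing in for the real CSV.

```python
import numpy as np

def clean_and_standardize(features, labels):
    """Drop rows containing NaN, then scale each column to zero mean and unit variance."""
    features = np.asarray(features, dtype=np.float64)
    labels = np.asarray(labels, dtype=np.float64)
    # Keep only rows where neither the features nor the label contain NaN
    keep = ~np.isnan(features).any(axis=1) & ~np.isnan(labels).any(axis=1)
    features, labels = features[keep], labels[keep]
    mean = features.mean(axis=0)
    std = features.std(axis=0)
    std[std == 0] = 1.0  # avoid division by zero for constant columns
    return (features - mean) / std, labels

# Toy stand-in data (made up for illustration, not taken from the dataset)
toy_X = np.array([[39.0, 195.0], [46.0, np.nan], [48.0, 245.0], [61.0, 225.0]])
toy_y = np.array([[0.0], [0.0], [1.0], [1.0]])
clean_X, clean_y = clean_and_standardize(toy_X, toy_y)
```

In a real run the mean and standard deviation would normally be computed on the training split only and then reused for the test split.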

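I expected better convergence at a lower cost, but I got:

step 0, cost 415.91, change in cost 415.91
step 100, cost 229.459, change in cost 186.45
step 200, cost 221.717, change in cost 7.74254
step 300, cost 220.504, change in cost 1.2124
step 400, cost 220.225, change in cost 0.279007
step 500, cost 220.15, change in cost 0.0752258
step 600, cost 220.127, change in cost 0.022522
step 700, cost 220.121, change in cost 0.00689697
step 800, cost 220.119, change in cost 0.00166321
step 900, cost 220.119, change in cost 6.10352e-05
change in cost 6.10352e-05; convergence.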
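A note on the scale of these numbers, as a rough sketch rather than a diagnosis: tf.nn.l2_loss(t) computes sum(t ** 2) / 2 over all examples, so the reported cost is a sum, not a mean. The snippet below converts the reported costs to a per-example squared error. It assumes 3390 training rows (80% of 4238 after the split); `per_example_mse` and `constant_prediction_mse` are hypothetical helper names introduced only for this illustration.

```python
def per_example_mse(l2_cost, num_examples):
    """tf.nn.l2_loss(t) is sum(t ** 2) / 2, so undo the 1/2 and the sum."""
    return 2.0 * l2_cost / num_examples

def constant_prediction_mse(positive_rate):
    """Squared error of always predicting the positive rate r: r * (1 - r)."""
    return positive_rate * (1.0 - positive_rate)

num_train = 3390  # assumption: rows left for training after test_size=0.20

initial_mse = per_example_mse(415.91, num_train)   # cost reported at step 0
final_mse = per_example_mse(220.119, num_train)    # cost reported at convergence

print("initial per-example MSE: %g" % initial_mse)
print("final per-example MSE:   %g" % final_mse)
print("baseline when always predicting 0.5: %g" % constant_prediction_mse(0.5))
```

Under that assumption the cost moves from roughly 0.245 per example, which is about what a sigmoid output near 0.5 gives, to roughly 0.130. If the labels are imbalanced, it may be worth comparing the final value with constant_prediction_mse evaluated at the actual positive rate, to see whether the model is doing much more than predicting the base rate.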

Thank you very much for your advice, and have a nice day :)

Cannot achieve large cost convergence using TensorFlow logistic regression

0 answers: