Why is the accuracy of my logistic regression so low?

Time: 2019-04-05 17:12:40

Tags: python tensorflow machine-learning

I have written code to train a logistic regression model, shown below.

This is the code that sets the parameters:

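import numpy as np
import tensorflow as tf  # TensorFlow 1.x API (tf.contrib, tf.placeholder, tf.Session)

TOTAL_CLASS = 3
LABEL_DICT = {"setosa": 0, "versicolor": 1, "virginica": 2}
BATCH_SIZE = 5
TOTAL_RECORD = 150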

This is the code that loads the Iris dataset:

def loadData(path, batchsize, label_name="species"):
    rawData = tf.contrib.data.make_csv_dataset(path, label_name=label_name, batch_size=batchsize)
    return rawData

Below is the code that creates the logistic regression network:

def logistic_layer(inputs, size):
    weight_variable = tf.Variable(tf.truncated_normal(shape=(inputs.shape.as_list()[1], size), stddev=0.1))
    bias = tf.Variable(tf.constant(0.01, dtype=tf.float32), trainable=False)
    temp = tf.matmul(inputs, weight_variable)+bias
    return tf.nn.sigmoid(temp)
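
For reference, the layer above applies an element-wise sigmoid, so each of the three outputs is an independent value in (0, 1) and a row does not have to sum to 1. A softmax output, by contrast, normalizes each row into a distribution over the classes. The following is a minimal NumPy sketch of that difference; the logit values are made up for illustration and the helper names are not part of the original code.

```python
import numpy as np

def sigmoid(z):
    # Element-wise logistic function: each output is independent of the others.
    return 1.0 / (1.0 + np.exp(-z))

def softmax(z):
    # Row-wise softmax; subtracting the row maximum keeps exp() numerically stable.
    shifted = z - z.max(axis=1, keepdims=True)
    exps = np.exp(shifted)
    return exps / exps.sum(axis=1, keepdims=True)

# Two hypothetical rows of class scores for a 3-class problem.
logits = np.array([[2.0, 1.0, -1.0],
                   [0.0, 0.0, 0.0]])

sig = sigmoid(logits)    # rows generally do not sum to 1
soft = softmax(logits)   # every row sums to 1

print("sigmoid row sums:", sig.sum(axis=1))
print("softmax row sums:", soft.sum(axis=1))
```

Whether switching the output to a softmax changes the accuracy on this dataset is not something the post establishes; the sketch only shows what the two functions compute.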

Below is the code that converts the raw labels into one-hot vectors:

def make_set(features, labels):
    feature_data = []
    final_labels = []

    def get_one_hot(num, depth):
        temp = np.zeros(depth)
        temp[num] = 1
        return temp

    for _, item in features.items():
        feature_data.append(item)

    for i in range(len(labels)):
        labels[i] = LABEL_DICT[labels[i].decode("utf-8")]
        final_labels.append(get_one_hot(labels[i], TOTAL_CLASS))

    feature_data = np.transpose(feature_data)

    return feature_data, final_labels
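
As an aside, the label conversion performed by make_set can be sketched more compactly by indexing an identity matrix. This is only an illustration under the assumption that labels arrive as byte strings, as in the code above; the function name one_hot_labels is hypothetical.

```python
import numpy as np

LABEL_DICT = {"setosa": 0, "versicolor": 1, "virginica": 2}

def one_hot_labels(raw_labels, label_dict=LABEL_DICT):
    # Map each byte-string label to its integer class index.
    indices = np.array([label_dict[label.decode("utf-8")] for label in raw_labels])
    # Row i of the identity matrix is the one-hot vector for class i.
    return np.eye(len(label_dict), dtype=np.float32)[indices]

encoded = one_hot_labels([b"setosa", b"virginica", b"versicolor"])
print(encoded)
```

Unlike the loop in make_set, this version does not overwrite the input labels in place.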

This is the main function, which trains the model:

def training(data_source=""):
    # load Data
    rawData = loadData(data_source, BATCH_SIZE)
    iterator = rawData.make_initializable_iterator()
    next_batch = iterator.get_next()

    # set up network
    x = tf.placeholder(tf.float32, shape=(None, 4))
    y_ = tf.placeholder(tf.float32, shape=(None, TOTAL_CLASS))
    y_predict = logistic_layer(x, TOTAL_CLASS)

    # set up loss function
    cross_entropy = tf.losses.log_loss(predictions=y_predict, labels=y_)
    global_step = tf.Variable(0, trainable=False)
    learning_rate = tf.train.exponential_decay(1e-1, global_step, 2, 0.96, staircase=True)
    train_step = tf.train.AdamOptimizer(learning_rate).minimize(cross_entropy)
    correct_prediction = tf.equal(tf.argmax(y_predict, 1), tf.argmax(y_, 1))
    accuracy = tf.reduce_mean(tf.cast(correct_prediction, tf.float32))

    tf.summary.scalar('cross_entropy', cross_entropy)
    tf.summary.scalar('accuracy', accuracy)

    merged = tf.summary.merge_all()

    with tf.Session() as sess:
        sess.run(iterator.initializer)
        sess.run(tf.global_variables_initializer())
        train_writer = tf.summary.FileWriter('./train', sess.graph)
        test_writer = tf.summary.FileWriter('./test', sess.graph)
        total_train_accuracy, test_accuracy = 0, 0
        for i in range(int(TOTAL_RECORD*0.7/BATCH_SIZE)):
            x_temp, y_temp = sess.run(next_batch)
            x_train, y_train = make_set(x_temp, y_temp)
            sess.run(train_step, feed_dict={x: x_train, y_: y_train})
            if i%2 == 0:
                summary, train_accuracy = sess.run([merged, accuracy], feed_dict={x: x_train, y_: y_train})
                total_train_accuracy += train_accuracy
                train_writer.add_summary(summary, int(i/2))
                print("step {}, training accuracy {}".format(int(i/2), train_accuracy))
        print("-----------margin------------")
        print("total train accuracy: {}".format(total_train_accuracy/int(i/2)))
        for i in range(int(TOTAL_RECORD*0.3/BATCH_SIZE)):
            x_temp, y_temp = sess.run(next_batch)
            x_test, y_test = make_set(x_temp, y_temp)
            test_summary, test_accuracy = sess.run([merged, accuracy], feed_dict={x: x_test, y_: y_test})
            test_writer.add_summary(test_summary, i)
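
One detail of the training function that may be worth checking is the learning-rate schedule. With staircase=True, tf.train.exponential_decay computes initial_rate * decay_rate ** floor(step / decay_steps). The plain-Python sketch below reproduces that formula for the parameters used above (1e-1, decay_steps=2, decay_rate=0.96); the function name staircase_decay is hypothetical.

```python
def staircase_decay(initial_rate, step, decay_steps, decay_rate):
    # Integer division gives the "staircase": the rate only drops
    # once every `decay_steps` steps.
    return initial_rate * decay_rate ** (step // decay_steps)

# Learning rate for the first six values of the step counter.
rates = [staircase_decay(1e-1, step, 2, 0.96) for step in range(6)]

for step, rate in enumerate(rates):
    print("step {}: learning rate {:.6f}".format(step, rate))
```

In TensorFlow 1.x the step counter only advances when it is passed to the optimizer, as in minimize(cross_entropy, global_step=global_step). The code above does not pass it, so the schedule would presumably stay at its initial value of 0.1 for the whole run.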

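Whenever I run the training method with the path to the Iris dataset in a local directory, the test accuracy never reaches 90%. I would like to know whether there is a bug in my code and how to fix it. Thanks!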

1 answer:

Answer 0 (score: 0)

You could try adding more layers to your model; I found only one layer in it. Perhaps your model is underfitting.
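
The suggestion to add layers can be sketched as a forward pass with one hidden layer between the four input features and the three class outputs. This is a minimal NumPy illustration, not the asker's TensorFlow code: the hidden size of 8, the ReLU activation, the softmax output, and all names are assumptions chosen for the example, and the input batch is random rather than real Iris data.

```python
import numpy as np

rng = np.random.default_rng(0)

def dense(inputs, weights, bias):
    # Affine transform: one fully connected layer without activation.
    return inputs @ weights + bias

def softmax(z):
    # Row-wise softmax with the usual max-subtraction for stability.
    shifted = z - z.max(axis=1, keepdims=True)
    exps = np.exp(shifted)
    return exps / exps.sum(axis=1, keepdims=True)

def forward(x, params):
    # Hidden layer with ReLU, followed by a 3-class softmax output layer.
    hidden = np.maximum(0.0, dense(x, params["w1"], params["b1"]))
    return softmax(dense(hidden, params["w2"], params["b2"]))

# Hypothetical sizes: 4 input features, 8 hidden units, 3 classes.
params = {
    "w1": rng.normal(scale=0.1, size=(4, 8)), "b1": np.zeros(8),
    "w2": rng.normal(scale=0.1, size=(8, 3)), "b2": np.zeros(3),
}

batch = rng.normal(size=(5, 4))   # stand-in for one batch of 5 records
probs = forward(batch, params)    # shape (5, 3), each row sums to 1
print(probs)
```

The sketch covers only the forward pass; whether the extra layer actually raises accuracy on this dataset is not shown in the thread.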