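K-fold cross-validation with TensorFlow

Time: 2017-11-20 08:24:51

Tags: python machine-learning tensorflow scikit-learn cross-validation

How can I implement K-fold cross-validation on a model in TensorFlow? I have done it before with scikit-learn, but not with TensorFlow. For example, say I have the following model...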

# Import assumed from the name used below; the original post omits it.
import sklearn.ensemble as ske

def random_forest(target, data):

    # Drop the target label, which we save separately.
    X = data.drop([target], axis=1).values
    y = data[target].values

    # Run Cross Validation on Random Forest Classifier.
    clf_tree = ske.RandomForestClassifier(n_estimators=50)

I would then run clf_tree through a cross-validation function...

unique_permutations_cross_val(X, y, clf_tree)

...which is defined as...

# Import assumed from the name used below; the original post omits it.
from sklearn import model_selection

def unique_permutations_cross_val(X, y, model):

    # Draw 10 random 80/20 train/test splits to use as the cross-validation scheme.
    shuffle_validator = model_selection.ShuffleSplit(n_splits=10, test_size=0.2, random_state=0)

    # Calculate the score of the model on each split.
    scores = model_selection.cross_val_score(model, X, y, cv=shuffle_validator)

    # Print the mean score and its standard deviation.
    print("Accuracy: %0.4f (+/- %0.2f)" % (scores.mean(), scores.std()))
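
A side note on the splitter: `ShuffleSplit` draws independent random train/test splits, so a sample can end up in several test sets or in none, which is not K-fold in the strict sense. Below is a minimal sketch of the difference, assuming scikit-learn's `KFold` is the splitter you would want for strict K-fold. The toy array is made up purely for illustration.

```python
import numpy as np
from sklearn import model_selection

# Toy data, purely illustrative: 10 samples, 1 feature.
X = np.arange(10).reshape(-1, 1)

# Strict K-fold: the 5 test folds together cover every sample exactly once.
k_fold = model_selection.KFold(n_splits=5, shuffle=True, random_state=0)
kfold_test = np.concatenate([test for _, test in k_fold.split(X)])

# ShuffleSplit: 5 independent random 80/20 splits; test sets may overlap.
shuffle = model_selection.ShuffleSplit(n_splits=5, test_size=0.2, random_state=0)
shuffle_test = np.concatenate([test for _, test in shuffle.split(X)])

print("KFold test indices:       ", np.sort(kfold_test))
print("ShuffleSplit test indices:", np.sort(shuffle_test))
```

Either object can be passed as `cv=` to `cross_val_score`, so switching between the two schemes is a one-line change in the function above.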

That was easy enough. But say I have a working linear regression model defined as...

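# Imports assumed from the names used below; the original post omits them.
# `data`, `target`, and `FLAGS` are defined elsewhere in the script.
import numpy as np
import tensorflow as tf  # TensorFlow 1.x API (tf.placeholder, tf.Session)

# Drop the target label, which we save separately.
X = data.drop([target], axis=1).values
Y = data[target].values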

iterable_X = np.asarray(X)
iterable_Y = np.asarray(Y)

rng = np.random

n_rows = X.shape[0]

X = tf.placeholder("float")
Y = tf.placeholder("float")

W = tf.Variable(rng.randn(), name="weight")
b = tf.Variable(rng.randn(), name="bias")

pred = tf.add(tf.multiply(X, W), b)

cost = tf.reduce_sum(tf.pow(pred-Y, 2)/(2*n_rows))

optimizer = tf.train.GradientDescentOptimizer(FLAGS.learning_rate).minimize(cost)

init = tf.global_variables_initializer()

with tf.Session() as sess:

    sess.run(init)

    for epoch in range(FLAGS.training_epochs):

        avg_cost = 0

        for (x, y) in zip(iterable_X, iterable_Y):

            _, c = sess.run([optimizer, cost], feed_dict={X:x, Y:y})

            avg_cost += c / (n_rows/FLAGS.batch_size)

        # display logs per epoch step
        if (epoch + 1) % FLAGS.display_step == 0:

            print("Epoch:", '%04d' % (epoch + 1), "cost=", "{:.9f}".format(avg_cost))

    print("Optimization Finished!")

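The model works, but how can I run it under K-fold cross-validation? I want to split the data into training and test sets and then run the model with cross-validation. How can I do that?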
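
For illustration, here is one possible shape of such a loop: split the row indices into K folds, train a fresh model on K-1 of them, and score it on the remaining one. This is only a sketch under stated assumptions. The names `k_fold_indices` and `fit_and_score` are hypothetical, the data is synthetic, and the gradient descent is written in plain NumPy as a stand-in for the TensorFlow session so that the example runs without TensorFlow.

```python
import numpy as np

def k_fold_indices(n_rows, k, seed=0):
    """Yield (train_idx, test_idx) pairs; the k test folds partition range(n_rows)."""
    order = np.random.RandomState(seed).permutation(n_rows)
    folds = np.array_split(order, k)
    for i in range(k):
        train_idx = np.concatenate([folds[j] for j in range(k) if j != i])
        yield train_idx, folds[i]

def fit_and_score(x_train, y_train, x_test, y_test, learning_rate=0.5, epochs=1000):
    """Stand-in for the TensorFlow loop: fit y = w*x + b by gradient descent
    on the training rows, then return the cost on the held-out rows."""
    w, b = 0.0, 0.0  # fresh parameters for every fold, so nothing leaks between folds
    n = len(x_train)
    for _ in range(epochs):
        err = w * x_train + b - y_train
        w -= learning_rate * np.sum(err * x_train) / n
        b -= learning_rate * np.sum(err) / n
    test_err = w * x_test + b - y_test
    return np.sum(test_err ** 2) / (2 * len(x_test))

# Synthetic data for illustration only: y = 3x + 1 plus small noise.
rng = np.random.RandomState(42)
x = rng.rand(100)
y = 3.0 * x + 1.0 + 0.01 * rng.randn(100)

scores = np.array([
    fit_and_score(x[train], y[train], x[test], y[test])
    for train, test in k_fold_indices(len(x), k=5)
])
print("Held-out cost: %0.6f (+/- %0.6f)" % (scores.mean(), scores.std()))
```

With the TensorFlow model from the question, the body of `fit_and_score` would presumably open a `tf.Session`, call `sess.run(init)` so that `W` and `b` start fresh for each fold, feed only the training rows to `optimizer` through `feed_dict`, and then evaluate the squared error on the held-out rows. Re-initializing per fold is the important part: otherwise weights learned on one fold carry over into the next and the held-out scores are no longer independent.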

0 answers:

No answers yet