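I am trying to work out how to compute AUC for a random forest in TensorFlow when training with a Session.
I have tried many approaches similar to the one mentioned here:
I don't think I can follow the above pattern with the code shown here: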
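# TensorFlow 1.x imports used below (tf.contrib is not available in 2.x)
import tensorflow as tf
from tensorflow.contrib.tensor_forest.python import tensor_forest
from tensorflow.python.ops import resources
# Splitting the dataset into the Training set and Test set
from sklearn.model_selection import train_test_split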
X_train, X_test, y_train, y_test = train_test_split(input_x, input_y, test_size = 0.25, random_state = 0)
data1 = data.iloc[:,:].values
# Parameters
num_steps = 50 # Total steps to train
num_classes = 2
num_features = 14
num_trees = 10
max_nodes = 1000
# Input and Target placeholders
X = tf.placeholder(tf.float32, shape=[None, num_features])
Y = tf.placeholder(tf.int64, shape=[None])
# Random Forest Parameters
hparams = tensor_forest.ForestHParams(num_classes=num_classes, num_features=num_features, num_trees=num_trees, max_nodes=max_nodes).fill()
# Build the Random Forest
forest_graph = tensor_forest.RandomForestGraphs(hparams)
train_op = forest_graph.training_graph(X, Y)
loss_op = forest_graph.training_loss(X, Y)
infer_op, _, _ = forest_graph.inference_graph(X)
### ACCURACY DEFINITION
correct_prediction = tf.equal(tf.argmax(infer_op, 1), tf.cast(Y, tf.int64))
accuracy_op = tf.reduce_mean(tf.cast(correct_prediction, tf.float32))
### AUC DEFINITION
from sklearn.metrics import roc_auc_score
# HELP HERE?
### SESSION DEFINITION
init_vars = tf.group(tf.global_variables_initializer(), resources.initialize_resources(resources.shared_resources()))
sess = tf.Session()
# Run the initializer before training
sess.run(init_vars)
# Training here
for i in range(1, num_steps + 1):
    _, l = sess.run([train_op, loss_op], feed_dict={X: X_train, Y: y_train})
    sess.run(tf.local_variables_initializer())
    if i % 50 == 0 or i == 1:
        acc = sess.run(accuracy_op, feed_dict={X: X_train, Y: y_train})
        # HELP HERE TO MAKE AUC AVAILABLE TOO?
        print('Step %i, Loss: %f, Acc: %f' % (i, l, acc))
# EVALUATION
print("Test Accuracy:", sess.run(accuracy_op, feed_dict={X: X_test, Y: y_test}))
# HELP HERE TO PRINT AUC?
Can someone help me understand how to include the AUC calculation here?
Answer 0 (score: 0)
This is what you need:
predictions=pd.DataFrame(model.predict_proba(X_test),columns=model.classes_)
roc_score = roc_auc_score((real_test_class == 'True').astype(float), predictions['True'])
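where real_test_class is the vector holding your true classes, and predictions['True'] is the column of the predictions DataFrame that contains, for each sample, the probability of the class True.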
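The snippet above assumes a scikit-learn style model with predict_proba, which the Session code in the question does not have. As a rough sketch of the same idea for that code: take the per-class scores the forest produces for each sample, keep the positive-class column, and compute AUC from it. The helper binary_auc and the sample values are hypothetical and only for illustration. The sketch also assumes that infer_op returns one row of class probabilities per sample, which is worth checking against your TensorFlow version. The function is dependency-free so each step is visible.

```python
def binary_auc(y_true, scores):
    """Area under the ROC curve for binary labels (1 = positive class).

    Equals the fraction of (positive, negative) pairs in which the positive
    sample gets the higher score, counting ties as half a pair.
    """
    pos = [s for y, s in zip(y_true, scores) if y == 1]
    neg = [s for y, s in zip(y_true, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative sample")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical stand-in for sess.run(infer_op, feed_dict={X: X_test}):
# one row per sample, one column per class (assumed to be class probabilities).
probs = [[0.9, 0.1], [0.4, 0.6], [0.65, 0.35], [0.2, 0.8]]
y_test_demo = [0, 0, 1, 1]

# Column 1 holds the score for the positive class.
positive_scores = [row[1] for row in probs]
auc = binary_auc(y_test_demo, positive_scores)
print("Test AUC:", auc)
```

In the question's script, probs would instead come from sess.run(infer_op, feed_dict={X: X_test}). With scikit-learn installed, roc_auc_score(y_test, probs[:, 1]) should give the same number for 0/1 labels. The same call inside the training loop, fed with X_train and y_train, would report the training AUC next to the accuracy.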