TensorFlow: training and evaluating on boolean features

Time: 2017-02-24 14:41:23

Tags: input tensorflow boolean

I want to train a model with TensorFlow that has boolean features like these:

data = np.array([[0,0,0],[0,0,1],[0,1,0],[1,0,0],[1,0,1],[1,1,1]], dtype=bool)

target = np.array([0,1,2,3,4,5], dtype=np.int )

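This looks simple to me, but it has turned out not to be an easy task. I do not know how to do it, I could not find anything similar on the web (apart from this), and I was unable to adapt any of the TensorFlow examples to my needs.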
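As an aside on representation: one common way to hand boolean features to a model that expects real-valued inputs is to cast them to 0.0/1.0 floats. The following is a minimal NumPy-only sketch under that assumption, not something taken from the TensorFlow examples; the name `data_float` is introduced here purely for illustration.

```python
import numpy as np

# Boolean feature matrix from the question: one row per example.
data = np.array([[0, 0, 0], [0, 0, 1], [0, 1, 0],
                 [1, 0, 0], [1, 0, 1], [1, 1, 1]], dtype=bool)

# Cast True/False to 1.0/0.0 so the features can be treated as real-valued.
# `data_float` is a hypothetical name used only in this sketch.
data_float = data.astype(np.float32)

print(data_float.dtype)
print(data_float.shape)
```

Whether this cast changes the training results reported further down is not something the sketch can show; it only makes the input type explicit.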

OK, here is my code:

from __future__ import absolute_import
from __future__ import division
from __future__ import print_function


import tensorflow as tf
import numpy as np
import collections

# Data sets
dataComplete = np.array([[0,0,0],[0,0,1],[0,1,0],[0,1,1],[1,0,0],[1,0,1],[1,1,0],[1,1,1]], dtype=bool)
targetComplete = np.array([0,     1,      2,      3,      4,      5,      6,      7     ], dtype=np.int )

Dataset = collections.namedtuple('Dataset', ['data', 'target'])

# for training data, remove some data from the complete set
data = np.delete(dataComplete, [2,4,6], 0)
target = np.delete(targetComplete, [2,4,6])
training_set = Dataset(data=data, target=target)

# for test set pick some of the complete set.
data = np.array([[0,1,0], [1,0,0]], dtype=bool)
target = np.array([2,4], dtype=np.int )
test_set = Dataset(data=data, target=target)


# Specify that all features have real-value data
# <-- This seems to be wrong. I do not have real-valued features, but boolean features. However,
# I could not find something like tf.contrib.layers.boolean_valued_column
feature_columns = [tf.contrib.layers.real_valued_column("", dimension=3)]


# Build 3 layer DNN with 10, 20, 10 units respectively.
#classifier = tf.contrib.learn.DNNClassifier(feature_columns=feature_columns,
#                                            hidden_units=[10, 20, 10],
#                                            n_classes=8,
#                                            model_dir="/tmp/chesspositions_model")
classifier = tf.contrib.learn.LinearClassifier(feature_columns)

# Fit model.
classifier.fit(x=training_set.data,
               y=training_set.target,
               steps=100)

# Evaluate accuracy.
accuracy_score = classifier.evaluate(x=test_set.data,
                                     y=test_set.target)["accuracy"]

print('Accuracy: {0:f}'.format(accuracy_score))
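  1. Running this code as shown yields Accuracy: 0.000000
  2. When I use data in test_set that is also contained in training_set (e.g. data = np.array([[0,0,1], [0,1,1]], dtype=bool); target = np.array([1,3], dtype=np.int)), I get Accuracy: 0.500000
  3. When I swap the LinearClassifier for a DNNClassifier (see the commented-out code), I get Accuracy: 0.000000 for the disjoint test_set (see 1.)
  4. For the contained test_set (see 2.), I get Accuracy: 1.000000
  5. I do not understand these results :-o I had expected some accuracy after the first step, rising towards 1.0 with every additional step. I also expected it to rise more slowly when the training and test sets are disjoint.

    The steps=1 parameter in classifier.fit does not seem to have any effect.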
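A quick sanity check that may help when reading these numbers is to compare the labels in the test set with those in the training set. The sketch below is NumPy-only and rebuilds the split from the code above. `np.int64` stands in for `np.int`, which newer NumPy releases no longer provide, and the names `target_complete`, `train_labels`, `test_labels`, and `unseen` are introduced here for illustration. It shows that labels 2 and 4 never occur in the training targets, which is one plausible reason for an accuracy of 0.0 on the disjoint test set; it does not by itself explain the other results.

```python
import numpy as np

# Targets 0..7, one class per row of the complete boolean table.
target_complete = np.arange(8, dtype=np.int64)

# Same split as in the question: rows 2, 4 and 6 are removed for training.
train_labels = np.delete(target_complete, [2, 4, 6])

# Labels of the disjoint test set from the question.
test_labels = np.array([2, 4], dtype=np.int64)

# Test labels that the model never saw as a training target.
unseen = np.setdiff1d(test_labels, train_labels)

print("training labels:", train_labels.tolist())
print("test labels never seen in training:", unseen.tolist())
```

Because every input row has its own class here, a held-out row always carries a label the classifier was never trained to produce.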

0 answers:

No answers