Building a tf.estimator input_fn: Feature is not in features dictionary

Asked: 2017-10-09 13:35:10

Tags: tensorflow

I have a set of records representing matches in a video game. I want to feed them to tf.estimator.DNNClassifier.

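Each record contains a text representation of the 5 heroes on team 0, the 5 heroes on team 1, the map the game was played on, and the winner of the game. I want to represent the three features (the two teams and the map) as three sparse vectors.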

I'm not using pandas or numpy for now. I'd like to keep things as simple as possible until I've built up my knowledge. (But not simpler!)

The best way to ask this question is probably to show what I have in make_input_fn and ask for help filling in the blanks:

import tensorflow as tf
import packunpack as source
import tempfile
from collections import namedtuple

GameRecord = namedtuple('GameRecord', 'team_0 team_1 game_map winner')
def parse(line):
    parts = line.rstrip().split("\t")
    return GameRecord(
        game_map = parts[1], 
        team_0 = parts[2].split(","), 
        team_1 = parts[3].split(","), 
        winner = int(parts[4]))

def conjugate(record):
    return GameRecord(
        team_0 = record.team_1, 
        team_1 = record.team_0, 
        game_map = record.game_map, 
        winner = 0 if record.winner == 1 else 1)

def sparse_team(team):
    return tf.SparseTensor(indices=team, values = [1] * len(team), dense_shape=[len(source.heroes_array)])

def sparse_map(i):
    return tf.SparseTensor(indices=[i], values = [1], dense_shape=[len(source.maps_array)])

def make_input_fn(filename, shuffle = True, add_conjugate_games = True):
    def _fn():
        records = []
        with open(filename, "r") as raw:
            i = 0
            for line in raw:
                record = parse(line)
                records.append(record)
                if add_conjugate_games:
                    # the team_0 and team_1 designations are arbitrary, and so the same inference should be drawn from a game and its "conjugate" game
                    records.append(conjugate(record))

        team_0s = map(lambda r: sparse_team(r.team_0), records)
        team_1s = map(lambda r: sparse_team(r.team_1), records)
        maps = map(lambda r: sparse_map(r.game_map), records)
        winners = map(lambda r: tf.constant([r.winner]), records)

        return ({
                    team_0: team_0s,
                    team_1: team_1s,
                    game_map: maps,
                }, 
                winners)
        #Please help me finish this function?

    return _fn

team_0 = tf.feature_column.embedding_column(
    tf.feature_column.categorical_column_with_vocabulary_list("team_0", source.heroes_array), 1)
team_1 = tf.feature_column.embedding_column(
    tf.feature_column.categorical_column_with_vocabulary_list("team_1", source.heroes_array), 1)
game_map = tf.feature_column.embedding_column(
    tf.feature_column.categorical_column_with_vocabulary_list("game_map", source.maps_array), 1)

model_dir = tempfile.mkdtemp()
m = tf.estimator.DNNClassifier(
    model_dir=model_dir,
    hidden_units = [1024, 512, 256], 
    feature_columns=[team_0, team_1, game_map])

def main():
    m.train(input_fn=make_input_fn("validation.txt"))

if __name__ == "__main__":
    main()

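I've been all over the docs today, but every code sample I can find shows how to feed pandas and numpy data structures to input_fn, and hides the underlying mechanics behind calls to helper functions. That doesn't work for me.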

(For example, https://www.tensorflow.org/get_started/input_fn and https://www.tensorflow.org/tutorials/wide)

Version: 1.4.0-dev20171008

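When I run it, I get this stack trace. I think it doesn't like the return value of _fn. But as far as I can tell, that dictionary does have the feature names I gave to the model.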

  File "estimator.py", line 72, in <module>
    main()
  File "estimator.py", line 69, in main
    m.train(input_fn=make_input_fn("validation.txt"))
  File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/estimator/estimator.py", line 302, in train
    loss = self._train_model(input_fn, hooks, saving_listeners)
  File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/estimator/estimator.py", line 711, in _train_model
    features, labels, model_fn_lib.ModeKeys.TRAIN, self.config)
  File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/estimator/estimator.py", line 694, in _call_model_fn
    model_fn_results = self._model_fn(features=features, **kwargs)
  File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/estimator/canned/dnn.py", line 334, in _model_fn
    config=config)
  File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/estimator/canned/dnn.py", line 190, in _dnn_model_fn
    logits = logit_fn(features=features, mode=mode)
  File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/estimator/canned/dnn.py", line 89, in dnn_logit_fn
    features=features, feature_columns=feature_columns)
  File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/feature_column/feature_column.py", line 230, in input_layer
    trainable=trainable)
  File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/feature_column/feature_column.py", line 1834, in _get_dense_tensor
    inputs, weight_collections=weight_collections, trainable=trainable)
  File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/feature_column/feature_column.py", line 2119, in _get_sparse_tensors
    return _CategoricalColumn.IdWeightPair(inputs.get(self), None)
  File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/feature_column/feature_column.py", line 1533, in get
    transformed = column._transform_feature(self)  # pylint: disable=protected-access
  File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/feature_column/feature_column.py", line 2087, in _transform_feature
    input_tensor = _to_sparse_input(inputs.get(self.key))
  File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/feature_column/feature_column.py", line 1529, in get
    raise ValueError('Feature {} is not in features dictionary.'.format(key))
ValueError: Feature team_0 is not in features dictionary.
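
One way to read the trace: the last frames call inputs.get(self.key), so the lookup appears to use the string name that was passed to categorical_column_with_vocabulary_list ("team_0"). In _fn, however, the dictionary keys are written as bare names (team_0 rather than "team_0"), and by the time _fn runs those names refer to the module-level embedding columns. The sketch below reproduces that mismatch in plain Python only; FakeColumn and lookup are hypothetical stand-ins written for illustration, not TensorFlow APIs.

```python
class FakeColumn:
    """Hypothetical stand-in for the object returned by embedding_column."""
    def __init__(self, key):
        self.key = key


# Module-level name, like `team_0 = tf.feature_column.embedding_column(...)`.
team_0 = FakeColumn("team_0")


def bare_name_features():
    # The key here is the FakeColumn object bound to the name team_0,
    # not the string "team_0".
    return {team_0: "team_0 tensors"}


def string_key_features():
    # The key here is the string the column was created with.
    return {"team_0": "team_0 tensors"}


def lookup(features, key):
    """Hypothetical lookup that mimics the error message in the trace."""
    if key not in features:
        raise ValueError('Feature {} is not in features dictionary.'.format(key))
    return features[key]


# A string key is found.
found = lookup(string_key_features(), "team_0")

# A bare-name key is not, because the dictionary only holds the column object.
try:
    lookup(bare_name_features(), "team_0")
    message = None
except ValueError as err:
    message = str(err)
```

If this reading is right, the dictionary does contain an entry for team_0, but under a key that a string lookup never matches.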

1 Answer:

Answer 0 (score: 1)

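I think you should check your data and make sure the field that is reported missing (team_0) is actually present. It could be any number of things, such as malformed data, or a field name that is misspelled in the training data source.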
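
As one concrete thing to check, here is a minimal sketch of a features dictionary whose keys are the string names given to the feature columns, and whose values are rectangular batches rather than lazy map objects (in Python 3, map returns an iterator, not a list or a tensor). It is plain Python under stated assumptions: build_features and the sample hero and map names are hypothetical, and it assumes each record holds hero and map names that appear in the vocabulary lists. Wrapping the lists in tensors is left as a comment and is not the asker's confirmed solution.

```python
from collections import namedtuple

GameRecord = namedtuple('GameRecord', 'team_0 team_1 game_map winner')


def build_features(records):
    """Hypothetical helper: batch records into string-keyed Python lists."""
    features = {
        # One row per record, five hero names per row.
        "team_0": [list(r.team_0) for r in records],
        "team_1": [list(r.team_1) for r in records],
        # One row per record, a single map name per row.
        "game_map": [[r.game_map] for r in records],
    }
    labels = [[r.winner] for r in records]
    return features, labels


# Made-up sample data: one game and its team-swapped counterpart.
records = [
    GameRecord(team_0=["a", "b", "c", "d", "e"],
               team_1=["f", "g", "h", "i", "j"],
               game_map="m1", winner=0),
    GameRecord(team_0=["f", "g", "h", "i", "j"],
               team_1=["a", "b", "c", "d", "e"],
               game_map="m1", winner=1),
]

features, labels = build_features(records)

# Assumption: inside a real input_fn, each list would then be wrapped in a
# tensor, for example {k: tf.constant(v) for k, v in features.items()} and
# tf.constant(labels), before being returned.
```

Keeping the keys as quoted strings means they match the names the columns were created with, regardless of what the module-level variables team_0, team_1 and game_map are bound to.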