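TensorFlow: converting class names to class indices

Asked: 2018-02-06 20:34:26

Tags: python tensorflow machine-learning dataset mapping

I am working on machine learning with TensorFlow.

Problem:

I can't figure out how to convert a class name into a class index.

Example:

Expected mapping: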

Car  ---> 0
Bike ---> 1
Boat ---> 2

Code:

#!/usr/bin/env python3.6

import tensorflow as tf

names = [
    "Car",
    "Bus",
    "Boat"
]

_, class_name = tf.TextLineReader(skip_header_lines=1).read(
    tf.train.string_input_producer(tf.gfile.Glob("input_file.csv"))
)

# I want to know if it is possible to do that :
#    print(sess.run(class_name)) --> "Car"
#    class_index = f(class_name, names)
#    print(sess.run(class_index)) --> 0

input_file.csv:

class_name
Car
Car
Boat
Bike
...

1 answer:

Answer 0 (score: 1)

The simplest way is:

class_index = tf.reduce_min(tf.where(tf.equal(names, class_name)))

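Note that this works as long as the class is present in names, but it returns 2^63 - 1 when the class is absent (like Bike in the example), because the minimum over an empty int64 tensor is the largest int64 value. You can avoid this by removing tf.reduce_min, but then class_index evaluates to an array rather than a scalar.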
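To make the behaviour of the one-liner concrete, here is a minimal plain-Python sketch of the same logic. It does not use TensorFlow; the helper names matching_indices and class_index and the constant INT64_MAX are hypothetical and only mirror what tf.where, tf.equal and tf.reduce_min compute here.

```python
INT64_MAX = 2**63 - 1  # what a min over an empty int64 tensor evaluates to


def matching_indices(names, class_name):
    """Plain-Python analogue of tf.where(tf.equal(names, class_name))."""
    return [i for i, name in enumerate(names) if name == class_name]


def class_index(names, class_name):
    """Analogue of tf.reduce_min(...): smallest match, or INT64_MAX if none."""
    return min(matching_indices(names, class_name), default=INT64_MAX)


names = ["Car", "Bus", "Boat"]
print(class_index(names, "Boat"))       # 2
print(class_index(names, "Bike"))       # 9223372036854775807
print(matching_indices(names, "Bike"))  # []
```

The empty list in the last line corresponds to the array you get when tf.reduce_min is dropped: no sentinel value, but also no scalar.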

Complete runnable code:

import tensorflow as tf

names = ["Car", "Bus", "Boat"]

_, class_name = tf.TextLineReader(skip_header_lines=1).read(
    tf.train.string_input_producer(tf.gfile.Glob("input_file.csv"))
)
class_index = tf.reduce_min(tf.where(tf.equal(names, class_name)))

with tf.Session() as session:
  coord = tf.train.Coordinator()
  threads = tf.train.start_queue_runners(coord=coord)

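  # Fetch both tensors in a single run() call: each separate eval() reads a
  # new line, so the name and index would otherwise come from different rows.
  for i in range(4):
    name, index = session.run([class_name, class_index])
    # Pairs for the sample file: Car 0, Car 0, Boat 2, Bike 9223372036854775807
    # (under Python 3 the names print as bytes, e.g. b'Car').
    print(name, index)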

  coord.request_stop()
  coord.join(threads)
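
Since an unknown class comes back as 2^63 - 1, downstream code may want to detect that value explicitly. A small sketch, assuming the index has already been evaluated to a Python or NumPy integer; NOT_FOUND and to_label are hypothetical names, not part of TensorFlow:

```python
NOT_FOUND = 2**63 - 1  # value class_index evaluates to when no name matches


def to_label(index, names):
    """Map an evaluated class index back to its name, or None if it was not found."""
    if index == NOT_FOUND:
        return None
    return names[index]


names = ["Car", "Bus", "Boat"]
print(to_label(2, names))                    # Boat
print(to_label(9223372036854775807, names))  # None
```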