Question

Loading data incrementally into an estimator with the TensorFlow Dataset API returns the following error:

ValueError: column_name: featureCategorical input_tensor dtype must be string or integer. dtype: <dtype: 'float32'>.

The data input function I am using loads the data incrementally and outputs batches, which are then fed to the estimator.

def read_dataset(filename):
  def _input_fn():
    def decode_line(row):
      columns = tf.decode_csv(row, record_defaults = DEFAULTS) 
      # "label" must be among the column names, otherwise features.pop('label') raises KeyError
      features = dict(zip(["featureCategorical","featurNumeric1","featurNumeric2","label"], columns))
      label = features.pop('label')
      return features, label

    # Create list of file names that match "glob" pattern (i.e. data_file_*.csv)
    filenames_dataset = tf.data.Dataset.list_files(filename)
    # Read lines from text files
    textlines_dataset = filenames_dataset.flat_map(tf.data.TextLineDataset)

    # Parse text lines as comma-separated values (CSV)
    dataset = textlines_dataset.map(decode_line)
    #--->this dataset contains only floats but feature "featureCategorical" needs to be a string

    num_epochs = None
    dataset = dataset.shuffle(buffer_size = 10 * 500) 

    dataset = dataset.repeat(num_epochs).batch(batch_size)

    return dataset.make_one_shot_iterator().get_next()
  return _input_fn

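All features are of type float, but since some of them are categorical features, they should be of type string.

How can I convert only the categorical features to strings inside the data input function? Many thanks!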
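
One possible direction, offered as a minimal sketch rather than a confirmed fix: the CSV decoder takes the dtype of each parsed column from the matching entry in record_defaults, so a string default for the categorical column should produce a string tensor for it. The contents of DEFAULTS are not shown in the question, so the column order, the default values, and the CSV_COLUMNS name below are assumptions made for illustration. The sketch uses tf.io.decode_csv, the name under which tf.decode_csv is available in newer TensorFlow releases.

```python
import tensorflow as tf

# Assumed column order; the question does not show the real CSV layout.
CSV_COLUMNS = ["featureCategorical", "featurNumeric1", "featurNumeric2", "label"]

# Assumed defaults. The dtype of each default sets the dtype of the parsed
# column: a string default yields tf.string, a float default yields tf.float32.
DEFAULTS = [[""], [0.0], [0.0], [0.0]]


def decode_line(row):
    # Parse one CSV line into a list of scalar tensors, one per column.
    columns = tf.io.decode_csv(row, record_defaults=DEFAULTS)
    features = dict(zip(CSV_COLUMNS, columns))
    label = features.pop("label")
    return features, label


# Parse a single made-up line to inspect the resulting dtypes.
features, label = decode_line(tf.constant("3,1.5,2.5,1.0"))
print(features["featureCategorical"].dtype)  # string dtype
print(features["featurNumeric1"].dtype)      # float32 dtype
```

If the defaults have to stay numeric, another option would be to convert just that column inside decode_line, for example with tf.as_string or with tf.cast to tf.int64 (the error message says integer input is accepted as well). Note that tf.as_string formats floats with decimal places, so the resulting strings may not match the intended vocabulary.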

Converting data types in the data input function - TensorFlow

0 answers: