Keras Estimator + tf.data API

Time: 2018-12-01 00:59:45

Tags: python tensorflow keras

TF 1.12:

I am trying to convert a pre-canned estimator into a Keras model built with tf.keras.layers. This is the original estimator:

estimator = tf.estimator.DNNClassifier(
    model_dir='/tmp/keras',
    feature_columns=deep_columns,
    hidden_units=[100, 75, 50, 25],
    config=run_config)

Converted to a Keras model using tf.keras.layers:

model = tf.keras.models.Sequential()
model.add(tf.keras.layers.Dense(100, activation=tf.nn.relu, input_shape=(14,)))
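# DNNClassifier applies ReLU to every hidden layer by default, so each
# hidden Dense layer gets the same activation here.
model.add(tf.keras.layers.Dense(75, activation=tf.nn.relu))
model.add(tf.keras.layers.Dense(50, activation=tf.nn.relu))
model.add(tf.keras.layers.Dense(25, activation=tf.nn.relu))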
model.add(tf.keras.layers.Dense(1, activation=tf.nn.sigmoid))
model.compile(optimizer=tf.keras.optimizers.RMSprop(), loss=tf.keras.losses.binary_crossentropy, metrics=['accuracy'])
model.summary()
estimator = tf.keras.estimator.model_to_estimator(model, model_dir='/tmp/keras', config=run_config)

I train and evaluate the Keras-based estimator with this loop:

for n in range(40 // 2):
    estimator.train(input_fn=train_input_fn)
    results = estimator.evaluate(input_fn=eval_input_fn)

    # Display evaluation metrics
    tf.logging.info('Results at epoch %d / %d', (n + 1) * 2, 40)
    tf.logging.info('-' * 60)

Main code: https://github.com/tensorflow/models/blob/master/official/wide_deep/census_main.py

Training fails with this error:

  

KeyError: "The dictionary passed into features does not have the expected inputs keys defined in the keras model.\n\tExpected keys: {'dense_50_input'}\n\tfeatures keys: {'workclass', 'occupation', 'hours_per_week', 'marital_status', 'relationship', 'race', 'fnlwgt', 'education', 'gender', 'capital_loss', 'capital_gain', 'age', 'education_num', 'native_country'}\n\tDifference: {'workclass', 'occupation', 'hours_per_week', 'marital_status', 'relationship', 'dense_50_input', 'race', 'fnlwgt', 'education', 'gender', 'capital_loss', 'capital_gain', 'age', 'education_num', 'native_country'}"
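The error says the keys of the features dictionary do not match the input names of the Keras model. The Difference set lists names from both sides, which is consistent with a symmetric difference of the two key sets. Below is a simplified plain-Python illustration of that kind of check. It is not TensorFlow's actual code, and `check_feature_keys` is a hypothetical helper name.

```python
def check_feature_keys(expected_keys, features):
    """Raise KeyError when the feature dict keys differ from the expected input names.

    Hypothetical helper: a simplified stand-in for the key comparison that
    the error message describes, not the TensorFlow implementation.
    """
    expected = set(expected_keys)
    actual = set(features)
    difference = expected ^ actual  # symmetric difference: names on one side only
    if difference:
        raise KeyError(
            'Expected keys: {}; features keys: {}; difference: {}'.format(
                sorted(expected), sorted(actual), sorted(difference)))


# A dict keyed by column name, like the one parse_csv builds.
features = {'age': [39, 50], 'workclass': ['Private', 'State-gov']}

try:
    check_feature_keys(['dense_50_input'], features)
    mismatch = None
except KeyError as err:
    mismatch = err

# A dict keyed by the model's input name passes the check.
check_feature_keys(['dense_50_input'], {'dense_50_input': [[39.0], [50.0]]})
```

In other words, the estimator expects one entry named after the model's input, while the input_fn supplies one entry per CSV column.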

Here is my input_fn:

def input_fn(data_file, num_epochs, shuffle, batch_size):
  """Generate an input function for the Estimator."""
  assert tf.gfile.Exists(data_file), (
      '%s not found. Please make sure you have run census_dataset.py and '
      'set the --data_dir argument to the correct path.' % data_file)

  def parse_csv(value):
    tf.logging.info('Parsing {}'.format(data_file))
    columns = tf.decode_csv(value, record_defaults=_CSV_COLUMN_DEFAULTS)
    features = dict(zip(_CSV_COLUMNS, columns))
    labels = features.pop('income_bracket')
    classes = tf.equal(labels, '>50K')  # binary classification
    return features, classes

  # Extract lines from input files using the Dataset API.
  dataset = tf.data.TextLineDataset(data_file)

  if shuffle:
    dataset = dataset.shuffle(buffer_size=_NUM_EXAMPLES['train'])

  dataset = dataset.map(parse_csv, num_parallel_calls=5)

  # We call repeat after shuffling, rather than before, to prevent separate
  # epochs from blending together.
  dataset = dataset.repeat(num_epochs)
  dataset = dataset.batch(batch_size)
  return dataset

def train_input_fn():
    return input_fn(train_file, 2, True, 40)

def eval_input_fn():
    return input_fn(test_file, 1, False, 40)

1 answer:

Answer 0: (score: 0)

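You need to add an input layer whose name matches a key in the features dictionary returned by your input_fn: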

model = tf.keras.models.Sequential()
model.add(tf.keras.layers.InputLayer(input_shape=your_tensor_shape, name=your_feature_key))
model.add(tf.keras.layers.Dense(100, activation=tf.nn.relu))
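Naming the input layer is only half of the fix: the input_fn also has to return a dictionary that contains that key, with the separate columns combined into one array per example. The sketch below shows the idea in plain Python on lists. In the real pipeline the same packing would presumably happen with tensor ops inside the `dataset.map` step. The helper `pack_numeric_features` and the key `'census_features'` are hypothetical names, and only numeric columns are packed; string columns such as `workclass` would need to be encoded first.

```python
def pack_numeric_features(features, numeric_keys, input_name):
    """Combine per-column lists into one list of rows under a single key.

    Hypothetical helper for illustration: `features` maps column name to a
    list of values (one per example), and the result maps `input_name` to a
    list of float rows in the order given by `numeric_keys`.
    """
    columns = [features[key] for key in numeric_keys]
    rows = [[float(value) for value in row] for row in zip(*columns)]
    return {input_name: rows}


# A tiny batch shaped like the dict parse_csv returns (values are made up).
batch = {
    'age': [39, 50],
    'education_num': [13, 9],
    'hours_per_week': [40, 45],
    'workclass': ['State-gov', 'Private'],  # string column, left out of the packing
}

packed = pack_numeric_features(
    batch, ['age', 'education_num', 'hours_per_week'], 'census_features')
```

With this layout the input layer would be declared with `name='census_features'` and an `input_shape` equal to the number of packed columns, which is `(3,)` in this example.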