Question

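I have a saved model (a directory with model.pd and variables) and want to run predictions on a pandas DataFrame.

I have tried a few ways to do this, without success:

Attempt 1: Restore the estimator from the saved model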

estimator = tf.estimator.LinearClassifier(
    feature_columns=create_feature_cols(),
    model_dir=path,
    warm_start_from=path)

Here path is the directory that contains model.pd and the variables folder. I got this error:

ValueError: Tensor linear/linear_model/dummy_feature1/weights is not found in 
gs://bucket/Trainer/output/2013/20191008T170504.583379-63adee0eaee0/serving_model_dir/export/1570554483/variables/variables 
checkpoint {'linear/linear_model/dummy_feature1/weights': [1, 1], 'linear/linear_model/dummy_feature2/weights': [1, 1]
}

Attempt 2: Run predictions directly from the saved model by running

imported = tf.saved_model.load(path)  # path is the directory that has a `model.pd` and variables folder
imported.signatures["predict"](example)

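but I have not managed to pass the arguments correctly. It looks like the function expects a tf.example, and I am not sure how to convert a DataFrame to a tf.example. My conversion attempt is below, but it fails with an error saying that df[f] is not a tensor: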

for f in features:
    example.features.feature[f].float_list.value.extend(df[f])

I have seen solutions on Stack Overflow, but they all target TensorFlow 1.14. I would greatly appreciate help from someone who can do this with TensorFlow 2.0.

Answer 1

Assuming your saved model directory looks like this:

my_model
assets  saved_model.pb  variables

You can load the saved model with:

import tensorflow as tf

new_model = tf.keras.models.load_model('saved_model/my_model')

# Check its architecture
new_model.summary()
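Before calling a loaded model, it can help to check which inputs its signatures actually expect. The sketch below is only an illustration: it exports a tiny stand-in `tf.Module` (the `Doubler` class, the input name `x`, and the temporary directory are all hypothetical) and then inspects it through `tf.saved_model.load`, the lower-level loader used in the question.

```python
import tempfile

import tensorflow as tf


# Hypothetical stand-in model: a tf.Module with a single exported function.
class Doubler(tf.Module):
    @tf.function(input_signature=[tf.TensorSpec([None, 2], tf.float32, name="x")])
    def __call__(self, x):
        return {"y": 2.0 * x}


# Export to a temporary directory so the example is self-contained.
export_dir = tempfile.mkdtemp()
module = Doubler()
tf.saved_model.save(
    module, export_dir, signatures={"serving_default": module.__call__}
)

# Load it back and list the available signatures.
imported = tf.saved_model.load(export_dir)
signature_names = list(imported.signatures.keys())
print(signature_names)

# Inspect the expected inputs and the produced outputs of one signature.
serving_fn = imported.signatures["serving_default"]
print(serving_fn.structured_input_signature)
print(serving_fn.structured_outputs)

# Signature functions are called with keyword arguments.
out = serving_fn(x=tf.constant([[1.0, 2.0]]))
```

With your own export, replace `export_dir` with the directory that holds the saved model; the printed input signature shows the argument names and dtypes the function expects.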

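To run predictions on a DataFrame, you need to:

Wrap scalars in a list so that there is a batch dimension (models only process batches of data, not single samples)
Call convert_to_tensor on each feature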
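Applied to a whole DataFrame, the two steps above might look like the following sketch. The DataFrame contents and the helper name `df_to_input_dict` are made up for illustration; the column names simply mirror a few features of the pet-adoption sample used in this answer.

```python
import pandas as pd
import tensorflow as tf

# Hypothetical DataFrame with a few of the features used in this answer.
df = pd.DataFrame({
    "Type": ["Cat", "Dog"],
    "Age": [3, 5],
    "Fee": [100, 0],
})


def df_to_input_dict(frame):
    """Turn each column into a 1-D tensor whose length is the batch size."""
    return {
        name: tf.convert_to_tensor(frame[name].tolist())
        for name in frame.columns
    }


# All rows as one batch.
batch = df_to_input_dict(df)

# A single row; slicing with a list keeps the batch dimension.
single = df_to_input_dict(df.iloc[[0]])

# With a model loaded via tf.keras.models.load_model (not run here):
# predictions = new_model.predict(batch)
```

Because every column is converted as a list, even a one-row slice yields tensors of shape `(1,)`, which is the batch dimension the model expects.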

Example 1: If the first test row has the following values

sample = {
    'Type': 'Cat',
    'Age': 3,
    'Breed1': 'Tabby',
    'Gender': 'Male',
    'Color1': 'Black',
    'Color2': 'White',
    'MaturitySize': 'Small',
    'FurLength': 'Short',
    'Vaccinated': 'No',
    'Sterilized': 'No',
    'Health': 'Healthy',
    'Fee': 100,
    'PhotoAmt': 2,
}

input_dict = {name: tf.convert_to_tensor([value]) for name, value in sample.items()}
predictions = new_model.predict(input_dict)
prob = tf.nn.sigmoid(predictions[0])

print(
    "This particular pet had a %.1f percent probability "
    "of getting adopted." % (100 * prob)
)

Example 2: Or, if you have multiple rows whose columns are in the same order as the training data

predict_dataset = tf.convert_to_tensor([
    [5.1, 3.3, 1.7, 0.5,],
    [5.9, 3.0, 4.2, 1.5,],
    [6.9, 3.1, 5.4, 2.1]
])

# training=False is needed only if there are layers with different
# behavior during training versus inference (e.g. Dropout).
predictions = new_model(predict_dataset, training=False)

# class_names is assumed to be the list of label names, indexed by class id.
for i, logits in enumerate(predictions):
    class_idx = tf.argmax(logits).numpy()
    p = tf.nn.softmax(logits)[class_idx]
    name = class_names[class_idx]
    print("Example {} prediction: {} ({:4.1f}%)".format(i, name, 100*p))
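If the loaded signature expects serialized tf.train.Example protos instead, as the second attempt in the question suggests, each DataFrame row can be converted explicitly. This is a minimal sketch under several assumptions: the DataFrame values are invented, every column is treated as a float feature, and the keyword name `examples` in the commented-out call is a guess that should be checked against the signature's `structured_input_signature`.

```python
import pandas as pd
import tensorflow as tf

# Hypothetical numeric DataFrame; feature names taken from the question's error message.
df = pd.DataFrame({
    "dummy_feature1": [0.5, 1.5],
    "dummy_feature2": [2.0, 3.0],
})


def row_to_example(row):
    """Build one tf.train.Example, storing every column as a float feature."""
    feature = {
        name: tf.train.Feature(
            float_list=tf.train.FloatList(value=[float(value)])
        )
        for name, value in row.items()
    }
    return tf.train.Example(features=tf.train.Features(feature=feature))


# One serialized proto per DataFrame row.
serialized = [row_to_example(row).SerializeToString() for _, row in df.iterrows()]

# 1-D string tensor: the batch of serialized examples.
examples = tf.constant(serialized)

# Assumption: the signature accepts the batch under the keyword "examples".
# imported = tf.saved_model.load(path)
# result = imported.signatures["predict"](examples=examples)
```

String or integer columns would need `bytes_list` or `int64_list` features instead of `float_list`.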

Running predictions from a saved model in TensorFlow 2.0
