Tensorflow DNN key not found in checkpoint

Time: 2017-06-21 20:30:43

Tags: python-3.x machine-learning tensorflow

I am training a DNN to predict whether a sentence is bad, good, or not bad.

This is my error:

NotFoundError (see above for traceback): Key dnn/hiddenlayer_0/biases/denlayer_0/biases/part_0/Adagrad not found in checkpoint

Code:

import os
os.environ['TF_CPP_MIN_LOG_LEVEL']='2'

import pandas as pd

pd.set_option('display.max_rows', 500)
pd.set_option('display.max_columns', 500)
pd.set_option('display.width', 1000)

# data = pd.read_csv('keyword_url_data.csv')        
X = ['that movie was terrible',
     'the movie was bland and terrible',
     'i loved that movie',
     'the movie was the best. i loved that movie',
     'the movie was ok but not great',
     'i thought the movie was good, maybe ok',
     ]

y = ['bad',
     'bad',
     'good',
     'good',
     'not bad',
     'not bad',
     ]

from sklearn.preprocessing import LabelEncoder

le = LabelEncoder()
y = le.fit_transform(y)

from sklearn.feature_extraction.text import TfidfVectorizer

tfid = TfidfVectorizer()        
X = tfid.fit_transform(X)

X = X.todense()

from sklearn.model_selection import train_test_split

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size = 0.2, random_state = 42)       


import numpy as np

dims = X_train.shape[1] #cols in matrix


import tensorflow as tf

tf.logging.set_verbosity(tf.logging.ERROR)

feature_columns = [tf.contrib.layers.real_valued_column("", dimension=dims)]

n_classes = np.unique(y).size

classifier = tf.contrib.learn.DNNClassifier(feature_columns=feature_columns,
                                          hidden_units=[9,10,9],
                                          n_classes=n_classes,
                                          model_dir="/tmp/nlp_model")

def get_train_inputs():
    x = tf.constant(X_train)
    y = tf.constant(y_train)    

    return x, y

classifier.fit(input_fn=get_train_inputs, steps=100)

def get_test_inputs():
    x = tf.constant(X_test)
    y = tf.constant(y_test)

    return x, y

# Evaluate accuracy.

n_classes = np.unique(y_test).size

accuracy_score = classifier.evaluate(input_fn=get_test_inputs,
                                     steps=1)["accuracy"]

print("\nTest Accuracy: {0:f}\n".format(accuracy_score))

new_samples = ['i really thought that movie was terrible',
               'the movie was really good but parts were ok']

new_samples = tfid.transform(new_samples)
new_samples_ = new_samples.todense()

# print(new_samples)

def new_samples():
    return new_samples_#np.array(new_samples, dtype=np.float32)

predictions = list(classifier.predict_classes(input_fn=new_samples))

predictions = [le.inverse_transform(x) for x in predictions]

print(
  "New Samples, Class Predictions:    {}\n"
  .format(predictions))

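When I remove the last two rows of my data (i.e. the "not bad" labels, leaving binary labels) and change the hidden units to [10,20,10], it works fine. I'm not sure why adding a third label would change anything.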

1 answer:

Answer 0: (score: 0)

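At prediction time you must stay consistent with the trained model: use the same labels and the same network structure that the model was trained with.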
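
One way to act on this, sketched under assumptions rather than taken from the question: a tf.contrib.learn estimator reloads whatever checkpoint already exists in its model_dir, so a "/tmp/nlp_model" directory left over from a run with different hidden_units or n_classes is a likely source of a "Key ... not found in checkpoint" error. The helper below gives each configuration its own checkpoint directory and can optionally wipe a stale one. The name fresh_model_dir and the directory naming scheme are hypothetical, not part of TensorFlow; the code uses only the standard library.

```python
import os
import shutil
import tempfile


def fresh_model_dir(base_dir, hidden_units, n_classes, reset=False):
    """Return a checkpoint directory tied to one network configuration.

    Hypothetical helper (not a TensorFlow API): the directory name encodes
    hidden_units and n_classes, so two different architectures never share
    checkpoints. With reset=True, any existing contents are deleted first.
    """
    name = "dnn_h{}_c{}".format("-".join(str(u) for u in hidden_units), n_classes)
    path = os.path.join(base_dir, name)
    if reset and os.path.isdir(path):
        shutil.rmtree(path)  # discard checkpoints from an earlier run
    os.makedirs(path, exist_ok=True)
    return path


# A temporary base directory stands in for /tmp here.
base = tempfile.mkdtemp()
binary_dir = fresh_model_dir(base, [10, 20, 10], 2)
three_class_dir = fresh_model_dir(base, [9, 10, 9], 3)

print(os.path.basename(binary_dir))
print(os.path.basename(three_class_dir))
```

The returned path would then be passed as model_dir when building the DNNClassifier, in place of the fixed "/tmp/nlp_model". Deleting "/tmp/nlp_model" by hand before retraining with a new configuration should have the same effect.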