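I am trying to use tf.estimator.DNNClassifier to predict the monthly price for a dataset. However, when I run the program, DNNClassifier raises an error stating that the labels are not <= n_classes - 1.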
It seems that the DNN classifier does not recognize the feature columns. This is the error I get:
InvalidArgumentError (see above for traceback): assertion failed: [Labels must <= n_classes - 1] [Condition x <= y did not hold element-wise:x (dnn/head/ToFloat:0) = ] [[51][58][50]...] [y (dnn/head/assert_range/Const:0) = ] [1]
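As far as I can tell, the assertion in this message is a range check: a classifier expects integer class ids between 0 and n_classes - 1, and the values shown (51, 58, 50) are compared against 1. A minimal sketch of that check in plain NumPy, assuming the labels are the raw monthly_price values; the helper name `labels_fit_classifier` is hypothetical and not part of TensorFlow:

```python
import numpy as np

def labels_fit_classifier(labels, n_classes):
    """Return True if every label is an integer class id in [0, n_classes - 1]."""
    labels = np.asarray(labels)
    # Class ids must be whole numbers...
    is_integral = np.all(np.equal(np.mod(labels, 1), 0))
    # ...and must lie inside the range implied by n_classes.
    in_range = np.all((labels >= 0) & (labels <= n_classes - 1))
    return bool(is_integral and in_range)

# The first three label values that appear in the error message.
price_labels = [51, 58, 50]
print(labels_fit_classifier(price_labels, n_classes=2))  # False: 51 > n_classes - 1 = 1
print(labels_fit_classifier([0, 1, 1], n_classes=2))     # True: valid binary class ids
```

If this reading is right, the raw prices can never pass the check with n_classes=2, whatever the feature columns look like.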
The following is the code I am running:
import tensorflow as tf
import pandas as pd
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
df = pd.read_csv('./csv/lmtensordata.csv')
deep_feat = df.drop(columns=['monthly_price'])
deep_label = df['monthly_price']
df.info()
df.head()
numerical_columns = [col for col in df.columns if (df[col].dtype=='int64' or df[col].dtype=='float64')]
categorical_columns = [col for col in deep_feat.columns if len(deep_feat[col].unique()) == 2 or deep_feat[col].dtype == 'O']
continuous_columns = [col for col in deep_feat.columns if len(deep_feat[col].unique()) > 2 and (deep_feat[col].dtype == 'int64' or deep_feat[col].dtype=='float64')]
# making a train test split
X_T, X_t, y_T, y_t = train_test_split(deep_feat, deep_label, test_size=0.3)
cols_to_scale = continuous_columns[:]
# scaling the listed columns
scaler = StandardScaler()
X_T.loc[:,cols_to_scale] = scaler.fit_transform(X_T.loc[:,cols_to_scale])
# reuse the scaler fitted on the training set instead of refitting it on the test set
X_t.loc[:,cols_to_scale] = scaler.transform(X_t.loc[:,cols_to_scale])
continuous_feat_cols = [tf.feature_column.numeric_column(key=col) for col in continuous_columns]
feat_cols = continuous_feat_cols
input_fun = tf.estimator.inputs.pandas_input_fn(X_T,y_T,batch_size=50,num_epochs=1000,shuffle=True)
pred_input_fun = tf.estimator.inputs.pandas_input_fn(X_t,batch_size=50,shuffle=False)
DNN_model = tf.estimator.DNNClassifier(hidden_units=[10,10,10], feature_columns=feat_cols, n_classes=2)
DNN_model.train(input_fn=input_fun, steps=5000)
predictions = DNN_model.predict(pred_input_fun)
res_pred = list(predictions)
print(res_pred[0])
This is what I get when I print out feat_cols:
[_NumericColumn(key='car_year', shape=(1,), default_value=None, dtype=tf.float32, normalizer_fn=None),
 _NumericColumn(key='make_model_bucket', shape=(1,), default_value=None, dtype=tf.float32, normalizer_fn=None),
 _NumericColumn(key='mileage', shape=(1,), default_value=None, dtype=tf.float32, normalizer_fn=None),
 _NumericColumn(key='years_licensed_bucket', shape=(1,), default_value=None, dtype=tf.float32, normalizer_fn=None),
 _NumericColumn(key='zip_bucket', shape=(1,), default_value=None, dtype=tf.float32, normalizer_fn=None)]
I am not sure what I should pass for n_classes, or whether something is wrong with my feature columns.
Any help is greatly appreciated!