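I create a Keras Sequential model by passing a list of layer instances to the constructor. To do that, I need to pass the input_shape argument to the first layer inside my create_model() function. Normally I can get the shape tuple like this: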
input_shape=(len(X_train.keys()),)
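At the same time, I am using a pipeline to handle my preprocessing steps, such as imputation, scaling, encoding, and feature selection. As a result, the number of variables/features after preprocessing is no longer the same as before, and I cannot tell in advance how many input nodes the first hidden layer needs. I then get an error about dense_1_input, and only after that can I update the shape accordingly.
Now I would like to know whether there is a way to specify input_shape dynamically when using a pipeline.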
from sklearn.compose import ColumnTransformer
from sklearn.pipeline import Pipeline
from sklearn.impute import SimpleImputer
from sklearn.preprocessing import OneHotEncoder, StandardScaler
from sklearn.feature_selection import SelectFromModel, RFE
from sklearn.linear_model import LassoCV
numerical_transformer = Pipeline(steps=[
    ('imputer', SimpleImputer(strategy='mean')),
    ('feature_selection', SelectFromModel(LassoCV(cv=5))),
    ('scaler', StandardScaler()),
])
categorical_transformer = Pipeline(steps=[
    ('onehot', OneHotEncoder(handle_unknown='ignore')),
    ('imputer', SimpleImputer(strategy='most_frequent')),
    ('feature_selection', SelectFromModel(LassoCV(cv=5))),
])
# Bundle preprocessing for numerical and categorical data
preprocessor = ColumnTransformer(
    transformers=[
        ('num', numerical_transformer, numerical_cols),
        ('cat', categorical_transformer, categorical_cols)
    ])
import keras
from keras.models import Sequential
from keras.layers import Dense, Dropout
from keras.optimizers import RMSprop
from keras.callbacks import Callback, EarlyStopping
def create_model(optimizer='adagrad',
                 kernel_initializer='glorot_uniform',
                 dropout=0.2):
    model = Sequential()
    model.add(Dense(64, activation='relu', kernel_initializer=kernel_initializer,
                    input_shape=(len(X_train.keys()),)))  # len(X_train.keys()) is not correct here
    model.add(Dropout(dropout))
    model.add(Dense(64, activation='relu'))
    model.add(Dense(1))
    model.compile(loss='mean_absolute_error', optimizer=optimizer,
                  metrics=['mean_absolute_error'])
    return model
The output I am after is access to the shape of the data frame after it has been preprocessed by the pipeline.
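To make the goal concrete, here is a minimal sketch of one way the post-preprocessing width could be read: fit the preprocessor on its own and take the second dimension of the transformed output. The toy frame X_demo, its column names, and demo_preprocessor are hypothetical stand-ins for X_train and the preprocessor above, and the feature-selection steps are left out so the column count does not depend on a target. This is an illustration under those assumptions, not necessarily the right approach inside cross-validation.

```python
import numpy as np
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.impute import SimpleImputer
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder, StandardScaler

# Hypothetical toy data standing in for X_train.
X_demo = pd.DataFrame({
    "age": [25.0, 32.0, np.nan, 41.0],
    "income": [40.0, 52.0, 61.0, np.nan],
    "city": ["a", "b", "a", "c"],
})

# Simplified stand-in for the preprocessor above (no feature selection).
demo_preprocessor = ColumnTransformer(transformers=[
    ("num", Pipeline(steps=[
        ("imputer", SimpleImputer(strategy="mean")),
        ("scaler", StandardScaler()),
    ]), ["age", "income"]),
    ("cat", OneHotEncoder(handle_unknown="ignore"), ["city"]),
])

# Fit the preprocessing alone and inspect the transformed shape.
# .shape works whether the result is a dense array or a sparse matrix.
X_demo_t = demo_preprocessor.fit_transform(X_demo)
n_features = X_demo_t.shape[1]

# Tuple in the form Keras expects for the first layer's input_shape.
input_shape = (n_features,)
print(input_shape)
```

Here the two numeric columns plus three one-hot columns for city give five features. With supervised steps such as SelectFromModel, the target would also have to be passed to fit_transform, and the selected count may differ between training folds, so a value computed once outside the pipeline is only a guide.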
This may be a similar, still unresolved question: Keras + DataFrameMapper + make_pipeline, input_dim dilemma