I am creating a pipeline:
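from sklearn.ensemble import RandomForestClassifier
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import LabelEncoder
from sklearn_pandas import DataFrameMapper

estimator = RandomForestClassifier(max_depth=18, n_estimators=100, n_jobs=-1)
# Label-encode object and bool columns; pass every other column through unchanged.
mapper = DataFrameMapper(
    [(i, None) if j != 'object' and j != 'bool' else (i, LabelEncoder())
     for i, j in zip(train_x.columns.values, train_x.dtypes.values)],
    input_df=True, df_out=True)
pipeline = Pipeline([("mapper", mapper),
                     ("classifier", estimator)])
pipeline.fit(train_x, train_y)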
I want to be able to save it to S3 and then load it back later to make predictions. I know how to do this locally:
import joblib

joblib.dump(pipeline, 'filename.pkl')
pkl_file = joblib.load('filename.pkl')
prediction = pkl_file.predict(train_x)
But how do I dump the pickle to S3 and load it back from S3? Thanks.
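For reference, one possible way to do the S3 round trip is sketched below. It is not a definitive implementation. It assumes a boto3 S3 client is passed in, and it uses the standard-library pickle module to serialize to bytes in memory instead of writing a local file (joblib.dump and joblib.load also accept file objects, so an io.BytesIO buffer could be swapped in). The helper names dump_to_s3 and load_from_s3, the bucket name, and the key are hypothetical; put_object and get_object are the boto3 client calls used.

```python
import pickle


def dump_to_s3(obj, s3_client, bucket, key):
    """Serialize obj with pickle and upload the bytes as a single S3 object.

    s3_client is assumed to be a boto3 S3 client, i.e. boto3.client("s3").
    """
    body = pickle.dumps(obj)
    s3_client.put_object(Bucket=bucket, Key=key, Body=body)


def load_from_s3(s3_client, bucket, key):
    """Download an S3 object and unpickle its contents."""
    response = s3_client.get_object(Bucket=bucket, Key=key)
    # response["Body"] is a stream; read() returns the raw bytes.
    return pickle.loads(response["Body"].read())


# Usage sketch (needs boto3 and AWS credentials; bucket and key are placeholders):
#   import boto3
#   s3 = boto3.client("s3")
#   dump_to_s3(pipeline, s3, "my-bucket", "models/pipeline.pkl")
#   loaded = load_from_s3(s3, "my-bucket", "models/pipeline.pkl")
#   prediction = loaded.predict(train_x)
```

Passing the client in as an argument keeps credentials and region configuration outside the helpers, and it also lets the functions be exercised with a stand-in client that has no network access.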