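I customized CombinedAttributesAdder so that it can report the correct column names for the engineered features, depending on whether the optional feature is included, as follows: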
import numpy as np
from sklearn.base import BaseEstimator, TransformerMixin

# Assumed column positions in the housing array (not shown in the original snippet)
rooms_ix, bedrooms_ix, population_ix, households_ix = 3, 4, 5, 6

# BaseEstimator provides get_params() and set_params(); TransformerMixin provides fit_transform()
class CombinedAttributesAdder(BaseEstimator, TransformerMixin):
    def __init__(self, add_bedrooms_per_room=True):  # constructor: takes the option from the caller
        self.add_bedrooms_per_room = add_bedrooms_per_room  # store it unchanged

    def fit(self, X, y=None):
        return self  # nothing to learn; returns the transformer itself, not the data

    def transform(self, X, y=None):  # transformation, driven by the column indices
        rooms_per_household = X[:, rooms_ix] / X[:, households_ix]
        population_per_household = X[:, population_ix] / X[:, households_ix]
        if self.add_bedrooms_per_room:
            bedrooms_per_room = X[:, bedrooms_ix] / X[:, rooms_ix]
            return np.c_[X, rooms_per_household, population_per_household,
                         bedrooms_per_room]
        else:
            return np.c_[X, rooms_per_household, population_per_household]

    def columns(self):  # names of the columns appended by transform()
        if self.add_bedrooms_per_room:
            return ["rooms_per_household", "population_per_household", "bedrooms_per_room"]
        else:
            return ["rooms_per_household", "population_per_household"]

attr_adder = CombinedAttributesAdder(add_bedrooms_per_room=False)
housing_extra_attribs = attr_adder.transform(housing.values)
This way, every time I call attr_adder.columns(), I get the correct column names without having to write them out each time.
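As a minimal sketch of that use, the reported names can be appended to the original column names to label the transformed array. The two-row frame `housing_demo` and the toy class `RatioAdder` are hypothetical stand-ins, not part of the code above; they only mimic the transform()/columns() pair.

```python
import numpy as np
import pandas as pd

# Made-up rows standing in for the real housing data (assumption).
housing_demo = pd.DataFrame({
    "total_rooms": [880.0, 7099.0],
    "total_bedrooms": [129.0, 1106.0],
    "population": [322.0, 2401.0],
    "households": [126.0, 1138.0],
})


class RatioAdder:
    """Toy stand-in exposing the same transform()/columns() pair."""

    def transform(self, X):
        # Column positions match the order of housing_demo above.
        rooms_per_household = X[:, 0] / X[:, 3]
        population_per_household = X[:, 2] / X[:, 3]
        return np.c_[X, rooms_per_household, population_per_household]

    def columns(self):
        return ["rooms_per_household", "population_per_household"]


adder = RatioAdder()
extra = adder.transform(housing_demo.values)

# Original names first, then whatever the transformer says it appended.
all_columns = list(housing_demo.columns) + adder.columns()
housing_extra = pd.DataFrame(extra, columns=all_columns, index=housing_demo.index)
```

Because the names come from the transformer, the labels stay in sync with the array whichever option the transformer was built with.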
The problem now is that I cannot save my model with joblib (the custom CombinedAttributesAdder is part of it), because I get the following error:
---------------------------------------------------------------------------
PicklingError                             Traceback (most recent call last)
<ipython-input-76-29425b213cb0> in <module>()
      2
      3 import joblib
----> 4 joblib.dump(final_model_ch2, "final_model_ch2.pkl") # DIFF
      5

50 frames
/usr/lib/python3.6/pickle.py in save_global(self, obj, name)
    925                 raise PicklingError(
    926                     "Can't pickle %r: it's not the same object as %s.%s" %
--> 927                     (obj, module_name, name))
    928
    929             if self.proto >= 2:

PicklingError: Can't pickle <class '__main__.CombinedAttributesAdder'>: it's not the same object as __main__.CombinedAttributesAdder
Is it possible to save the model while keeping the custom CombinedAttributesAdder?
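For context, one common way to get this exact message is to re-run the cell that defines a class after an instance of it already exists: the old instance still points at the old class object, while the name in the module now refers to a new one. Whether that is what happened in this notebook is an assumption. The sketch below reproduces the effect with the standard pickle module, using a hypothetical stand-in module `demo_notebook` and a toy class `Adder` in place of `__main__` and CombinedAttributesAdder.

```python
import pickle
import sys
import types

# Hypothetical stand-in for the notebook's __main__ namespace.
demo = types.ModuleType("demo_notebook")
sys.modules["demo_notebook"] = demo

CLASS_SOURCE = """
class Adder:
    def __init__(self, flag=True):
        self.flag = flag
"""


def run_cell(source):
    """Execute source in the stand-in module, like running a notebook cell."""
    exec(source, demo.__dict__)


def can_pickle(obj):
    """Return True if obj pickles, False if pickle raises PicklingError."""
    try:
        pickle.dumps(obj)
        return True
    except pickle.PicklingError:
        return False


run_cell(CLASS_SOURCE)                  # first run defines the class
old_instance = demo.Adder(flag=False)   # instance bound to the first class object

run_cell(CLASS_SOURCE)                  # re-running creates a new class object

stale_ok = can_pickle(old_instance)            # built before the re-run
fresh_ok = can_pickle(demo.Adder(flag=False))  # built from the current class
```

If this is the cause, re-creating and refitting the objects after the last run of the class cell, or moving the class into an importable `.py` module so it is defined only once, would be the usual ways around it while keeping the custom transformer.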