I am trying to build a pipeline that includes my custom transformers, defined as follows:
from sklearn.base import BaseEstimator, TransformerMixin

class DataFrameSelector(BaseEstimator, TransformerMixin):
    def __init__(self, attribute_names):
        self.attribute_names = attribute_names
    def fit(self, X, y=None):
        return self
    def transform(self, X):
        # Keep only the requested DataFrame columns
        return X[list(self.attribute_names)]
and
class DummyTransform(BaseEstimator, TransformerMixin):
    def __init__(self):
        return None
    def transform(self, X):
        return pd.get_dummies(X).values
    def fit(self, X, y=None):
        return self
But when I do this:
RF = RandomForestClassifier(n_estimators = 100,oob_score = True,random_state = 3)
pipe= Pipeline(steps=[
    ('Selector', DataFrameSelector(attribute_names=('lat','long','type'))), # selects the second and 4th column
    ('Encoder', DummyTransform() )
    ('clf',RF)
])
rforest=pipe.fit(X_train,Y_train)
I get the following error:
---------------------------------------------------------------------------
TypeError Traceback (most recent call last)
<ipython-input-168-108f5c7552a0> in <module>()
4 ('Selector', DataFrameSelector(attribute_names=('lat','long','type'))), # selects the second and 4th column
5 ('Encoder', DummyTransform() )
----> 6 ('clf',RF)
7 ])
8 rforest=pipe.fit(X_train,Y_train)
TypeError: 'tuple' object is not callable
Why?
PS: Strangely, this works:
RF=RandomForestClassifier(n_estimators=100,oob_score=True,random_state=3)
pipe= Pipeline(steps=[
    ('Selector', DataFrameSelector(attribute_names=('lat','long','type'))), # selects the second and 4th column
    ('Encoder', DummyTransform() )
    #('clf',DecisionTreeClassifier())
])
X=pipe.fit_transform(X_train,Y_train)
RF.fit(X,Y_train)
Edit: RF refers to this line of code:
RF = RandomForestClassifier(n_estimators = 100,oob_score = True,random_state = 3)
Answer 0 (score: 2)
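A comma is missing at the end of the line just above the one flagged in the traceback, after ('Encoder', DummyTransform() ). Without it, Python reads the two adjacent tuples as a single expression, ('Encoder', DummyTransform() )('clf',RF), which is an attempt to call the first tuple as a function. That is why the error says 'tuple' object is not callable. Your second version works because the ('clf', ...) line is commented out, so the 'Encoder' tuple becomes the last item in the list and no comma is needed after it.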
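The effect can be reproduced without scikit-learn. The following is only a minimal sketch: the strings 'selector', 'encoder' and 'classifier' are placeholders standing in for the real transformers and classifier, and the helper name build_steps is made up for this illustration. It runs the same list once without the comma and once with it.

```python
import warnings

# Placeholder strings stand in for the real transformers and classifier.
BROKEN = """
steps = [
    ('Selector', 'selector'),
    ('Encoder', 'encoder')
    ('clf', 'classifier')
]
"""

FIXED = """
steps = [
    ('Selector', 'selector'),
    ('Encoder', 'encoder'),
    ('clf', 'classifier')
]
"""


def build_steps(source):
    """Execute a snippet and return the `steps` list it defines (hypothetical helper)."""
    namespace = {}
    with warnings.catch_warnings():
        # Newer Python versions warn about this pattern at compile time;
        # silence that so only the runtime behaviour is shown.
        warnings.simplefilter("ignore", SyntaxWarning)
        exec(source, namespace)
    return namespace["steps"]


try:
    # The 'Encoder' tuple is "called" with ('clf', 'classifier') as arguments.
    build_steps(BROKEN)
    error_message = None
except TypeError as exc:
    error_message = str(exc)

# With the comma restored, the list has three separate (name, step) tuples.
fixed_steps = build_steps(FIXED)

print(error_message)
print(len(fixed_steps))
```

Applied to the pipeline in the question, the same fix is to write ('Encoder', DummyTransform() ), with a trailing comma before the ('clf',RF) line.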