category_encoders: TargetEncoder error "TypeError: Categorical cannot perform the operation mean"

Time: 2019-07-08 07:12:13

Tags: python machine-learning scikit-learn

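I get the following error when I try to encode my categorical columns:

"TypeError: Categorical cannot perform the operation mean"

When I run similar code in a Jupyter Notebook it works fine, but when I run it as part of a Python file it fails with the error message above.

I know this sounds a bit crazy, but I can't figure out what is happening behind the scenes.

Error: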

TypeError                                 Traceback (most recent call last)
<ipython-input-4-9ba53c6b7375> in <module>()
----> 1 tpmodeller.initialize()

~/SageMaker/tp/tp_kvr.py in initialize(self)
    127 
    128         # Target Encode Te_cat_col Features
--> 129         df_cat_te = target_encode_bin(self.train[self.Te_cat_col], self.train['vol_trmnt_in_4_quarters'])
    130         self.train = pd.concat([self.train, df_cat_te], axis=1)
    131 

~/SageMaker/tp/tp_kvr.py in target_encode_bin(df_te, target)
    366     te = TargetEncoder(smoothing = 1, min_samples_leaf = 5, handle_unknown='ignore')
--> 367     df_te = te.fit_transform(df_te, target)
    368     # 
    369     # Binning and then placing it in {col}_bin feature

~/anaconda3/envs/python3/lib/python3.6/site-packages/category_encoders/target_encoder.py in fit_transform(self, X, y, **fit_params)
    249             transform(X)
    250         
--> 251         return self.fit(X, y, **fit_params).transform(X, y)
    252 


~/anaconda3/envs/python3/lib/python3.6/site-packages/category_encoders/target_encoder.py in fit(self, X, y, **kwargs)
    138         self.ordinal_encoder = self.ordinal_encoder.fit(X)
    139         X_ordinal = self.ordinal_encoder.transform(X)
--> 140         self.mapping = self.fit_target_encoding(X_ordinal, y)
    141 
    142         X_temp = self.transform(X, override_return_df=True)

~/anaconda3/envs/python3/lib/python3.6/site-packages/category_encoders/target_encoder.py in fit_target_encoding(self, X, y)
    164             values = switch.get('mapping')
    165 
--> 166             prior = self._mean = y.mean()
    167 
    168             stats = y.groupby(X[col]).agg(['count', 'mean'])

~/anaconda3/envs/python3/lib/python3.6/site-packages/pandas/core/generic.py in stat_func(self, axis, skipna, level, numeric_only, **kwargs)
  10954                                       skipna=skipna)
  10955         return self._reduce(f, name, axis=axis, skipna=skipna,
> 10956                             numeric_only=numeric_only)
  10957 
  10958     return set_function_name(stat_func, name, cls)

~/anaconda3/envs/python3/lib/python3.6/site-packages/pandas/core/series.py in _reduce(self, op, name, axis, skipna, numeric_only, filter_type, **kwds)
   3614             # TODO deprecate numeric_only argument for Categorical and use
   3615             # skipna as well, see GH25303
-> 3616             return delegate._reduce(name, numeric_only=numeric_only, **kwds)
   3617         elif isinstance(delegate, ExtensionArray):
   3618             # dispatch to ExtensionArray interface

~/anaconda3/envs/python3/lib/python3.6/site-packages/pandas/core/arrays/categorical.py in _reduce(self, name, axis, **kwargs)
   2170         if func is None:
   2171             msg = 'Categorical cannot perform the operation {op}'
-> 2172             raise TypeError(msg.format(op=name))
   2173         return func(**kwargs)
   2174 

TypeError: Categorical cannot perform the operation mean

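1 Answer:

Answer 0: (score: 0)

Your target column needs to be converted to a numeric type. The traceback shows `y.mean()` failing inside `fit_target_encoding`, which means the target passed to `fit_transform` (`self.train['vol_trmnt_in_4_quarters']`) has the pandas `category` dtype when the Python file runs, and pandas cannot compute a mean over a Categorical.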
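
As a minimal sketch of that conversion, assuming the target holds 0/1 values stored with `category` dtype (the real data is not shown in the question, so the sample values below are made up), the failure and the fix can be reproduced with pandas alone:

```python
import pandas as pd

# Hypothetical stand-in for the target column; the values are invented for illustration.
y = pd.Series([0, 1, 1, 0, 1], dtype="category", name="vol_trmnt_in_4_quarters")

# Reproduce the failure: mean() is not defined for a category-dtype Series.
try:
    y.mean()
    mean_failed = False
except TypeError as exc:
    mean_failed = True
    print("TypeError:", exc)

# Fix: cast the target to a numeric dtype before passing it to the encoder.
y_numeric = y.astype("int64")
prior = y_numeric.mean()
print(y_numeric.dtype, prior)
```

In the question's code, the same cast would be applied to `self.train['vol_trmnt_in_4_quarters']` before it is passed to `target_encode_bin`. Printing that column's `.dtype` in both the notebook and the Python file may also show where the two runs diverge, for example if the file casts a list of columns to `category` that happens to include the target; that is a guess, since the loading code is not shown.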