Correlation matrix for categorical and numerical variables not working

Time: 2018-08-15 22:29:53

Tags: python pandas scikit-learn

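I am trying to use LabelEncoder to convert categorical columns to integers so that I can build a correlation matrix over a mix of numerical and categorical variables. These are the column dtypes of my table: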

a   int64
b   int64
c   object
d   object
e   object
f   object
g   object
dtype: object

from sklearn import preprocessing
le = preprocessing.LabelEncoder()
for x in df.columns:
    if df[x].dtypes == 'object':
        df[x] = le.fit_transform(df[x])
corr = df.corr()

Then I get this error:

TypeError: unorderable types: int() < str()
TypeError Traceback (most recent call last)
<command-205607> in <module>()
      3 for x in df.columns:
      4     if df[x].dtypes=='object':
----> 5        df[x]=le.fit_transform(df[x])
      6 corr = df.corr()

/databricks/python/lib/python3.5/site-packages/sklearn/preprocessing/label.py in fit_transform(self, y)
    129         y = column_or_1d(y, warn=True)
    130         _check_numpy_unicode_bug(y)
--> 131         self.classes_, y = np.unique(y, return_inverse=True)
    132         return y
    133 

/databricks/python/lib/python3.5/site-packages/numpy/lib/arraysetops.py in unique(ar, return_index, return_inverse, return_counts, axis)
    221     ar = np.asanyarray(ar)
    222     if axis is None:
--> 223         return _unique1d(ar, return_index, return_inverse, return_counts)
    224     if not (-ar.ndim <= axis < ar.ndim):
    225         raise ValueError('Invalid axis kwarg specified for unique')

/databricks/python/lib/python3.5/site-packages/numpy/lib/arraysetops.py in _unique1d(ar, return_index, return_inverse, return_counts)
    278 
    279     if optional_indices:
--> 280         perm = ar.argsort(kind='mergesort' if return_index else 'quicksort')
    281         aux = ar[perm]
    282     else:

TypeError: unorderable types: int() < str()

Does anyone know what is going wrong?

1 answer:

Answer 0 (score: 0):

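Change df[x]=le.fit_transform(df[x]) to

df[x] = le.fit_transform(df[x].astype(str))

and it should work. The message "unorderable types: int() < str()" suggests that at least one of your object columns holds both integers and strings. LabelEncoder sorts the values through np.unique, and Python 3 cannot compare an int with a str. Casting the column to str first makes every value comparable.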
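
To make the fix concrete, here is a minimal sketch on made-up data. The DataFrame contents and the column names are assumptions for illustration only (the real table is not shown in the question). Column "c" deliberately mixes an int with strings to reproduce the situation the traceback points to. The sketch checks for non-numeric columns with pd.api.types.is_numeric_dtype instead of comparing against 'object', which is a substitution on my part so that it also covers pandas versions that use a dedicated string dtype.

```python
import pandas as pd
from sklearn.preprocessing import LabelEncoder

# Hypothetical data: column "c" mixes an int with strings, which is the
# kind of column that triggers "unorderable types: int() < str()".
df = pd.DataFrame({
    "a": [1, 2, 3, 4],
    "b": [10, 20, 30, 40],
    "c": ["x", 1, "y", "x"],
    "d": ["low", "high", "low", "mid"],
})

# Diagnostic: list the Python types actually stored in the suspect column.
types_in_c = sorted(t.__name__ for t in set(df["c"].map(type)))
print(types_in_c)

for col in df.columns:
    if not pd.api.types.is_numeric_dtype(df[col]):
        # Cast to str so all values are comparable, then encode.
        # A fresh encoder per column keeps each column's classes separate.
        df[col] = LabelEncoder().fit_transform(df[col].astype(str))

corr = df.corr()
print(corr)
```

Note that the integer codes follow the sorted order of the string labels, so the ordering is arbitrary with respect to the categories' meaning. The resulting correlation values for encoded columns should be read with that in mind.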