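This DataFrame was loaded from a SQL table, and I am at the final stage of EDA.
I cannot find the error in the code that encodes the DataFrame.
I also tried encoding each column separately, and that gave the same error.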
Previous line:
D1.info()
Result:
<class 'pandas.core.frame.DataFrame'>
Int64Index: 35735 entries, 0 to 46605
Data columns (total 11 columns):
c_CI_Cat 35735 non-null object
c_Closure_Code 35735 non-null object
c_WBS 35735 non-null object
q_No_of_Reassignments 35735 non-null int64
q_No_of_Related_Incidents 35735 non-null float64
q_No_of_Related_Interactions 35735 non-null float64
t_Close_Time 35735 non-null datetime64[ns]
t_Open_Time 35735 non-null datetime64[ns]
t_ReopenFlag 35735 non-null float64
t_TicketWIPDurationDays 35735 non-null float64
y_Priority 35735 non-null object
dtypes: datetime64[ns](2), float64(4), int64(1), object(4)
memory usage: 3.3+ MB
Error line:
enc = LabelEncoder()
CatVarList = ['c_CI_Cat', 'c_Closure_Code', 'c_WBS', 't_ReopenFlag', 'y_Priority']
for i in CatVarList:
    D1[[i]] = enc.fit_transform(D1[[i]])
Error details:
/anaconda3/lib/python3.7/site-packages/sklearn/preprocessing/label.py:235: DataConversionWarning: A column-vector y was passed when a 1d array was expected. Please change the shape of y to (n_samples, ), for example using ravel().
y = column_or_1d(y, warn=True)
---------------------------------------------------------------------------
TypeError Traceback (most recent call last)
<ipython-input-53-4ba8389dabbc> in <module>
2 CatVarList = ['c_CI_Cat', 'c_Closure_Code', 'c_WBS','t_ReopenFlag','y_Priority']
3 for i in CatVarList:
----> 4 D1[[i]] = enc.fit_transform(D1[[i]])
5
6 D1.head()
/anaconda3/lib/python3.7/site-packages/sklearn/preprocessing/label.py in fit_transform(self, y)
234 """
235 y = column_or_1d(y, warn=True)
--> 236 self.classes_, y = _encode(y, encode=True)
237 return y
238
/anaconda3/lib/python3.7/site-packages/sklearn/preprocessing/label.py in _encode(values, uniques, encode)
106 """
107 if values.dtype == object:
--> 108 return _encode_python(values, uniques, encode)
109 else:
110 return _encode_numpy(values, uniques, encode)
/anaconda3/lib/python3.7/site-packages/sklearn/preprocessing/label.py in _encode_python(values, uniques, encode)
61 # only used in _encode below, see docstring there for details
62 if uniques is None:
---> 63 uniques = sorted(set(values))
64 uniques = np.array(uniques, dtype=values.dtype)
65 if encode:
TypeError: '<' not supported between instances of 'str' and 'int'
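The final `TypeError` comes from `sorted(set(values))`, which suggests that at least one of the `object` columns holds both `str` and `int` values. A minimal sketch for checking this follows. The frame `df` and its values are made-up stand-ins for `D1`, and `python_types_per_column` is a hypothetical helper, not part of pandas:

```python
import pandas as pd

# Hypothetical stand-in for D1: c_WBS mixes str and int values (assumed data).
df = pd.DataFrame({
    "c_WBS": ["WBS000162", 73, "WBS000088", 73],
    "y_Priority": ["4", "3", "4", "5"],
})

def python_types_per_column(frame):
    """Map each column name to the set of Python type names found in it."""
    return {
        col: set(frame[col].map(lambda v: type(v).__name__))
        for col in frame.columns
    }

# Keep only the columns that contain more than one Python type.
mixed = {
    col: names
    for col, names in python_types_per_column(df).items()
    if len(names) > 1
}
print(mixed)

# Reproduce the root cause without sklearn: Python 3 cannot order str against int.
try:
    sorted(set(df["c_WBS"]))
    sort_failed = False
except TypeError:
    sort_failed = True
print(sort_failed)
```

Running the same helper on `D1` should show which of the columns in `CatVarList` mix types, if that is indeed the cause here.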
I need to encode these columns before I can run any algorithms on the data.
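One possible way to get the encoding to run, assuming the cause is mixed `str` and `int` values in an `object` column: cast each column to `str` first and pass a 1-D Series (`D1[col]`) rather than a one-column DataFrame (`D1[[col]]`), which also avoids the `DataConversionWarning`. The sample values below are made up for illustration, and the `encoders` dict is an addition that is not in the original code:

```python
import pandas as pd
from sklearn.preprocessing import LabelEncoder

# Made-up sample standing in for D1; c_WBS deliberately mixes str and int.
D1 = pd.DataFrame({
    "c_CI_Cat": ["application", "storage", "application"],
    "c_WBS": ["WBS000162", 73, "WBS000162"],
    "t_ReopenFlag": [0.0, 1.0, 0.0],
})
CatVarList = ["c_CI_Cat", "c_WBS", "t_ReopenFlag"]

# One encoder per column, kept so the codes can be mapped back later.
encoders = {}
for col in CatVarList:
    enc = LabelEncoder()
    # astype(str) makes every value comparable; D1[col] is 1-D, as LabelEncoder expects.
    D1[col] = enc.fit_transform(D1[col].astype(str))
    encoders[col] = enc

print(D1)
print(encoders["c_WBS"].classes_)
```

Keeping a separate `LabelEncoder` per column matters because a single shared encoder is refitted on every loop iteration and only remembers the classes of the last column. Note that `astype(str)` would turn missing values into the string `"nan"`; the `D1.info()` output shows no nulls, so that should not apply here.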