Problem with columns in scikit-learn one-hot encoding

Time: 2019-05-03 16:43:06

Tags: python pandas scikit-learn one-hot-encoding

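I am currently writing a scikit-learn one-hot encoding script, but when I print the columns, only one is printed although there are four. Below are the one-hot encoding code and the head of the DataFrame. I am confused because when I change X back to df on line 9, all of the columns are printed, but when I print the last line at the end, only the first column is printed.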

Code:
import numpy as np
import pandas as pd

X = df = pd.read_csv('Filename.txt')
#print(X.head(4))

X = X.select_dtypes(include=[object])
#print(X.head(4))

print(X.shape)

from sklearn import preprocessing

print(df.columns)

le = preprocessing.LabelEncoder()

X_2 = X.apply(le.fit_transform)
#print(X_2.head())

enc = preprocessing.OneHotEncoder()

enc.fit(X_2)

onehotlabels = enc.transform(X_2).toarray()
onehotlabels.shape
#print(onehotlabels)

Dataframe head:
2019-05-02,6,9,5
2019-05-01,0,4,4
2019-04-30,5,4,4
2019-04-29,2,4,7
2019-04-28,7,5,2
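
A possible explanation, sketched under assumptions: `select_dtypes(include=[object])` keeps only object-dtype columns, so if the three numeric columns are read as integers, only the date column would remain in X. The snippet below rebuilds the five rows shown above to illustrate this. The column names `date`, `a`, `b`, `c` are hypothetical (the file's header is not shown), and passing the whole frame to `OneHotEncoder` is only one possible alternative, not necessarily what the script was meant to do.

```python
import pandas as pd
from sklearn.preprocessing import OneHotEncoder

# Sample rows from the question; the column names are hypothetical,
# since the original file's header is not shown.
df = pd.DataFrame({
    "date": pd.Series(
        ["2019-05-02", "2019-05-01", "2019-04-30", "2019-04-29", "2019-04-28"],
        dtype=object,
    ),
    "a": [6, 0, 5, 2, 7],
    "b": [9, 4, 4, 4, 5],
    "c": [5, 4, 4, 7, 2],
})

# select_dtypes(include=[object]) keeps only object-dtype columns,
# so the three integer columns are dropped at this step.
only_objects = df.select_dtypes(include=[object])
print(only_objects.columns.tolist())

# One possible alternative (an assumption about the intent): encode every
# column by passing the full frame to OneHotEncoder, which accepts both
# string and integer categories.
enc = OneHotEncoder()
onehotlabels = enc.fit_transform(df).toarray()
print(onehotlabels.shape)
```

If the goal is to encode only some of the columns, the same call could be given a subset such as `df[["a", "b"]]` instead of the whole frame.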

0 answers:

No answers