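I'm currently writing a scikit-learn one-hot encoding script, but when I print the columns there is only one, when there should be four. Below are the one-hot encoding code and the dataframe head. I'm confused because when I change X back to df on line 9, all the columns print, but when I print the final line at the end, only the first column prints.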
Code:
import numpy as np
import pandas as pd
X = df = pd.read_csv('Filename.txt')
#print(X.head(4))
X = X.select_dtypes(include=[object])
#print(X.head(4))
print(X.shape)
from sklearn import preprocessing
print(df.columns)
le = preprocessing.LabelEncoder()
X_2 = X.apply(le.fit_transform)
#print(X_2.head())
enc = preprocessing.OneHotEncoder()
enc.fit(X_2)
onehotlabels = enc.transform(X_2).toarray()
print(onehotlabels.shape)
#print(onehotlabels)
Dataframe head:
2019-05-02,6,9,5
2019-05-01,0,4,4
2019-04-30,5,4,4
2019-04-29,2,4,7
2019-04-28,7,5,2
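
A minimal sketch of how the head above is typed when it is loaded, which may explain the single column. It assumes the file has no header row, and the column names `date`, `a`, `b`, `c` are hypothetical placeholders, not names from the real file. The rows are inlined with `io.StringIO` so the snippet runs without `Filename.txt`.

```python
import io

import pandas as pd

# Rows copied from the dataframe head; inlined so the sketch is self-contained.
raw = """2019-05-02,6,9,5
2019-05-01,0,4,4
2019-04-30,5,4,4
2019-04-29,2,4,7
2019-04-28,7,5,2
"""

# Assumption: no header row. The column names below are hypothetical.
df = pd.read_csv(io.StringIO(raw), header=None, names=["date", "a", "b", "c"])

# Inspect how pandas typed each column.
print(df.dtypes)

# Split the columns into numeric and non-numeric groups.
numeric_cols = df.select_dtypes(include=["number"]).columns.tolist()
text_cols = df.select_dtypes(exclude=["number"]).columns.tolist()
print("numeric:", numeric_cols)
print("non-numeric:", text_cols)

# The same filter as in the script: it selects by dtype, so the three
# integer columns are dropped before any encoding happens.
X = df.select_dtypes(include=[object])
print(X.shape)
```

If the real file is typed the same way, the dtype filter would be what removes the three numeric columns, and `df` would still hold all four, which matches what printing `df.columns` shows.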