I am trying to use scikit-learn's ColumnTransformer class as a true DataFrame transformer.
import numpy as np
import matplotlib.pyplot as plt
import pandas as pd
from sklearn.preprocessing import LabelEncoder, OneHotEncoder
from sklearn.compose import ColumnTransformer
dataset = pd.read_csv('dataset.csv')
X = dataset.iloc[:, 3:13].values
y = dataset.iloc[:, 13].values
transformer = ColumnTransformer(transformers=[("OneHot", OneHotEncoder(), [1]), ("LabEnc", LabelEncoder(), [2])], remainder="passthrough")
X = transformer.fit_transform(X)
But the last line raises an error:
TypeError: fit_transform() takes 2 positional arguments but 3 were given
For reference, X (before the transform) is:
array([[619, 'France', 'Female', ..., 1, 1, 101348.88],
       [608, 'Spain', 'Female', ..., 0, 1, 112542.58],
       [502, 'France', 'Female', ..., 1, 0, 113931.57],
       ...,
       [709, 'France', 'Female', ..., 0, 1, 42085.58],
       [772, 'Germany', 'Male', ..., 1, 0, 92888.52],
       [792, 'France', 'Female', ..., 1, 0, 38190.78]], dtype=object)
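
A likely cause, as far as I can tell: LabelEncoder is meant for the target y, and its fit_transform accepts only one argument, whereas ColumnTransformer calls fit_transform(X, y) on each transformer, which would explain the "3 were given" message. Below is a minimal sketch of one possible workaround that swaps in OrdinalEncoder for the feature column. The small X_demo array is a hypothetical stand-in for the rows shown above (columns trimmed), the "OrdEnc" name is my own, and sparse_threshold=0 is only there to force a dense result.

```python
import numpy as np
from sklearn.compose import ColumnTransformer
from sklearn.preprocessing import OneHotEncoder, OrdinalEncoder

# Hypothetical stand-in for a few rows of X (only four columns kept).
X_demo = np.array(
    [
        [619, "France", "Female", 101348.88],
        [608, "Spain", "Female", 112542.58],
        [772, "Germany", "Male", 92888.52],
    ],
    dtype=object,
)

transformer = ColumnTransformer(
    transformers=[
        # One-hot encode the country column (index 1).
        ("OneHot", OneHotEncoder(), [1]),
        # OrdinalEncoder works on 2D feature input, unlike LabelEncoder,
        # which is designed for a 1D target vector.
        ("OrdEnc", OrdinalEncoder(), [2]),
    ],
    remainder="passthrough",
    sparse_threshold=0,  # always return a dense array
)

# Output column order: one-hot columns, ordinal column, then the remainder.
X_encoded = transformer.fit_transform(X_demo)
print(X_encoded.shape)
```

If the target column also needs encoding, LabelEncoder can still be applied to y on its own, outside the ColumnTransformer.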