Background: I am developing a model with scikit-learn. I use the sklearn.cross_validation module to split my data into separate training and test sets, as follows:
def train_test_split(input_data):
    from sklearn.cross_validation import train_test_split
    ### STEP 1: Separate y variable and remove from X
    y = input_data['price']
    X = input_data.copy()
    X.drop('price', axis=1, inplace=True)
    ### STEP 2: Split into training & test sets
    X_train, X_test, y_train, y_test =\
        train_test_split(X, y, test_size=0.2, random_state=0)
    return X_train, X_test, y_train, y_test
My problem: when I move the sklearn.cross_validation import outside of my function, I get the error shown below:
from sklearn.cross_validation import train_test_split
def train_test_split(input_data):
    ### STEP 1: Separate y variable and remove from X
    y = input_data['price']
    X = input_data.copy()
    X.drop('price', axis=1, inplace=True)
    ### STEP 2: Split into training & test sets
    X_train, X_test, y_train, y_test =\
        train_test_split(X, y, test_size=0.2, random_state=0)
    return X_train, X_test, y_train, y_test
Error:
TypeError: train_test_split() got an unexpected keyword argument 'test_size'
Any idea why?
Answer 0 (score: 4)
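You are importing the function train_test_split from sklearn.cross_validation and then overriding that name with your local function train_test_split. The call inside your function therefore resolves to your own function, which only accepts input_data, hence the TypeError about test_size.
Try: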
from sklearn.cross_validation import train_test_split as sk_train_test_split
def train_test_split(input_data):
    ### STEP 1: Separate y variable and remove from X
    y = input_data['price']
    X = input_data.copy()
    X.drop('price', axis=1, inplace=True)
    ### STEP 2: Split into training & test sets
    X_train, X_test, y_train, y_test =\
        sk_train_test_split(X, y, test_size=0.2, random_state=0) # use the imported function instead of local one
    return X_train, X_test, y_train, y_test
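
To see the shadowing without scikit-learn installed, here is a minimal sketch. It assumes a hypothetical stand-in, library_split, in place of the real library function, and a hypothetical wrapper name, split_data; neither comes from the question, and the stand-in does not shuffle the way the real function does.

```python
# Hypothetical stand-in for the library function (NOT scikit-learn's implementation).
def library_split(X, y, test_size=0.25, random_state=None):
    n_test = round(len(X) * test_size)
    cut = len(X) - n_test
    return X[:cut], X[cut:], y[:cut], y[cut:]

# Simulates: from sklearn.cross_validation import train_test_split
train_test_split = library_split

# This def rebinds the same module-level name, so the "imported" function
# is no longer reachable as train_test_split.
def train_test_split(input_data):
    X = [row[0] for row in input_data]
    y = [row[1] for row in input_data]
    # The name now resolves to this very function, which has no 'test_size' parameter.
    return train_test_split(X, y, test_size=0.2, random_state=0)

data = [(i, i * 10) for i in range(10)]

try:
    train_test_split(data)
    shadow_error = None
except TypeError as exc:
    shadow_error = str(exc)
print(shadow_error)

# Fix: keep the library function under a distinct name, as the alias above does.
sk_train_test_split = library_split

def split_data(input_data):
    X = [row[0] for row in input_data]
    y = [row[1] for row in input_data]
    return sk_train_test_split(X, y, test_size=0.2, random_state=0)

X_train, X_test, y_train, y_test = split_data(data)
print(len(X_train), len(X_test))
```

The original version with the import inside the function worked because that import binds train_test_split as a local name within the function body, where it takes precedence over the module-level function. Renaming the wrapper (as split_data does here) avoids the clash just as well as aliasing the import. Note also that newer scikit-learn releases provide train_test_split in sklearn.model_selection rather than sklearn.cross_validation.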