How do I fix a scikit-learn model_selection train_test_split problem?

Date: 2019-09-23 00:51:00

Tags: python-3.x

I get the following error when I use sklearn's train_test_split to split my data into training and test sets.

    from sklearn.model_selection import train_test_split
    X_train, y_train,X_test, y_test = train_test_split(X, y, test_size=0.2, random_state=0)

    NameError                                 Traceback (most recent call last)
    <ipython-input-17-65776283812c> in <module>
          1 from sklearn.model_selection import train_test_split
    ----> 2 X_train, y_train,X_test, y_test = train_test_split(X, y, test_size=0.2, random_state=0)

    NameError: name 'X' is not defined

1 Answer:

Answer 0: (score: 0)

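Did you define the X array? The NameError means nothing has been assigned to X in the current session, so X has to be set to something before the call. For example, X = fake.values.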
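As a minimal sketch of the fix, assuming small synthetic NumPy arrays (the data below is made up purely for illustration), X and y are assigned before train_test_split is called. Note also that train_test_split returns its results in the order X_train, X_test, y_train, y_test, which differs from the unpacking order used in the question.

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Hypothetical data: 10 samples with 2 features each, plus one label per sample
X = np.arange(20).reshape(10, 2)
y = np.arange(10)

# X and y now exist, so the call no longer raises NameError.
# The return order is X_train, X_test, y_train, y_test.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0
)

print(X_train.shape, X_test.shape, y_train.shape, y_test.shape)
```

With 10 samples and test_size=0.2, 8 rows go to the training set and 2 to the test set.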

Example code

Load the dataset

import pandas as pd

df = pd.read_csv("datasets//ccfraud.csv")
print(df.shape)
# remove the fields we don't want to include from the data set
del df['Merchant_id']
del df['Transaction_date']

# replace categorical data with one-hot encoded data
features_df = pd.get_dummies(df, columns=['Is_declined','isForeignTransaction','isHighRiskCountry'])

# remove the target label from the feature data
del features_df['isFradulent']
features_df.columns
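To illustrate what pd.get_dummies does with the columns argument, here is a small sketch on a made-up DataFrame (the column names and values are hypothetical and do not come from ccfraud.csv). Each listed column is replaced by one indicator column per distinct value, and the remaining columns pass through unchanged.

```python
import pandas as pd

# Hypothetical toy data standing in for the real CSV
toy = pd.DataFrame({
    "amount": [10.0, 25.5, 7.2],
    "is_declined": ["Y", "N", "N"],
})

# One-hot encode only the categorical column
encoded = pd.get_dummies(toy, columns=["is_declined"])
print(list(encoded.columns))
```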

Create the x and y arrays

from sklearn.model_selection import train_test_split

x = features_df.values      # feature matrix
y = df['isFradulent'].values  # target labels
print("arrays created")
print("===========Now running T T S =======")
# split the data set into a training set (70%) and a test set (30%)
x_train, x_test, y_train, y_test = train_test_split(x, y, test_size=0.3, random_state=7, shuffle=True)
print("====Train, Test, Split Shuffled===")
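A quick way to sanity-check a split like the one above is to look at the sizes of the resulting arrays. The sketch below uses made-up arrays in place of the real features and labels (an assumption, since the CSV is not available here) and also shows that reusing the same random_state reproduces the same split.

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Hypothetical stand-ins for the real feature and label arrays
x = np.arange(200).reshape(100, 2)
y = np.arange(100) % 2

x_train, x_test, y_train, y_test = train_test_split(
    x, y, test_size=0.3, random_state=7, shuffle=True
)
print(len(x_train), len(x_test))

# Repeating the call with the same random_state yields the same rows
x_train_again, _, _, _ = train_test_split(
    x, y, test_size=0.3, random_state=7, shuffle=True
)
print(np.array_equal(x_train, x_train_again))
```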