When splitting my data into training and test sets with sklearn's train_test_split, I ran into the following error.
from sklearn.model_selection import train_test_split
X_train, y_train,X_test, y_test = train_test_split(X, y, test_size=0.2, random_state=0)

NameError                                 Traceback (most recent call last)
<ipython-input-17-65776283812c> in <module>
      1 from sklearn.model_selection import train_test_split
----> 2 X_train, y_train,X_test, y_test = train_test_split(X, y, test_size=0.2, random_state=0)

NameError: name 'X' is not defined
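Answer 0 (score: 0)
Have you defined the X array, that is, assigned something to X before calling train_test_split? For example, X = fake.values. Python names are case-sensitive, so X and x are different variables.
Sample code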
import pandas as pd
from sklearn.model_selection import train_test_split

# load the data set
df = pd.read_csv("datasets//ccfraud.csv")
print(df.shape)
# remove the fields from the data set that we don't want to include
del df['Merchant_id']
del df['Transaction_date']
# replace categorical data with one-hot encoded data
features_df = pd.get_dummies(df, columns=['Is_declined', 'isForeignTransaction', 'isHighRiskCountry'])
# remove the target column from the feature data
del features_df['isFradulent']
features_df.columns
# define the feature matrix x and the target vector y before splitting
x = features_df.values
y = df['isFradulent'].values
print("arrays created")
print("===========Now running T T S =======")
# split the data set into a training set (70%) and a test set (30%)
x_train, x_test, y_train, y_test = train_test_split(x, y, test_size=0.3, random_state=7, shuffle=True)
print("====Train, Test, Split Shuffled===")
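
If you don't have that CSV at hand, here is a minimal sketch of the same idea using made-up data. The arrays X and y below are hypothetical stand-ins for your real features and labels, built with NumPy only for illustration. Note also that train_test_split returns its results in the order X_train, X_test, y_train, y_test, which differs from the unpacking order in the question.

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Hypothetical stand-in data: 10 samples with 2 features each.
X = np.arange(20).reshape(10, 2)
# Hypothetical binary labels, one per sample.
y = np.arange(10) % 2

# X and y now exist, so the call no longer raises NameError.
# Return order is: X_train, X_test, y_train, y_test.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0
)

print(X_train.shape, X_test.shape)
print(y_train.shape, y_test.shape)
```

With 10 samples and test_size=0.2, 8 rows go to the training set and 2 to the test set.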