How do I use train_test_split in this case?

Asked: 2018-04-23 23:29:50

Tags: scikit-learn

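 import pandas as pd
 from sklearn.model_selection import train_test_split
 Data1 = pd.read_csv(r"C:\Users\Zihao\Desktop\New\OBSTET.csv", index_col=0)
 Data1.fillna(0, inplace=True)  # replace missing values with 0
 Dependent = Data1.iloc[:, 0]   # first column is the dependent variable
 X_train, y_train, x_test, y_test = train_test_split()  # unsure which arguments go here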

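This is my data. I know the first column is the dependent variable and the remaining columns are the independent variables.

How do I split it? I'm not sure which arguments I should pass.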

1 answer:

Answer 0 (score: 2)

If you are trying to predict your Dependent variable, that is your "y", and the independent variables are your "X".

If that is the case:

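# .iloc is used here because .ix was removed in pandas 1.0
Dependent = Data1.iloc[:, 0]     # your "y"
Independent = Data1.iloc[:, 1:]  # the rest of the columns (commonly referred to as "X")
# note the return order: both X pieces first, then both y pieces
X_train, x_test, y_train, y_test = train_test_split(Independent, Dependent)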

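By default this puts 75% of the rows into X_train and y_train and the remaining 25% into x_test and y_test, because test_size defaults to 0.25 when neither test_size nor train_size is given. The rows are shuffled before splitting unless you pass shuffle=False.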
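To make the split concrete, here is a minimal, self-contained sketch. It does not use the asker's OBSTET.csv; the DataFrame, its column names ("target", "f1", "f2", "f3") and the random_state value are made up for illustration. It assumes, as in the question, that the first column is the dependent variable.

```python
import numpy as np
import pandas as pd
from sklearn.model_selection import train_test_split

# Hypothetical stand-in for the real CSV: 100 rows, column 0 is the target,
# the remaining three columns are features. Names are illustrative only.
rng = np.random.default_rng(0)
data = pd.DataFrame(rng.normal(size=(100, 4)),
                    columns=["target", "f1", "f2", "f3"])

y = data.iloc[:, 0]   # dependent variable (first column)
X = data.iloc[:, 1:]  # independent variables (all other columns)

# test_size=0.25 matches the default; random_state makes the shuffle repeatable.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=42
)

print(X_train.shape, X_test.shape)  # (75, 3) (25, 3)
print(y_train.shape, y_test.shape)  # (75,) (25,)
```

Passing X and y together keeps the rows aligned, so each row of X_train still corresponds to the same row of y_train. If you want a different proportion, change test_size (for example 0.2 for an 80/20 split).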