For the following scikit-learn function: train_test_split():
Is it possible to tell the function where to set the split in the data?
Or in other words:
Can I tell the function which data should end up on the left side of the split point and which on the right side?
(And does the split actually work this way, or does it just take arbitrary rows of the input data until the split ratio is reached?)
If the function cannot be told which data to use for training and which for testing: is there an equivalent alternative for this use case?
Answer 0 (score: 4)
From the scikit-learn documentation: Split arrays or matrices into random train and test subsets.
>>> import numpy as np
>>> from sklearn.model_selection import train_test_split
>>> X, y = np.arange(10).reshape((5, 2)), range(5)
>>> X
array([[0, 1],
       [2, 3],
       [4, 5],
       [6, 7],
       [8, 9]])
>>> list(y)
[0, 1, 2, 3, 4]
>>> X_train, X_test, y_train, y_test = train_test_split(
...     X, y, test_size=0.33, random_state=42)
...
>>> X_train
array([[4, 5],
       [0, 1],
       [6, 7]])
>>> y_train
[2, 0, 3]
>>> X_test
array([[2, 3],
       [8, 9]])
>>> y_test
[1, 4]
You can also turn off shuffling:
>>> train_test_split(y, shuffle=False)
[[0, 1, 2], [3, 4]]
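To tie this back to the question: with shuffle=False the rows are not drawn at random, so the leading part of the input becomes the training set and the trailing part the test set, with the split point set by test_size (or train_size). A minimal sketch, assuming eight rows and a test_size of 0.25 (both values are chosen only for illustration):

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Illustrative data: eight rows with two features, one label per row.
X = np.arange(16).reshape((8, 2))
y = np.arange(8)

# shuffle=False keeps the original row order: the first 75% of the rows
# go to the training set and the last 25% to the test set.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, shuffle=False)

print(y_train)  # [0 1 2 3 4 5]
print(y_test)   # [6 7]
```

The split point therefore sits after row 5; everything to its left is training data and everything to its right is test data.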
Answer 1 (score: 2)
A solution using KFold looks like this:
import numpy as np
from sklearn.model_selection import KFold

X = np.arange(20).reshape((10, 2))
# Note: y has 20 entries while X has only 10 rows, so only y[0] to y[9] are indexed below.
y = np.arange(20)
print(X)
print(y)

# Without shuffling, KFold uses consecutive blocks of rows as the test set of each fold.
kf = KFold(n_splits=10)
for train_index, test_index in kf.split(X):
    print("--")
    print("TRAIN size: {0:5d} from: {1:5d} to: {2:5d}".format(train_index.size, train_index[0], train_index[train_index.size - 1]))
    print("TEST size: {0:5d} from: {1:5d} to: {2:5d}".format(test_index.size, test_index[0], test_index[test_index.size - 1]))
    X_train, X_test = X[train_index], X[test_index]
    y_train, y_test = y[train_index], y[test_index]
Result:
[[ 0  1]
 [ 2  3]
 [ 4  5]
 [ 6  7]
 [ 8  9]
 [10 11]
 [12 13]
 [14 15]
 [16 17]
 [18 19]]
[ 0  1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16 17 18 19]
--
TRAIN size:     9 from:     1 to:     9
TEST size:     1 from:     0 to:     0
--
TRAIN size:     9 from:     0 to:     9
TEST size:     1 from:     1 to:     1
--
TRAIN size:     9 from:     0 to:     9
TEST size:     1 from:     2 to:     2
--
TRAIN size:     9 from:     0 to:     9
TEST size:     1 from:     3 to:     3
--
TRAIN size:     9 from:     0 to:     9
TEST size:     1 from:     4 to:     4
--
TRAIN size:     9 from:     0 to:     9
TEST size:     1 from:     5 to:     5
--
TRAIN size:     9 from:     0 to:     9
TEST size:     1 from:     6 to:     6
--
TRAIN size:     9 from:     0 to:     9
TEST size:     1 from:     7 to:     7
--
TRAIN size:     9 from:     0 to:     9
TEST size:     1 from:     8 to:     8
--
TRAIN size:     9 from:     0 to:     8
TEST size:     1 from:     9 to:     9
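If the goal is to name exactly which rows go into the test set, rather than relying on their position, one possible alternative is PredefinedSplit from sklearn.model_selection. The sketch below is only an illustration: picking rows 2, 5 and 7 as test rows is an assumption made for the example, not something taken from the question.

```python
import numpy as np
from sklearn.model_selection import PredefinedSplit

X = np.arange(20).reshape((10, 2))
y = np.arange(10)

# test_fold[i] == -1 keeps row i out of every test set;
# test_fold[i] == 0 puts row i into the test set of fold 0.
test_fold = np.full(len(X), -1)
test_fold[[2, 5, 7]] = 0  # hypothetical choice of test rows

ps = PredefinedSplit(test_fold)
train_index, test_index = next(ps.split())

X_train, X_test = X[train_index], X[test_index]
y_train, y_test = y[train_index], y[test_index]

print(train_index)  # [0 1 3 4 6 8 9]
print(test_index)   # [2 5 7]
```

Because PredefinedSplit is a cross-validation splitter, the same object can also be passed as the cv argument to tools such as cross_val_score.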