I have 4 files: train.txt, trainLabel.txt, test.txt, testLabel.txt
train.txt
1,60,feature_col0,feature_col1,feature_col2,feature_col3,feature_col4,feature_col5,feature_col6,feature_col7,feature_col8,feature_col9,feature_col10,feature_col11,feature_col12,feature_col13,feature_col14,feature_col15,feature_col16,feature_col17,feature_col18,feature_col19,feature_col20,feature_col21,feature_col22,feature_col23,feature_col24,feature_col25,feature_col26,feature_col27,feature_col28,feature_col29,feature_col30,feature_col31,feature_col32,feature_col33,feature_col34,feature_col35,feature_col36,feature_col37,feature_col38,feature_col39,feature_col40,feature_col41,feature_col42,feature_col43,feature_col44,feature_col45,feature_col46,feature_col47,feature_col48,feature_col49,feature_col50,feature_col51,feature_col52,feature_col53,feature_col54,feature_col55,feature_col56,feature_col57,feature_col58,feature_col59
1,0,0,0,0,1,0,0,1,0,0,1,0,0,1,1,0,0,1,0,0,0,0,1,0,0,1,0,0,1,0,0,1,0,0,1,0,0,1,0,0,1,0,0,1,0,1,0,0,0,1,0,0,1,0,0,1,0,0,1
trainLabel.txt
1,4,feature_col0,feature_col1,feature_col2,feature_col3
1,1,1,0
test.txt
1,60,feature_col0,feature_col1,feature_col2,feature_col3,feature_col4,feature_col5,feature_col6,feature_col7,feature_col8,feature_col9,feature_col10,feature_col11,feature_col12,feature_col13,feature_col14,feature_col15,feature_col16,feature_col17,feature_col18,feature_col19,feature_col20,feature_col21,feature_col22,feature_col23,feature_col24,feature_col25,feature_col26,feature_col27,feature_col28,feature_col29,feature_col30,feature_col31,feature_col32,feature_col33,feature_col34,feature_col35,feature_col36,feature_col37,feature_col38,feature_col39,feature_col40,feature_col41,feature_col42,feature_col43,feature_col44,feature_col45,feature_col46,feature_col47,feature_col48,feature_col49,feature_col50,feature_col51,feature_col52,feature_col53,feature_col54,feature_col55,feature_col56,feature_col57,feature_col58,feature_col59
0,0,1,0,0,1,0,0,1,0,0,1,0,0,1,1,0,0,1,0,0,1,0,0,1,0,0,0,0,1,0,0,1,0,0,1,0,0,1,0,0,1,0,0,1,0,0,1,0,0,1,0,0,1,0,0,1,0,0,1
testLabel.txt
1,4,feature_col0,feature_col1,feature_col2,feature_col3
1,1,0,0
dpNum means feature_col
I want to feed in data like the rows of train.txt:
[1, 0, ..., 1] # a rank 1 tensor; this is a vector with shape [60]
and predict:
[1,0,0,1] # a rank 1 tensor; this is a vector with shape [4]
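For reference, files in this layout could be read roughly as follows. This is only a minimal sketch: it assumes the first two header fields are the number of rows and the number of columns (which the samples above suggest), and `load_matrix` is a hypothetical helper name, not a TensorFlow function.

```python
import csv
import io


def load_matrix(text):
    """Parse a header 'n_rows,n_cols,name0,...' followed by integer rows.

    Assumption: header fields 0 and 1 are the row count and column count.
    """
    reader = csv.reader(io.StringIO(text))
    header = next(reader)
    n_rows, n_cols = int(header[0]), int(header[1])
    rows = [[int(value) for value in row] for row in reader if row]
    # Fail loudly if the data does not match what the header announces.
    if len(rows) != n_rows or any(len(row) != n_cols for row in rows):
        raise ValueError("header does not match the data")
    return rows


# Contents of trainLabel.txt as shown above.
train_label_text = (
    "1,4,feature_col0,feature_col1,feature_col2,feature_col3\n"
    "1,1,1,0\n"
)
train_labels = load_matrix(train_label_text)
print(train_labels)
```

The same helper would apply to train.txt and test.txt, giving one list of 60 values per row.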
Answer 0 (score: 1)
From the tutorials page:
# Fit model.
classifier.fit(x=training_set.data,
y=training_set.target,
steps=2000)
I.e. you can access the targets through training_set.target, which should give you the label of each data point.
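To make the data/target pairing concrete, here is a small sketch that needs no TensorFlow. The `Dataset` namedtuple below is a hypothetical stand-in for whatever object the tutorial calls training_set, and the integer labels 0 and 1 are assumed purely for illustration; the measurements are two Iris rows.

```python
from collections import namedtuple

# Hypothetical stand-in for the tutorial's training_set object:
# `data` holds one feature vector per data point, `target` one label each.
Dataset = namedtuple("Dataset", ["data", "target"])

training_set = Dataset(
    data=[[5.1, 3.5, 1.4, 0.2], [5.7, 2.8, 4.1, 1.3]],
    target=[0, 1],  # assumed encoding, one label per row of `data`
)

# The i-th label belongs to the i-th feature vector.
for features, label in zip(training_set.data, training_set.target):
    print(features, "->", label)
```

The point is only that data and target have the same length and are aligned row by row.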
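Also, I'm not sure whether some of the terminology is tripping you up: you say the training dataset has 15,000 data points but only 1,000 labels, which (at least for the Iris dataset) doesn't make much sense, because I believe the whole dataset is labeled. Did you mean that you have 15,000 training samples and 1,000 test samples?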
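So, I'm not sure whether all of the following is already clear to you, but if not, hopefully it clears things up. Assume the Iris dataset looks like this (taken from Wikipedia):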
Sepal length  Sepal width  Petal length  Petal width  Species
5.1           3.5          1.4           0.2          I. setosa
4.9           3.0          1.4           0.2          I. setosa
4.7           3.2          1.3           0.2          I. setosa
....
5.1           2.5          3.0           1.1          I. versicolor
5.7           2.8          4.1           1.3          I. versicolor
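The following terminology is commonly used: the label (or target) of a data point is the last column (I. setosa or I. versicolor). Labels are usually encoded in some way, e.g. 0 for I. setosa and 1 for I. versicolor, or as hinted at in your question. There may be more than those two possible labels, though: the Iris dataset, for example, usually includes a third flower called I. virginica.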
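As a rough illustration of splitting rows into features and encoded labels, here is a minimal sketch. The particular integer assigned to each species is an assumption made for the example, not something fixed by the dataset or by TensorFlow.

```python
# A few Iris rows: four measurements followed by the species name.
rows = [
    (5.1, 3.5, 1.4, 0.2, "I. setosa"),
    (4.9, 3.0, 1.4, 0.2, "I. setosa"),
    (5.7, 2.8, 4.1, 1.3, "I. versicolor"),
]

# Assumed encoding: any consistent mapping from species to integers works.
label_ids = {"I. setosa": 0, "I. versicolor": 1, "I. virginica": 2}

# The first four columns are the features, the last column is the label.
features = [list(row[:4]) for row in rows]
labels = [label_ids[row[4]] for row in rows]

print(features)
print(labels)
```

The resulting features and labels lists line up index by index, which is the shape that x and y in a call like classifier.fit are expected to have.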