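I've started playing around with deep learning. The first thing I'm trying to build is a text classifier.
Basically, it should scan lines of text for patterns such as CPU names and tell me how likely it is that a given line is the one I'm looking for.
I tried this on PHP 7.1 with php-ml 7.0, and this is my code: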
<?php
require 'vendor/autoload.php';
use Phpml\Classification\SVC;
use Phpml\SupportVectorMachine\Kernel;
use Phpml\FeatureExtraction\TokenCountVectorizer;
use Phpml\Tokenization\WhitespaceTokenizer;
$vectorizer = new TokenCountVectorizer(new WhitespaceTokenizer());
$data = array_map('str_getcsv', file('csv/cpuRaw.csv'));
$samples = array();
$labels = array();
foreach ($data as $block) {
    $samples[] = $block[1];
    $labels[] = $block[0];
}
// Build the dictionary.
$vectorizer->fit($samples);
// Transform the provided text samples into a vectorized list.
$vectorizer->transform($samples);
$classifier = new SVC(
    Kernel::LINEAR, // $kernel
    1.0,            // $cost
    3,              // $degree
    null,           // $gamma
    0.0,            // $coef0
    0.001,          // $tolerance
    100,            // $cacheSize
    true,           // $shrinking
    true            // $probabilityEstimates, set to true
);
$classifier->train($samples, $labels);
$test = ['This is a sentence.'];
$vectorizer->fit($test);
$vectorizer->transform($test);
var_dump($classifier->predictProbability($test));
?>
The CSV contains the following:
"CPU","Intel Xeon E3-1270V3"
"CPU","Intel Core i7-930"
"CPU","Intel Core i7-950"
"CPU","Intel Core i7-980x"
"CPU","Intel Xeon E3-1271V3"
"CPU","Intel Core i7-975"
"CPU","Intel Core i7-965"
"CPU","Intel Xeon E3-1275"
"CPU","Intel Core i7-980"
"CPU","Intel Core i7-990x"
"CPU","Intel Core i7-960x"
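No matter what I give it, it always returns a probability of 1, which I take to mean 100%. That can't be right; it should only return that for specific lines.
Can someone tell me what I'm doing wrong, and why?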
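
In case it helps narrow things down, here is a minimal sketch of a sanity check on the training data. It is separate from the script above: a few rows from the CSV are inlined so it runs on its own, and `$lines`, `$rows` and `$distinctLabels` are names made up for this example. It only lists the distinct labels the classifier would be trained on.

```php
<?php
// A few rows copied from the CSV above, inlined so the snippet runs on its own.
$lines = [
    '"CPU","Intel Xeon E3-1270V3"',
    '"CPU","Intel Core i7-930"',
    '"CPU","Intel Core i7-950"',
];

// Parse each line the same way the main script does.
$rows = array_map('str_getcsv', $lines);

// Column 0 holds the label, column 1 holds the text sample.
$labels = array_column($rows, 0);
$distinctLabels = array_values(array_unique($labels));

// Show which classes the classifier would actually see during training.
var_dump($distinctLabels);
```

Every row shown in the CSV carries the label "CPU", so for that data this lists a single class.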