Support vector classification with php-ml does not work correctly

Time: 2019-01-20 22:53:59

Tags: php deep-learning classification text-classification

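I have started experimenting with deep learning. The first thing I tried to build is a text classifier.

Basically, it should scan lines of text for patterns such as CPU names and tell me how likely it is that a given line is the one I am looking for.

I tried this with php-ml 7.0 on PHP 7.1, and this is my code: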

<?php

require 'vendor/autoload.php';

use Phpml\Classification\SVC;
use Phpml\SupportVectorMachine\Kernel;
use Phpml\FeatureExtraction\TokenCountVectorizer;
use Phpml\Tokenization\WhitespaceTokenizer;

$vectorizer = new TokenCountVectorizer(new WhitespaceTokenizer());

$data = array_map('str_getcsv', file('csv/cpuRaw.csv'));
$samples = array();
$labels = array();

foreach ($data as $block) {
  $samples[] = $block[1];
  $labels[] = $block[0];
}

// Build the dictionary.
$vectorizer->fit($samples);

// Transform the provided text samples into a vectorized list.
$vectorizer->transform($samples);

$classifier = new SVC(
    Kernel::LINEAR, // $kernel
    1.0,            // $cost
    3,              // $degree
    null,           // $gamma
    0.0,            // $coef0
    0.001,          // $tolerance
    100,            // $cacheSize
    true,           // $shrinking
    true            // $probabilityEstimates, set to true
);

$classifier->train($samples, $labels);

$test = ['This is a sentence.'];
$vectorizer->fit($test);
$vectorizer->transform($test);

var_dump($classifier->predictProbability($test));

?>

The CSV contains the following:

"CPU","Intel Xeon E3-1270V3"
"CPU","Intel Core i7-930"
"CPU","Intel Core i7-950"
"CPU","Intel Core i7-980x"
"CPU","Intel Xeon E3-1271V3"
"CPU","Intel Core i7-975"
"CPU","Intel Core i7-965"
"CPU","Intel Xeon E3-1275"
"CPU","Intel Core i7-980"
"CPU","Intel Core i7-990x"
"CPU","Intel Core i7-960x"

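No matter what text I pass in, I always get a probability of 1, which I take to mean 100%. That cannot be right; it should only return that for specific lines.

Can someone tell me whether I am doing something wrong, and if so, why?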
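
For anyone reading later, here is a minimal sketch of two things that look suspicious in the code above. It is a guess at the cause, not a confirmed fix. First, every row in the CSV carries the same label, "CPU", so the classifier has no second class to compare against and a probability of 1 for "CPU" is the only answer it can give. Second, calling fit() on the test data changes the vectorizer's vocabulary after training, so the test vectors no longer line up with the columns the model was trained on. The "OTHER" label, its sample lines, and the first test line are invented for illustration and do not come from the original data.

```php
<?php

require 'vendor/autoload.php';

use Phpml\Classification\SVC;
use Phpml\SupportVectorMachine\Kernel;
use Phpml\FeatureExtraction\TokenCountVectorizer;
use Phpml\Tokenization\WhitespaceTokenizer;

// Training data with two classes. The "CPU" lines come from the CSV above;
// the "OTHER" lines are hypothetical negative examples made up for this sketch.
$samples = [
    'Intel Xeon E3-1270V3',
    'Intel Core i7-930',
    'Intel Core i7-950',
    'Intel Xeon E3-1275',
    'Please restart the server tonight',
    'The invoice was sent on Monday',
    'This line contains no hardware at all',
    'Shipping takes three to five days',
];
$labels = ['CPU', 'CPU', 'CPU', 'CPU', 'OTHER', 'OTHER', 'OTHER', 'OTHER'];

$vectorizer = new TokenCountVectorizer(new WhitespaceTokenizer());

// Build the vocabulary from the training samples only.
$vectorizer->fit($samples);

// transform() works by reference: $samples now holds token-count vectors.
$vectorizer->transform($samples);

$classifier = new SVC(
    Kernel::LINEAR, // $kernel
    1.0,            // $cost
    3,              // $degree
    null,           // $gamma
    0.0,            // $coef0
    0.001,          // $tolerance
    100,            // $cacheSize
    true,           // $shrinking
    true            // $probabilityEstimates
);
$classifier->train($samples, $labels);

// Hypothetical test lines. Only transform() is called here, so the
// vocabulary (and therefore the vector layout) stays the one used in training.
$test = ['Intel Core i7-970', 'This is a sentence.'];
$vectorizer->transform($test);

$probabilities = $classifier->predictProbability($test);
var_dump($probabilities);
```

With so few training lines the probability estimates are likely to be noisy, so treat the numbers as a sanity check rather than a calibrated score. The point of the sketch is only that the output can differ between lines once a second class exists and the vocabulary is fixed at training time.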

0 answers:

No answers