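"Division by zero" / NULL value with TF-IDF

Asked: 2019-01-02 11:36:37

Tags: php laravel machine-learning tf-idf

I'm trying out the tf-idf package from PHP-ML. I used the code from their documentation, but when I try it with different samples (strings), it keeps giving me "Division by zero".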

use Phpml\FeatureExtraction\TfIdfTransformer;

$samples = [
    ["Tareq", "Tareq", "Tareq"],
    ["Mohammad", "Ahmad", "Tareq"]
];

$transformer = new TfIdfTransformer($samples);
dd($transformer);

When I try the transform method with the example given in their documentation:

$samples = [
    [1, 2, 4],
    [0, 2, 1]
];

$transformer = new TfIdfTransformer($samples);
$transformer->fit($samples);
dd($transformer->transform($samples));

it gives me NULL.
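One possible reading of the NULL result, based only on the `transform(array &$samples): void` signature in the class listing quoted in this question: the method appears to modify the array it receives by reference and to return nothing, so dumping its return value would show NULL while the transformed values sit in `$samples`. The sketch below is a self-contained illustration of that behaviour; `scaleInPlace` and the hand-computed `$idf` array are hypothetical stand-ins and not part of PHP-ML.

```php
<?php

declare(strict_types=1);

// Hypothetical stand-in for a method declared as transform(array &$samples): void.
// It multiplies every feature by the weight of its column, in place.
function scaleInPlace(array &$samples, array $idf): void
{
    foreach ($samples as &$sample) {
        foreach ($sample as $index => &$feature) {
            $feature *= $idf[$index];
        }
        unset($feature); // break the reference before the next row
    }
    unset($sample);
}

$samples = [
    [1, 2, 4],
    [0, 2, 1],
];

// Weights computed as log10(number of samples / samples where the column is > 0),
// mirroring the formula in the posted fit() method.
$idf = [log(2 / 1, 10), log(2 / 2, 10), log(2 / 2, 10)];

// A void function yields null when its result is used.
$result = scaleInPlace($samples, $idf);

var_dump($result);  // NULL
print_r($samples);  // the array itself now holds the weighted values
```

If this reading is right, calling `$transformer->transform($samples);` on its own line and then `dd($samples);` should show the weighted values instead of NULL.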

TfIdfTransformer.php:

<?php

declare(strict_types=1);

namespace Phpml\FeatureExtraction;

use Phpml\Transformer;

class TfIdfTransformer implements Transformer
{
    /**
     * @var array
     */
    private $idf = [];

    public function __construct(array $samples = [])
    {
        if (count($samples) > 0) {
            $this->fit($samples);
        }
    }

    public function fit(array $samples, ?array $targets = null): void
    {
        $this->countTokensFrequency($samples);

        $count = count($samples);
        foreach ($this->idf as &$value) {
            $value = log((float) ($count / $value), 10.0);
        }
    }

    public function transform(array &$samples): void
    {
        foreach ($samples as &$sample) {
            foreach ($sample as $index => &$feature) {
                $feature *= $this->idf[$index];
            }
        }
    }

    private function countTokensFrequency(array $samples): void
    {
        $this->idf = array_fill_keys(array_keys($samples[0]), 0);

        foreach ($samples as $sample) {
            foreach ($sample as $index => $count) {
                if ($count > 0) {
                    ++$this->idf[$index];
                }
            }
        }
    }
}
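A possible explanation for the "Division by zero", read off the listing above: `countTokensFrequency()` only increments a column's counter when `$count > 0`, and `fit()` then divides by that counter. Under PHP 7 comparison semantics, a non-numeric string such as "Tareq" is cast to 0 in that comparison, so every counter would stay at 0 and `$count / $value` would divide by zero. This suggests the class expects numeric token counts per column, not raw strings. The sketch below reproduces the same calculation in a standalone function that rejects non-numeric input early and guards the zero case; `idfPerColumn` is a hypothetical helper, not PHP-ML API, and giving a never-seen column the weight 0.0 is an assumption made for illustration.

```php
<?php

declare(strict_types=1);

// Hypothetical helper, not part of PHP-ML: computes one IDF weight per column
// with the same formula as the posted fit(), but validates the input first.
function idfPerColumn(array $samples): array
{
    $documentFrequency = array_fill_keys(array_keys($samples[0]), 0);

    foreach ($samples as $sample) {
        foreach ($sample as $index => $count) {
            if (!is_int($count) && !is_float($count)) {
                throw new InvalidArgumentException(
                    "Expected a numeric count at column {$index}, got " . gettype($count)
                );
            }
            if ($count > 0) {
                ++$documentFrequency[$index];
            }
        }
    }

    $total = count($samples);
    $idf = [];
    foreach ($documentFrequency as $index => $value) {
        // Assumption: a column that never occurs gets weight 0.0
        // instead of triggering a division by zero.
        $idf[$index] = $value === 0 ? 0.0 : log($total / $value, 10);
    }

    return $idf;
}

// Numeric counts work; the first column appears in one of two samples.
$weights = idfPerColumn([[1, 2, 4], [0, 2, 1]]);

// Raw strings are rejected with a clear message instead of a late arithmetic error.
$message = null;
try {
    idfPerColumn([["Tareq", "Tareq", "Tareq"], ["Mohammad", "Ahmad", "Tareq"]]);
} catch (InvalidArgumentException $e) {
    $message = $e->getMessage();
}
```

With string documents, the usual approach would be to turn them into per-token counts first and pass those counts to the transformer; the validation above only makes the mismatch visible earlier.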

0 answers:

No answers yet