PHP arrays not merging

Time: 2017-06-24 12:26:27

Tags: php arrays sorting ranking array-merge

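I have this code that performs a word ranking from a text file. It opens the file and outputs an array showing how many times each word occurs in the file. That part works well, but in the second part the code should go through every text file in a given folder and output how many times each word occurs in total across all files. The problem is that the output array is not a merged total. There are duplicates. For example, I get -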

the -- 2
quick -- 1
brown -- 1
fox -- 1
jumped -- 1
over -- 1
lazy -- 1
dog -- 1
dog -- 2
a -- 2
lazy -- 1
fox -- 1
cannot -- 1
catch -- 1
fast -- 1
the -- 1
may -- 1
be -- 1

instead of -

the -- 3
dog -- 3
fox -- 2
lazy -- 2
a -- 2
quick -- 1
brown -- 1
jumped -- 1
over -- 1
very -- 1
cannot -- 1
catch -- 1
fast -- 1
may -- 1
be -- 1

Here is the entire code -

<?php
echo "<h3>Word Rank From One File</h3>";
$counted = strtolower(file_get_contents("docs/one.txt"));
$wordArray = preg_split('/[^a-z]/', $counted, -1, PREG_SPLIT_NO_EMPTY);
$wordFrequencyArray = array_count_values($wordArray);

/* Sort array from higher to lower, keeping keys */
arsort($wordFrequencyArray);

/* grab Top 10, huh sorted? */
$top10words = array_slice($wordFrequencyArray,0,10);

/* display them */
foreach ($top10words as $topWord => $frequency)
    echo "$topWord --  $frequency<br/>";

echo "<h3>Total From All Files</h3>";
$path = realpath('docs');
foreach(glob($path.'/*.*') as $file) {
    $counted = strtolower(file_get_contents($file));
    $wordArray = preg_split('/[^a-z]/', $counted, -1, PREG_SPLIT_NO_EMPTY);
    $wordFrequencyArray = array_count_values($wordArray);
    $combine = array_merge($wordFrequencyArray);
    /* Sort array from higher to lower, keeping keys */
    arsort($wordFrequencyArray);

    /* grab Top 10, huh sorted? */
    $top10words = array_slice($wordFrequencyArray,0,10);

    /* display them */
    foreach ($top10words as $topWord => $frequency)
        echo "$topWord --  $frequency<br/>";
    }

?>

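What am I doing wrong, or not doing? The two sample text files contain:

The quick brown fox jumped over the lazy dog. The dog that the fox jumped over ran so fast.

A lazy fox cannot catch a fast dog. The dog may be fast.

I have also noticed that some words are being ignored.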

1 answer:

Answer 0 (score: 1)

You have to collect the words from all the files first, and only then count their frequencies.

$path = realpath('docs'); // same folder as in the question
$wordArrayTotal = [];
foreach (glob($path.'/*.*') as $file) {
    $counted = strtolower(file_get_contents($file));
    $wordArray = preg_split('/[^a-z]/', $counted, -1, PREG_SPLIT_NO_EMPTY);
    $wordArrayTotal = array_merge($wordArrayTotal, $wordArray);
}

$wordFrequencyArray = array_count_values($wordArrayTotal);

/* Sort array from higher to lower, keeping keys */
arsort($wordFrequencyArray);

/* grab Top 10, huh sorted? */
$top10words = array_slice($wordFrequencyArray, 0, 10);

/* display them */
foreach ($top10words as $topWord => $frequency) {
    echo "$topWord --  $frequency<br/>";
}
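
As a side note, calling array_merge() with a single array just returns that array, and merging two count arrays with string keys overwrites the earlier value instead of adding to it, so array_merge() cannot sum per-file counts. An alternative to merging the word lists is to keep the per-file counts and add them up key by key. The following is only a minimal sketch of that idea: the helpers countWords() and addWordCounts() are hypothetical names that do not come from the question or the answer, and two inline sample strings stand in for the contents of the files in docs/.

```php
<?php
// Hypothetical helper: lower-cases the text and counts each a-z word.
function countWords(string $text): array
{
    $words = preg_split('/[^a-z]+/', strtolower($text), -1, PREG_SPLIT_NO_EMPTY);
    return array_count_values($words);
}

// Hypothetical helper: adds the counts in $counts to the running $totals,
// summing the values of words that were already seen.
function addWordCounts(array $totals, array $counts): array
{
    foreach ($counts as $word => $n) {
        $totals[$word] = ($totals[$word] ?? 0) + $n;
    }
    return $totals;
}

// Assumption: inline sample strings replace file_get_contents($file).
$texts = [
    'The quick brown fox jumped over the lazy dog.',
    'A lazy fox cannot catch a fast dog.',
];

$totals = [];
foreach ($texts as $text) {
    $totals = addWordCounts($totals, countWords($text));
}

/* Sort from higher to lower, keeping keys */
arsort($totals);

foreach ($totals as $word => $n) {
    echo "$word -- $n<br/>";
}
```

This sketch prints every word. Applying array_slice($totals, 0, 10) after arsort() limits the output to the top 10, which is probably also why some words seem to be ignored in the original output: each file's list is cut to 10 entries before it is displayed.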