Merging arrays and word frequency

Time: 2011-10-02 00:37:39

Tags: php arrays word-frequency

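I'm looping through a file that has 41 paragraphs. For each paragraph, I first split the string into an array of words and then get the word frequency for that paragraph. Then I want to combine the data from all the paragraphs and get the word frequency for the whole document.

I am able to get an array that gives me the "word" and "frequency" for a given paragraph, but I'm having trouble merging the results from each paragraph so that I get the word frequency for the whole document. This is what I have: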

function sectionWordFrequency($sectionFS)
{
    $section_frequency = array();
    $filename = $sectionFS . ".xml";
    $xmldoc = simplexml_load_file('../../editedtranscriptions/' . $filename);
    $xmldoc->registerXPathNamespace("tei", "http://www.tei-c.org/ns/1.0");
    $paraArray = $xmldoc->xpath("//tei:p");

    foreach ($paraArray as $p)
    {
        $para_frequency = (array_count_values(str_word_count(strtolower($p), 1)));
        $section_frequency[] = $para_frequency;
    }

    return array_merge($section_frequency);
}

/// now I call the function, sort it, and try to display it
$section_frequency = sectionWordFrequency($fs);
ksort($section_frequency);

foreach ($section_frequency as $word=>$frequency)
{
    echo $word . ": " . $frequency . "</br>";
}

Right now the result I'm getting is:

1: Array 2: Array 3: Array 4: Array

Any help is much appreciated.

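1 Answer:

Answer 0: (score: 0)

Inside your loop, the [] operator appends each paragraph's counts as a separate element, so the function returns an array of arrays, and echoing an array prints "Array". Try replacing this line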

$section_frequency[] = $para_frequency;

with this

$section_frequency = array_merge($section_frequency, $para_frequency);

and then

return $section_frequency;
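
One caveat worth checking with the array_merge() approach: for duplicate string keys, array_merge() keeps the later value rather than adding the two, so a word that appears in several paragraphs would probably end up with only its last paragraph's count. Below is a minimal sketch that sums the counts instead. The helper name addWordCounts and the sample paragraphs are assumptions made up for illustration (they are not in the original post), and the `??` operator and type declarations assume PHP 7 or later.

```php
<?php
// Hypothetical helper (not from the original post): adds one paragraph's
// word counts into a running total instead of overwriting existing entries.
function addWordCounts(array $totals, array $counts): array
{
    foreach ($counts as $word => $n) {
        // Start from 0 the first time a word is seen, then accumulate.
        $totals[$word] = ($totals[$word] ?? 0) + $n;
    }
    return $totals;
}

// Sample strings standing in for the TEI <p> elements (assumed data).
$paragraphs = array("The cat sat", "the dog sat down");

$section_frequency = array();
foreach ($paragraphs as $p) {
    // Same per-paragraph counting as in the question.
    $para_frequency = array_count_values(str_word_count(strtolower($p), 1));
    $section_frequency = addWordCounts($section_frequency, $para_frequency);
}

// Sort alphabetically by word, as in the question.
ksort($section_frequency);
print_r($section_frequency);
```

In sectionWordFrequency() this would mean calling the helper inside the foreach loop in place of the array_merge() line and still ending with return $section_frequency;.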