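I want to automatically generate a title from the most repeated word in a text, using PHP. Example: if the word "PHP" is repeated most often in the text, the title would be "The text is about PHP", and so on. I don't know what to do or where to start.
Can anyone help me?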
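Answer 0: (score: 3)
If I have to do your homework for you, I require full attribution in your paper, along with a link to this question.
I also require that you actually read, understand, and try running this code so that you understand it.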
//get all the text from the file
$text_from_file = file_get_contents("filename.txt");
//get all the words within that text
$words = str_word_count($text_from_file , 1);
//count up all the unique words within the array
$unique = array_count_values($words);
//sort by most to least frequent
arsort($unique); //arsort required to keep keys and values together
//since we don't know the key values here, we need to use foreach
foreach($unique as $key => $val) {
echo("The most common word is " . $key . " which occurs " . $val . " times");
break; //always break after the first echo
}
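To connect the word count back to the title the question asks for, here is a minimal sketch. The function name `titleFromText`, the empty-text message, and the lowercasing step are assumptions for illustration and are not part of the original answer; `array_key_first()` requires PHP 7.3 or later.

```php
<?php
// Hypothetical helper (not from the original answer): builds a title
// from the most frequent word in $text.
function titleFromText(string $text): string
{
    // Lowercase first so "PHP", "php" and "pHp" count as the same word.
    $words = str_word_count(strtolower($text), 1);
    if (count($words) === 0) {
        return 'The text is empty'; // assumed fallback wording
    }
    // Map each word to the number of times it occurs.
    $counts = array_count_values($words);
    // Highest count first; the keys (the words) stay attached to their counts.
    arsort($counts);
    // The first key is now the most frequent word.
    $top = array_key_first($counts);
    return 'The text is about ' . $top;
}

echo titleFromText('PHP foo Bar php foO pHp'), "\n"; // prints "The text is about php"
```

Because the text is lowercased before counting, the word appears in lowercase in the title; restoring the original capitalization would need extra bookkeeping.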
Answer 1: (score: 1)
<?php
function mostRepeated($string = false, $words_num = 5) {
$string = strtolower($string);
// extend this array
$omit_words = array('the', 'a', 'an', 'in', 'at', 'by', 'of', 'was', 'is', 'he', 'she');
$words = explode(' ', $string);
foreach($words as $k => $v) {
if(in_array($v, $omit_words)) unset($words[$k]); // $v is the current word
}
$count = array_count_values($words);
arsort($count);
$result = array();
foreach($count as $k => $v) {
$result[] = $k;
}
return array_slice($result, 0, $words_num); // keep only the top $words_num words
}
$text = 'PHP foo Bar php foO pHp';
$most_repeated_words_array = mostRepeated($text, 3);
print_r($most_repeated_words_array);
?>
Output:
Array
(
[0] => php
[1] => foo
[2] => bar
)
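Answer 2: (score: 0)
Using
print_r( array_count_values(str_word_count($text, 1)) );
will give you a count of every word. You can then sort the counts and pick the top one.
rsort
will give you the array sorted from high to low. Note that rsort discards the keys (here, the words themselves); arsort sorts the same way but keeps each word attached to its count.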
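As a small illustration of the sorting step, the sketch below compares `rsort` with `arsort` on a word-count array. The sample text is borrowed from answer 1, and the lowercasing step and variable names are assumptions added for the example.

```php
<?php
$text = 'PHP foo Bar php foO pHp'; // sample input, borrowed from answer 1

// Count each word, ignoring case.
$counts = array_count_values(str_word_count(strtolower($text), 1));

$byRsort = $counts;
rsort($byRsort);   // values only: the keys are replaced by 0, 1, 2, ...

$byArsort = $counts;
arsort($byArsort); // same order, but each count keeps its word as the key

print_r($byRsort);  // counts 3, 2, 1 with numeric keys
print_r($byArsort); // php => 3, foo => 2, bar => 1
```

For picking the most frequent word, the `arsort` result is the useful one, since the `rsort` result no longer says which word each count belongs to.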