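I have an idea for a fully dynamic keywords script. It would let me simply write content for my site, or let people post on my site, and have the keywords generated automatically from whatever was posted. I have explored the approach below, but I'm not sure how to move forward. Any help is greatly appreciated.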
<?php
$content = "everything inside the body of the page";
$common = array(' a ', ' the ', ' I ');
$replaced = str_replace($common, ' ', strip_tags($content));
$array = str_word_count($replaced, 1);
$count = array_count_values( $array );
?>
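This code takes the content of the page, strips the HTML tags from it, and builds an array of all the words, where each word maps to the number of times it is used on the page.
How can I filter this array down to the words that are used more than X times?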
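Edit: Thanks to Jan for their solution, which was very helpful for what I needed to do. I changed it slightly in the end (don't hate me too much, but I merged it into one line to save space):
if ( isset($page['content']) and $page['content'] != ' ' ) {
$keywords = ''; // initialize so strpos() and .= below do not hit an undefined variable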
foreach ( array_count_values(str_word_count(str_replace(array('nbsp', ' nbsp ', ' something ', ' that ', ' does ', 'that', ' that ', ' have ',' with', ' this ', ' from ', ' they ', ' will ', ' would ', ' there ', ' their ', ' what ', ' about ', ' which ', ' when ', ' make ', ' like ', ' time ', ' just ', ' know ', ' take ', ' person ', ' into ', ' year ', ' your ', ' good ', ' some ', ' could ', ' them ', ' other ', ' than ', ' then ', ' look ', ' only ', ' come ', ' over ', ' think ', ' also ', ' back ', ' after ', ' work ', ' first ', ' well ', ' even ', ' want ', ' because ', ' these ', ' give ', ' most '), ' ', strip_tags($page['content'])), 1)) as $keyword => $frequency ) {
if ( $frequency >= 3 and strlen($keyword) >= 4 and strlen($keyword) <= 10 and strpos($keywords, $keyword) === false ) {
$keywords .= strtolower($keyword).', ';
}
}
echo '<meta name="keywords" content="'.trim($keywords, ", ").'"/>';
}
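Answer 0 (score: 0)
Consider filtering by adding the words to a new array. Each time you are about to add a word from the old array, check whether it already exists in the new array, and use an if statement to skip it if it does.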
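A minimal sketch of that idea, assuming $count is the word => frequency array from the question. The sample data, the $minCount threshold, and the lowercasing step are illustrative assumptions, not part of the original answer:

```php
<?php
// Assumed sample data standing in for the question's $count array.
$count = array('test' => 4, 'inside' => 2, 'Test' => 3, 'page' => 1);
$minCount = 3;        // hypothetical threshold
$keywords = array();  // the new, filtered array

foreach ($count as $word => $frequency) {
    $word = strtolower($word);
    // Skip rare words and words that are already in the new array.
    if ($frequency >= $minCount && !in_array($word, $keywords, true)) {
        $keywords[] = $word;
    }
}

echo implode(', ', $keywords), "\n"; // prints "test"
```

Lowercasing before the in_array() check is what stops "Test" and "test" from both ending up in the list.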
Answer 1 (score: 0)
Add
arsort($count);
and then take as many of the keys with the highest counts as you need.
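As a rough sketch of how that could look, assuming $count is the word => frequency array from the question. The sample data and the $topN limit are made up for illustration:

```php
<?php
// Assumed sample data standing in for the question's $count array.
$count = array('page' => 1, 'test' => 4, 'inside' => 2, 'body' => 1);
$topN = 2; // hypothetical number of keywords to keep

arsort($count); // sort by frequency, highest first, keeping the word keys
$keywords = array_keys(array_slice($count, 0, $topN, true));

echo implode(', ', $keywords), "\n"; // prints "test, inside"
```

This keeps a fixed number of keywords rather than applying a minimum count, so it always returns something even for short pages.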
Answer 2 (score: -1)
You can iterate over the array and test the value of each key; if it is high enough, add the key as a value to a new array.
$min_count = 1; // Number of times the word should be found inside the content to be considered as a keyword
$keywords = array();
foreach ( $count as $keyword => $value ) {
if ( $value >= $min_count ) {
$keywords[] = $keyword;
}
}
$keywords now holds the words you are interested in.
Answer 3 (score: -1)
For example:
<?php
// exclude words appearing more than this many times
$limit = 3;
// exclude these words
$wordsToExclude = array('a', 'the');
// the content
$content = "everything inside the body of the page a a a test test test test don't feed the elephants inside";
// better way of splitting into words - http://stackoverflow.com/questions/790596/split-a-text-into-single-words
$words = preg_split('/((^\p{P}+)|(\p{P}*\s+\p{P}*)|(\p{P}+$))/', $content, -1, PREG_SPLIT_NO_EMPTY);
// count how many times each word appears. this will create an array with words as the keys, and counts as the values
$uniqueWords = array_count_values($words);
foreach($uniqueWords as $word => $count)
{
// remove excluded words, and words appearing more times than the limit
if (in_array($word, $wordsToExclude) || $count > $limit) {
unset($uniqueWords[$word]);
}
}
var_dump($uniqueWords);
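This prints:
array (size=8)
  'everything' => int 1
  'inside' => int 2
  'body' => int 1
  'of' => int 1
  'page' => int 1
  'don't' => int 1
  'feed' => int 1
  'elephants' => int 1
You can then use all of the remaining words (the keys of $uniqueWords), or use the counts as some form of weighting.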
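Both options could be sketched roughly as follows. This assumes array_keys() is an acceptable way to list the words (the original text does not show which call was meant) and uses each word's share of the total count as one possible weighting; the sample data is a shortened copy of the output above, and array_key_first() needs PHP 7.3 or later:

```php
<?php
// Assumed sample data: a shortened copy of the filtered word counts.
$uniqueWords = array('everything' => 1, 'inside' => 2, 'body' => 1, 'of' => 1);

// Option 1: use every remaining word as a keyword.
$allKeywords = array_keys($uniqueWords);

// Option 2: weight each word by its share of the total count.
$total = array_sum($uniqueWords);
$weights = array();
foreach ($uniqueWords as $word => $count) {
    $weights[$word] = $count / $total;
}
arsort($weights); // heaviest words first

echo implode(', ', $allKeywords), "\n";   // prints "everything, inside, body, of"
echo array_key_first($weights), "\n";     // prints "inside"
```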