我的代码中有2个数组,如下所示:
<?php
$kalimat = "I just want to search something like visual odometry, dude";
$kata = array();
$eliminasi = " \n . ,;:-()?!";
$tokenizing = strtok($kalimat, $eliminasi);
while ($tokenizing !== false) {
$kata[] = $tokenizing;
$tokenizing = strtok($eliminasi);
}
$sumkata = count($kata);
print "<pre>";
print_r($kata);
print "</pre>";
//stop list
$file = fopen("stoplist.txt","r") or die("fail to open file");
$stoplist;
$i = 0;
while($row = fgets($file)){
$data = explode(",", $row);
$stoplist[$i] = $data;
$i++;
}
fclose($file);
$count = count($stoplist);
//Cange 2 dimention array become 1 dimention
for($i=0;$i<$count;$i++){
for($j=0; $j<1; $j++){
$stopword[$i] = $stoplist[$i][$j];
}
}
//Filtering process
$hasilfilter = array_diff($kata,$stopword);
var_dump($hasilfilter);
?>
&#13;
$ stopword包含http://xpo6.com/list-of-english-stop-words/
中附加的一些停用词我想做的就是:我想检查是否保存数组$ kata中存在的元素,并且它不存在于数组$ stopword
中所以我想删除数组$ kata和$ stopword中存在的所有元素。 我读了一些使用array_diff的建议,但不知怎的,它对我不起作用。真的需要你的帮助:(谢谢。
答案 0 :(得分:0)
array_diff
是你需要的,你是对的。以下是您尝试执行的简化版本:
<?php
// Your string $kalimat as an array of words, this already works in your example.
$kata = ['I', 'just', 'want', 'to', '...'];
// I can't test $stopword code, because I don't have your file.
// So let's say it's a array with the word 'just'
$stopword = ['just'];
// array_diff gives you what you want
var_dump(array_diff($kata,$stopword));
// It will display your array minus "just": ['I', 'want', 'to', '...']
您还应该仔细检查$stopword
的值,我无法测试此部分(没有您的文件)。如果它不适合你,我想问题是这个变量($stopword
)
答案 1 :(得分:0)
您的$stopword
数组存在问题。 var_dump它可以看到问题。array_diff
工作正常。
尝试使用我编写的代码使您的$stopword
数组正确:
<?php
$kalimat = "I just want to search something like visual odometry, dude";
$kata = array();
$eliminasi = " \n . ,;:-()?!";
$tokenizing = strtok($kalimat, $eliminasi);
while ($tokenizing !== false) {
$kata[] = $tokenizing;
$tokenizing = strtok($eliminasi);
}
$sumkata = count($kata);
print "<pre>";
print_r($kata);
print "</pre>";
//stop list
$file = fopen("stoplist.txt","r") or die("fail to open file");
$stoplist;
$i = 0;
while($row = fgets($file)){
$data = explode(",", $row);
$stoplist[$i] = $data;
$i++;
}
fclose($file);
$count = count($stoplist);
//Cange 2 dimention array become 1 dimention
$stopword= call_user_func_array('array_merge', $stoplist);
$new = array();
foreach($stopword as $st){
$new[] = explode(' ', $st);
}
$new2= call_user_func_array('array_merge', $new);
foreach($new2 as &$n){
$n = trim($n);
}
$new3 = array_unique($new2);
unset($stopword,$new,$new2);
$stopword = $new3;
unset($new3);
//Filtering process
$hasilfilter = array_diff($kata,$stopword);
print "<pre>";
var_dump($hasilfilter);
print "</pre>";
?>
我希望它有所帮助