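I built an array from a file and then parse that file's contents. I already filter out words shorter than 4 characters with if(strlen($value) < 4): unset($content[$key]); endif;
My question is this: I want to remove common words from the array, but there are a lot of them. Rather than repeating these checks over and over on every array value, is there a more efficient way to do this?
Below is a sample of the code I'm currently using. The list could get very large, and I think there must be a better (more efficient) way?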
foreach ($content as $key=>$value) {
if(strlen($value) < 4): unset($content[$key]); endif;
if($value == 'that'): unset($content[$key]); endif;
if($value == 'have'): unset($content[$key]); endif;
if($value == 'with'): unset($content[$key]); endif;
if($value == 'this'): unset($content[$key]); endif;
if($value == 'your'): unset($content[$key]); endif;
if($value == 'will'): unset($content[$key]); endif;
if($value == 'they'): unset($content[$key]); endif;
if($value == 'from'): unset($content[$key]); endif;
if($value == 'when'): unset($content[$key]); endif;
if($value == 'then'): unset($content[$key]); endif;
if($value == 'than'): unset($content[$key]); endif;
if($value == 'into'): unset($content[$key]); endif;
}
Answer 0: (score: 2)
Maybe this would be better:
$filter = array("that", "have", "with" /* ... */);
foreach ($content as $key => $value) {
    if (in_array($value, $filter)) {
        unset($content[$key]);
    }
}
Answer 1: (score: 2)
Here's how I'd do it:
$excluded_words = array('that','have','with','this','your','will','they','from','when','then','than','into');
$replace = array_fill_keys($excluded_words, '');
echo str_replace(array_keys($replace), $replace, 'some words that have to be with this your will they have from when then that into replaced');
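How it works: create an array filled with empty strings whose keys are the substrings you want to remove/replace. Then just call str_replace, passing the keys as the first argument and the array itself as the second. The result in this case is: some words to be replaced (apart from the extra spaces left behind where words were removed). This code has been tested and works.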
When you're dealing with an array, just implode it with some wacky delimiter (like %@%@% or something), run str_replace on it, explode it again, and Bob's your uncle.
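A minimal sketch of that implode / str_replace / explode round trip, assuming the delimiter never occurs in the data. The sample $content and the final array_filter() step (which drops the slots emptied by the replacement) are additions for illustration only. Keep in mind that str_replace matches substrings, so a word such as "within" would lose its "with"; this only suits cases where partial matches are acceptable.

```php
<?php
$excluded_words = array('that', 'have', 'with', 'this');
$content = array('some', 'words', 'that', 'have', 'been', 'split');

$delimiter = '%@%@%'; // assumed never to appear in the data

// Join the array, blank out the excluded words, and split it back up.
$joined = implode($delimiter, $content);
$cleaned = str_replace($excluded_words, '', $joined);
$parts = explode($delimiter, $cleaned);

// Removed words leave empty strings behind; drop them and reindex.
$result = array_values(array_filter($parts, function ($word) {
    return $word !== '';
}));

print_r($result);
```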
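When it comes to replacing all words of up to 3 characters (which I forgot in my original answer), that's something regular expressions are good at... I'd say something like preg_replace('/(\b|[^a-z])[a-z]{1,3}(\b|[^a-z])/i','$1$2',implode(',',$targetArray));
or something along those lines. You'll want to test it, since this is off the top of my head and untested. But it should be enough to get you started.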
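As a hedged alternative to the untested pattern above, one could build a single whole-word pattern from the word list and let preg_replace handle both the listed words and the short ones. The sample sentence and the way $pattern is assembled are illustrative assumptions, not part of the original answer.

```php
<?php
$excluded_words = array('that', 'have', 'with', 'this', 'your', 'will');
$text = 'some words that have to be with this your will replaced';

// Escape each word for use inside a /.../ pattern.
$escaped = array_map(function ($word) {
    return preg_quote($word, '/');
}, $excluded_words);

// Match either a listed word or any 1-3 letter word, anchored on word
// boundaries so that longer words containing them are left alone.
$pattern = '/\b(?:' . implode('|', $escaped) . '|[a-z]{1,3})\b/i';

// Remove the matches, then split on whitespace to get the surviving words.
$stripped = preg_replace($pattern, '', $text);
$result = preg_split('/\s+/', trim($stripped), -1, PREG_SPLIT_NO_EMPTY);

print_r($result);
```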
Answer 2: (score: 1)
I'd probably do something like this:
$aCommonWords = array('that','have','with','this','yours','etc.....');
foreach($content as $key => $value){
if(in_array($value,$aCommonWords)){
unset($content[$key]);
}
}
Answer 3: (score: 1)
Create an array of the words you want to remove, and check whether the value is in that array:
$excluded_words = array('that','have','with','this','your','will','they','from','when','then','than','into');
and inside the foreach:
if (in_array($value, $excluded_words)) unset($content[$key]);
Answer 4: (score: 0)
Another possible solution:
$arr = array_flip(array( 'that', 'have', 'with', 'this', 'your', 'will',
'they', 'from', 'when', 'then', 'than', 'into' ));
foreach ($content as $key=>$value) {
if(strlen($value) < 4 || isset($arr[$value])) {
unset($content[$key]);
}
}
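The reason this version tends to scale better: in_array() scans the word list linearly for every value, whereas isset() on a flipped array is a hash lookup. Below is a sketch that wraps the idea in a function; the name remove_common_words and its parameters are hypothetical, chosen only for illustration.

```php
<?php
// Hypothetical helper: drops words shorter than $minLength and any word
// listed in $stopWords. Original keys are preserved.
function remove_common_words(array $content, array $stopWords, int $minLength = 4): array
{
    // Flip once so each stop word becomes a key; isset() is then a hash lookup.
    $lookup = array_flip($stopWords);

    foreach ($content as $key => $value) {
        if (strlen($value) < $minLength || isset($lookup[$value])) {
            unset($content[$key]);
        }
    }

    return $content;
}

$words = array('here', 'are', 'some', 'words', 'that', 'will', 'be', 'filtered');
$result = remove_common_words($words, array('that', 'will', 'here'));
print_r($result);
```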
Answer 5: (score: 0)
Use array_diff():
$content = array('here','are','some','words','that','will','be','filtered');
$filter = array('that','have','here','are','will','they','from','when','then');
$result = array_diff($content, $filter);
Result:
Array
(
[2] => some
[3] => words
[6] => be
[7] => filtered
)
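Or, if you want more flexibility in what gets filtered (for example, you mentioned needing to filter out words shorter than 4 characters), you can use array_filter():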
$result = array_filter($content, function($v) use ($filter) {
return !in_array($v, $filter) && strlen($v) >= 4;
});
Result:
Array
(
[2] => some
[3] => words
[7] => filtered
)
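Both results above keep the original keys, which is why the indexes have gaps. If consecutive keys are needed afterwards, a small sketch (reusing the same sample data) would be to pass the result through array_values():

```php
<?php
$content = array('here','are','some','words','that','will','be','filtered');
$filter = array('that','have','here','are','will','they','from','when','then');

// array_diff() preserves the keys of $content;
// array_values() renumbers them starting from 0.
$result = array_values(array_diff($content, $filter));

print_r($result);
```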
Answer 6: (score: 0)
$var = array('abb', 'bffb', 'cbbb', 'dddd', 'dddd', 'f', 'g');
$var= array_unique($var);
foreach($var as $val){
echo $val. " ";
}
The simplest way.