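How can I separate glued strings (by inserting a space), using the keys in an array to check whether a string is glued?
Glued: sisteralannis, goodplace (to be replaced with: sister alannis, good place)
Note: both begin with an existing key from the array (sister, good), but they are not exact matches for those keys, so no replacement happens. I therefore need to split them so that the replacement can take place in the next step of the script. Another solution would be to remove everything that is not exactly equal to a key in $myWords.
This is the code I use to replace the strings. I would like to improve it with code that checks whether strings are glued and inserts a space to separate them: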
$myVar = "my sisteralannis is not that blonde, here is a goodplace";
$myWords=array(
array("is","é"),
array("on","no"),
array("that","aquela"),
array("sister","irmã"),
array("my","minha"),
array("myth","mito"),
array("he","ele"),
array("good","bom"),
array("ace","perito")
);
usort($myWords,function($a,$b){return mb_strlen($b[0])<=>mb_strlen($a[0]);}); // sort subarrays by first column multibyte length
// remove mb_ if first column holds no multi-byte characters. strlen() is much faster.
foreach($myWords as &$words){
$words[0]='/\b'.$words[0].'\b/ui'; // generate patterns using search word, word boundaries, and case-insensitivity
}
$myVar=preg_replace(array_column($myWords,0),array_column($myWords,1),$myVar);
//APPLY SECOND SOLUTION HERE
echo $myVar;
Expected output: minha irmã alannis é not aquela blonde, here é a bom place
=================
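The second solution is simpler: match $myVar against $myWords and remove anything that does not exist in $myWords.
Every string in the variable that is not found in the array gets removed!
Output: minha é aquela, é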
Answer 0 (score: 1)
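I won't claim to be 100% sure that this handles every possible case, but it does work on your input string, and I built it to accommodate words with a capitalized first letter. Beyond that, there may be some edge cases that need a little adjustment.
There are some inline comments to help explain the code.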
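The filtering step in the code below relies on the PCRE backtracking verbs (*SKIP)(*FAIL). As a minimal sketch of that idiom on its own, here it is with a hypothetical two-word list ($keep) and a made-up sample sentence, neither of which comes from the original code:

```php
<?php
// Hypothetical mini word list, for illustration only.
$keep = ['good', 'is'];

// Left branch: a whole word from the list is matched, then (*SKIP)(*FAIL)
// throws that match away and makes the engine resume after it.
// Right branch: any other run of letters is matched and therefore removed.
$pattern = '/\b(?>' . implode('|', $keep) . ')\b(*SKIP)(*FAIL)|[a-z]+/i';

// "this", "a" and "goodplace" are removed; "is" and the standalone "good" survive.
$result = preg_replace($pattern, '', 'this is a goodplace, good');
var_dump($result);
```

Note that "goodplace" is removed even though it starts with a listed word, because the trailing \b fails and the second branch then consumes the whole run of letters. The leftover spaces are not cleaned up here; the full code handles them with the optional leading space in each branch.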
Code: (Demo)
$myVar = "My sisteralannis is not that blonde, here is a goodplace";
$myWords=[["is","é"],["on","no"],["that","aquela"],["sister","irmã"],["my","minha"],
["myth","mito"],["he","ele"],["good","bom"],["ace","perito"]];
usort($myWords,function($a,$b){return strlen($b[0])<=>strlen($a[0]);}); // longer English words before shorter
$search=array_column($myWords,0); // cache for multiple future uses
//input:  "My sisteralannis is not that blonde, here is a goodplace";
//filter:  ++ ------------- ++ --- ++++ ------  ---- ++ - ---------
//output:  Minha            é      aquela     ,      é
$disqualifying_pattern='/ ?\b(?>'.implode('|',$search).')\b(*SKIP)(*FAIL)| ?[a-z]+/i'; // this handles the spaces for the sample input, might not work for all cases
//echo $disqualifying_pattern,"\n";
$filtered=preg_replace($disqualifying_pattern,'',$myVar);
//echo $filtered,"\n";
$patterns=array_map(function($v){return '/\b'.$v.'\b/i';},$search);
$replace=array_column($myWords,1);
echo preg_replace_callback(
$patterns,
function($m)use($patterns,$replace){
$new=preg_replace($patterns,$replace,$m[0],1); // tell it to stop after replacing once
if(ctype_upper($m[0][0])){ // if first letter of English word is uppercase
$mb_ucfirst=mb_strtoupper(mb_substr($new,0,1)); // target and make upper, first letter of Portuguese word
return $mb_ucfirst.mb_substr($new, 1); // apply new uppercase letter to the rest of the Portuguese word
}
return $new;
},
$filtered
);
Output:
Minha é aquela, é
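The code above covers only the second approach from the question (removing everything that is not a key). For the first approach, splitting a glued word that starts with a known key, the following is a rough sketch rather than a tested solution. The helper name splitGluedWords and its two thresholds ($minKeyLength, $minRest) are assumptions introduced here. They exist to keep short keys such as "he" from splitting ordinary words such as "here".

```php
<?php
// Hypothetical helper, not part of the original answer.
function splitGluedWords(string $text, array $keys, int $minKeyLength = 4, int $minRest = 3): string
{
    // Ignore short keys such as "is" or "he"; they prefix too many ordinary words.
    $keys = array_filter($keys, function ($k) use ($minKeyLength) {
        return strlen($k) >= $minKeyLength;
    });
    if (!$keys) {
        return $text;
    }
    // Longest keys first, so a longer key wins over a shorter key it starts with.
    usort($keys, function ($a, $b) { return strlen($b) <=> strlen($a); });
    $alternation = implode('|', array_map(function ($k) {
        return preg_quote($k, '/');
    }, $keys));
    // A key at the start of a word, followed by at least $minRest more letters.
    $pattern = '/\b(' . $alternation . ')(?=[a-z]{' . $minRest . ',})/i';
    return preg_replace($pattern, '$1 ', $text);
}

$myVar = "my sisteralannis is not that blonde, here is a goodplace";
$keys  = ["is", "on", "that", "sister", "my", "myth", "he", "good", "ace"];
echo splitGluedWords($myVar, $keys), "\n";
// prints: my sister alannis is not that blonde, here is a good place
```

Running this before the word-boundary replacement from the question turns "sister" and "good" into standalone words that the replacement can then match. It is only a heuristic: without a dictionary it cannot tell a glued pair from a real word, so "goodness" would become "good ness".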