I have this simple code:
function getCleanText($rawText) //removes doublespace and punctuation
{
return strtolower(preg_replace("/[\s\t]+/u", " ",
preg_replace("/[^a-zA-Z1-9àèéìòù]+/u", " ", $rawText)));
}
echo getCleanText("uscì"). " uscì <br>";
The function just removes punctuation and double spaces. Why do I get this output?
usc�� uscì
I mean, "uscì" doesn't contain any punctuation, so the function should return it unchanged. Yet I have this problem with all accented letters. The web page is encoded in UTF-8. If I try utf8_encode like this:
return utf8_encode(strtolower(preg_replace("/[\s\t]+/u", " ",
preg_replace("/[^a-zA-Z1-9àèéìòù]+/u", " ", $rawText))));
The output is:
usc㬠uscì
Any ideas? Where can I find documentation that would help me understand my error?
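Answer 0 (score: 1)
Using mb_strtolower instead of just strtolower solved the problem in my tests. I think this is a php.ini configuration issue, which means it works for some people and not for others.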
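A minimal sketch of that suggestion, assuming the mbstring extension is loaded and the source file itself is saved as UTF-8. The name getCleanTextMb is hypothetical, and the character class is copied unchanged from the question. A likely cause of the broken output is that strtolower works byte by byte: in PHP versions where it honors the current locale, a single-byte locale can change one byte of a two-byte UTF-8 character such as "ì", leaving an invalid sequence. Passing the encoding to mb_strtolower explicitly avoids depending on the configured default.

```php
<?php
// Sketch only: same cleaning steps as getCleanText() in the question,
// but with a multibyte-aware lowercase call. The function name is hypothetical.
function getCleanTextMb(string $rawText): string
{
    // Replace every run of characters outside the allowed set with one space
    // (character class taken as-is from the question).
    $noPunctuation = preg_replace('/[^a-zA-Z1-9àèéìòù]+/u', ' ', $rawText);

    // Collapse any run of whitespace into a single space.
    $singleSpaced = preg_replace('/\s+/u', ' ', $noPunctuation);

    // Lowercase per character rather than per byte; the encoding is given
    // explicitly so the result does not rely on the default setting.
    return mb_strtolower($singleSpaced, 'UTF-8');
}

echo getCleanTextMb("uscì"), "\n"; // prints "uscì"
```

Calling utf8_encode on the result is not needed here: that function converts from ISO-8859-1 to UTF-8, so applying it to text that is already UTF-8 encodes it a second time.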