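How to highlight non-UTF-8 strings in UTF-8 text in PHP?

Time: 2018-10-05 08:53:16

Tags: php utf-8 preg-replace

I can't find a solution for highlighting matches in PHP while ignoring UTF-8 symbols (diacritics).

Code example: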

$text = "Lorem Ipsum – tas ir teksta salikums, kuru izmanto poligrāfijā un maketēšanas darbos. Lorem Ipsum ir kļuvis par vispārpieņemtu teksta aizvietotāju kopš 16. gadsimta sākuma. Tajā laikā kāds nezināms iespiedējs izveidoja teksta fragmentu, lai nodrukātu grāmatu ar burtu paraugiem.";
$keywordsNotWorking = ["poligrafija", "kops"];
$keywordsWorking = ["poligrāfijā", "kopš"];

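function highlightFoundText($text, $keywords, $tag = "b")
{
  // Wrap every word that contains one of the keywords in the given tag.
  foreach ($keywords as $key) {
    // Pass the delimiter to preg_quote() so a "/" in a keyword is escaped too.
    $text = preg_replace("/\p{L}*?".preg_quote($key, "/")."\p{L}*/ui", "<".$tag.">$0</".$tag.">", $text);
  }
  return $text;
}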

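If I use $keywordsWorking, everything works, but with $keywordsNotWorking no matches are found. Please help me find a solution: how can I ignore UTF-8 symbols (diacritics) when highlighting matches?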
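The mismatch can be reproduced in a few lines. This is a minimal sketch, assuming the source file is saved as UTF-8: the `u` and `i` modifiers make the pattern Unicode-aware and case-insensitive, but they do not treat "s" and "š" as the same letter.

```php
<?php
$text = "kopš 16. gadsimta sākuma";

// Keyword spelled with the diacritic: the pattern matches.
$withDiacritic = preg_match('/kopš/ui', $text);

// ASCII spelling: "s" (U+0073) and "š" (U+0161) are different code points,
// so the pattern does not match.
$withoutDiacritic = preg_match('/kops/ui', $text);

var_dump($withDiacritic);    // int(1)
var_dump($withoutDiacritic); // int(0)
```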

1 answer:

Answer 0: (score: 0)

In the end I came up with a working solution. I'm posting it as an answer in case someone else runs into the same problem.

class Highlighter
{
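    // Normalized copies (lowercased, diacritics replaced) used for searching.
    private $_text;
    private $_keywords;

    // Original input; the highlighted result is built from $text.
    private $keywords;
    private $text;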

    private $tag = "b";

    public function highlight($text, $keywords)
    {
        $this->text = $text;
        $this->keywords = (array) $keywords;

        // Count the array-cast copy: count() on a plain string fails in PHP 8.
        if(count($this->keywords) > 0)
        {
            $this->prepareString();
            $this->highlightStrings();
        }

        return $this->text;
    }

    private function unicodeSymbols()
    {
        return [
            'ā' => 'a',
            'č' => 'c',
            'ē' => 'e',
            'ī' => 'i',
            'ķ' => 'k',
            'ļ' => 'l',
            'ņ' => 'n',
            'š' => 's',
            'ū' => 'u',
            'ž' => 'z'
        ];
    }

    private function clearVars()
    {
        $this->_text = null;
        $this->_keywords = [];
    }

    private function prepareString()
    {
        $this->clearVars();

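        // Lowercase with mb_strtolower() first so uppercase accented letters
        // (e.g. "Š") are folded before the map is applied; strtolower() is not UTF-8 aware.
        $this->_text = strtr( mb_strtolower($this->text, 'UTF-8'), $this->unicodeSymbols() );

        foreach ($this->keywords as $keyword)
        {
            $this->_keywords[] = strtr( mb_strtolower($keyword, 'UTF-8'), $this->unicodeSymbols() );
        }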
    }

    private function highlightStrings()
    {
        foreach ($this->_keywords as $keyword)
        {

            if(strlen($keyword) === 0) continue;

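            // find cleared keyword in cleared text.
            // Use character offsets (mb_*), not byte offsets: the cleared text can
            // still contain multibyte characters such as "–", and mb_substr() below
            // expects a character position.
            $pos = mb_strpos($this->_text, $keyword, 0, 'UTF-8');

            if($pos !== false)
            {

                $keywordLength = mb_strlen($keyword, 'UTF-8');

                // find original keyword.
                $originalKeyword = mb_substr($this->text, $pos, $keywordLength, 'UTF-8');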

                // highlight in both texts.
                $this->text = str_replace($originalKeyword, "<{$this->tag}>".$originalKeyword."</{$this->tag}>", $this->text);
                $this->_text = str_replace($keyword, "<{$this->tag}>".$keyword."</{$this->tag}>", $this->_text);
            }

        }
    }

    public function setTag($tag)
    {
        $this->tag = $tag;
    }
}
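
An alternative that avoids keeping two copies of the text is to expand each base letter of a keyword into a character class containing its accented variant, so the regular expression matches the original text directly. The following is only a sketch, not the answer's method: the function name `highlightIgnoringDiacritics` is hypothetical, the letter map reuses the ten letters from `unicodeSymbols()` and would need extending for other alphabets, and the `mbstring` extension is assumed to be available.

```php
<?php
// Hypothetical helper: wraps every word containing an accent-insensitive
// match of a keyword in the given tag, in one pass over the original text.
function highlightIgnoringDiacritics(string $text, array $keywords, string $tag = "b"): string
{
    // Base letter => accented variant (same ten letters as the answer's map).
    $variants = [
        'a' => 'ā', 'c' => 'č', 'e' => 'ē', 'i' => 'ī', 'k' => 'ķ',
        'l' => 'ļ', 'n' => 'ņ', 's' => 'š', 'u' => 'ū', 'z' => 'ž',
    ];
    // Accented letter => base letter, used to normalize the keywords.
    $toBase = array_flip($variants);

    $patterns = [];
    foreach ($keywords as $keyword) {
        // Normalize so that "kopš" and "kops" produce the same pattern.
        $normalized = strtr(mb_strtolower((string) $keyword, 'UTF-8'), $toBase);
        if ($normalized === '') {
            continue;
        }
        $pattern = '';
        // Split into characters (not bytes) using the /u modifier.
        foreach (preg_split('//u', $normalized, -1, PREG_SPLIT_NO_EMPTY) as $char) {
            $pattern .= isset($variants[$char])
                ? '[' . $char . $variants[$char] . ']'   // e.g. "s" becomes "[sš]"
                : preg_quote($char, '/');
        }
        $patterns[] = $pattern;
    }

    if ($patterns === []) {
        return $text;
    }

    // A single alternation means inserted tags are never scanned again.
    $regex = '/\p{L}*?(?:' . implode('|', $patterns) . ')\p{L}*/ui';
    return preg_replace($regex, "<{$tag}>$0</{$tag}>", $text);
}

$sample = "kuru izmanto poligrāfijā un kopš 16. gadsimta";
echo highlightIgnoringDiacritics($sample, ["poligrafija", "kops"]), "\n";
// kuru izmanto <b>poligrāfijā</b> un <b>kopš</b> 16. gadsimta
```

Because the pattern runs against the original text, no offsets have to be translated between a normalized copy and the original, and the keywords may be given with or without diacritics.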