How can I link every instance of a word in HTML using PHP?

Time: 2015-04-10 09:37:02

Tags: php html regex html-parsing

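I have an HTML string in which I want to find every instance of a particular word and automatically link it to a page. For example, find the word 'homepage' in the HTML string and link it to the site's home page.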

I found the following code snippet, which handles most of the logic:

http://aidanlister.com/2004/04/highlighting-a-search-string-in-html-text/

However, it doesn't seem to account for the following cases:

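  1. The word is inside an attribute of an HTML element (e.g. an img title attribute or the href attribute of an a tag). This breaks the code.
  2. The word is already inside a link, in which case it should not be processed (keep the link as is).

HTML string: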

    <h1>Hello, welcome to my site</h1>
    
    <p>This is my site, if you want to go back to the homepage, just <a href="http://www.example.com">click here</a>.</p>
    
    <a href="http://www.example.com" title="my homepage"><img src="/images/homepage.jpg" title="homepage screenshot" /></a>
    

PHP:

    <?php

    echo str_highlight($html, 'homepage', 'wholeword|striplinks', '<a href="http://www.example.com">Homepage</a>');

    ?>
    

Function:

    function str_highlight($text, $needle, $options = null, $highlight = null)
        {
            // Default highlighting
            if ($highlight === null) {
                $highlight = '<strong>\1</strong>';
            }
    
            // Select pattern to use
            if ($options & 'simple') {
                $pattern = '#(%s)#';
                $sl_pattern = '#(%s)#';
            } else {
                $pattern = '#(?!<.*?)(%s)(?![^<>]*?>)#';
                $sl_pattern = '#<a\s(?:.*?)>(%s)</a>#';
            }
    
            // Case sensitivity
            if (!($options & 'casesensitive')) {
                $pattern .= 'i';
                $sl_pattern .= 'i';
            }
    
            $needle = (array) $needle;
            foreach ($needle as $needle_s) {
                $needle_s = preg_quote($needle_s);
    
                // Escape needle with optional whole word check
                if ($options & 'wholeword') {
                    $needle_s = '\b' . $needle_s . '\b';
                }
    
                // Strip links
                if ($options & 'striplinks') {
                    $sl_regex = sprintf($sl_pattern, $needle_s);
                    $text = preg_replace($sl_regex, '\1', $text);
                }
    
                $regex = sprintf($pattern, $needle_s);
                $text = preg_replace($regex, $highlight, $text);
            }
    
            return $text;
        }
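
One thing worth checking in the function above: the option tests apply `&` to two strings. In PHP, `&` on two string operands is a bytewise AND that returns a string, not a flag test, so a check such as `$options & 'simple'` comes out truthy for the example call even though 'simple' was not requested. The sketch below contrasts that with integer flags. The `HL_*` constants and `hl_has_flag()` are hypothetical names chosen for illustration and are not part of the posted code.

```php
<?php
// Hypothetical flag names for this sketch; they are not part of the posted code.
const HL_SIMPLE        = 1;
const HL_WHOLEWORD     = 2;
const HL_CASESENSITIVE = 4;
const HL_STRIPLINKS    = 8;

// True only when every bit of $flag is set in $options.
function hl_has_flag(int $options, int $flag): bool
{
    return ($options & $flag) === $flag;
}

// String operands: & works byte by byte and returns a string,
// so this is truthy although 'simple' was never requested.
$stringCheck = 'wholeword|striplinks' & 'simple';
var_dump(is_string($stringCheck), (bool) $stringCheck); // bool(true) bool(true)

// Integer operands: each option can be tested independently.
$options = HL_WHOLEWORD | HL_STRIPLINKS;
var_dump(hl_has_flag($options, HL_SIMPLE));     // bool(false)
var_dump(hl_has_flag($options, HL_STRIPLINKS)); // bool(true)
```

With integer flags, a call would pass something like `HL_WHOLEWORD | HL_STRIPLINKS` instead of the string 'wholeword|striplinks'.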
    

1 Answer:

Answer 0 (score: 1)

Replace

    // Select pattern to use
    if ($options & 'simple') {
        $pattern = '#(%s)#';
        $sl_pattern = '#(%s)#';
    } else {
        $pattern = '#(?!<.*?)(%s)(?![^<>]*?>)#';
        $sl_pattern = '#<a\s(?:.*?)>(%s)</a>#';
    }

with

    if ($options & 'simple') {
        $pattern = '#(%s)#';
        $sl_pattern = '#(%s)#';
    }
    if ($options & 'html') {
        $pattern = '#(?!<.*?)(%s)(?![^<>]*?>)#';
        $sl_pattern = '#<a\s(?:.*?)>(%s)</a>#';
    }

and use it like this:

    str_highlight($html, 'homepage', 'html|wholeword|striplinks', '<a href="http://www.example.com">Homepage</a>');
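
As an alternative to the regex patterns, the two requirements in the question (skip attribute values, leave existing links alone) can also be approached with PHP's DOM extension, since attributes are never text nodes and text inside an a element can be filtered out with XPath. The following is a minimal sketch, not a drop-in replacement for str_highlight: `link_word()` is a hypothetical helper name, and it assumes UTF-8 input and that the DOM extension is available.

```php
<?php
/**
 * Wrap whole-word, case-insensitive matches of $word in a link to $url.
 * Only text nodes are touched, and text already inside <a>, <script>
 * or <style> is skipped. Hypothetical helper, for illustration only.
 */
function link_word(string $html, string $word, string $url): string
{
    $doc = new DOMDocument();
    $previous = libxml_use_internal_errors(true); // silence warnings on loose markup
    // The XML declaration prefix is a common way to make libxml read UTF-8.
    $doc->loadHTML('<?xml encoding="utf-8"?>' . $html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $body = $doc->getElementsByTagName('body')->item(0);
    if ($body === null) {
        return $html;
    }

    $pattern = '/\b(' . preg_quote($word, '/') . ')\b/iu';
    $xpath = new DOMXPath($doc);
    $nodes = $xpath->query(
        '//body//text()[not(ancestor::a) and not(ancestor::script) and not(ancestor::style)]'
    );

    foreach (iterator_to_array($nodes) as $node) {
        // With one capture group, odd indexes hold the matched words.
        $parts = preg_split($pattern, $node->nodeValue, -1, PREG_SPLIT_DELIM_CAPTURE);
        if ($parts === false || count($parts) < 2) {
            continue; // no match in this text node
        }
        $fragment = $doc->createDocumentFragment();
        foreach ($parts as $i => $part) {
            if ($i % 2 === 1) {
                $a = $doc->createElement('a');
                $a->setAttribute('href', $url);
                $a->appendChild($doc->createTextNode($part));
                $fragment->appendChild($a);
            } elseif ($part !== '') {
                $fragment->appendChild($doc->createTextNode($part));
            }
        }
        $node->parentNode->replaceChild($fragment, $node);
    }

    // Serialize only the children of <body>, dropping the wrapper libxml added.
    $out = '';
    foreach ($body->childNodes as $child) {
        $out .= $doc->saveHTML($child);
    }
    return $out;
}
```

It would be called as `echo link_word($html, 'homepage', 'http://www.example.com');`. Note that the output is re-serialized by libxml, so details such as self-closing slashes on img tags may differ from the input.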