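htmlspecialchars and making links clickable

Time: 2016-04-15 10:05:31

Tags: php htmlspecialchars

I have a PHP script that processes user input. I need to escape all special characters, but also make links clickable (convert them into <a> elements). What I need is: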

function specialCharsAndLinks($text) {
    // magic goes here
}
$inp = "http://web.page/index.php?a1=hi&a2=hello\n<script src=\"http://bad-website.com/exploit.js\"></script>";
$out = specialCharsAndLinks($inp);
echo $out;

The output should be (as HTML):

<a href="http://web.page/index.php?a1=hi&a2=hello">http://web.page/index.php?a1=hi&amp;a2=hello</a>
&lt;script src="http://bad-website.com/exploit.js"&gt;&lt;/script&gt;

Note that the ampersand in the link is kept as-is in the href attribute, but is converted to &amp; in the link's visible text.

When viewed in a browser:

http://web.page/index.php?a1=hi&a2=hello <script src="http://bad-website.com/exploit.js"></script>

2 answers:

Answer 0 (score: 0)

Try this:

$urlEscaped = htmlspecialchars("http://web.page/index.php?a1=hi&a2=hello");
// Concatenate: variables are not interpolated inside single-quoted strings.
$aTag = '<a href="' . $urlEscaped . '">Hello</a>';
echo $aTag;

Your example doesn't work because if you escape the whole HTML tag, the <a> tag is never processed by the browser; it is just displayed as plain text.

As you can see, Stack Overflow escapes our whole input (questions, answers, ...), so we can actually see the code instead of the browser processing it.
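
To make the difference concrete, here is a small sketch that contrasts escaping the whole tag with escaping only the URL and building the tag around it. The variable names ($asText, $asLink) are just illustrative, and using the URL as the link text is an assumption taken from the question rather than from the snippet above.

```php
<?php
$url = "http://web.page/index.php?a1=hi&a2=hello";

// Escaping the whole tag: the markup itself is turned into entities,
// so the browser displays it as literal text.
$asText = htmlspecialchars('<a href="' . $url . '">' . $url . '</a>');

// Escaping only the URL, then building the tag around it:
// the browser receives a real <a> element.
$escapedUrl = htmlspecialchars($url);
$asLink = '<a href="' . $escapedUrl . '">' . $escapedUrl . '</a>';

echo $asText, "\n";
echo $asLink, "\n";
```

The first echo contains no raw angle brackets at all, while the second is an actual link whose href and text both carry &amp; for the ampersand.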

Answer 1 (score: 0)

I eventually solved it with:

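// The URL pattern stops before escaped quotes and angle brackets (&quot;, &#039;, &lt;, &gt;),
// so the only entity a match can contain is &amp;, and htmlspecialchars_decode()
// restores just the ampersand inside the href.
function process_text($text) {
    $text = htmlspecialchars($text);
    $url_regex = "/(?:http|https|ftp|ftps)\:\/\/[a-zA-Z0-9\-\.]+(?:\/(?:(?!&(?:quot|lt|gt|#0?39);)\S)*)?/";
    $text = preg_replace_callback($url_regex, function($matches){
        return '<a href="'.htmlspecialchars_decode($matches[0]).'" rel="nofollow">'.$matches[0]."</a>";
    }, $text);
    return $text;
}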

The first line of the function body HTML-encodes the input.
The second line defines the URL regex. It could be improved, but it works for now.
The third line calls preg_replace_callback, which works like preg_replace, except that instead of a replacement string you pass a function that returns the replacement string.
The fourth line is the callback itself. htmlspecialchars_decode undoes htmlspecialchars for the href (so the link stays valid if it contained an ampersand), while the visible link text keeps the escaped form.
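
As a possible alternative, sketched here under my own assumptions rather than as part of the answer above: browsers decode &amp; inside attribute values, so the already-escaped match can be used as the href directly, with no decoding step. The function name linkify_escaped is hypothetical, and the URL pattern is an assumption as well; it stops at whitespace and at escaped quotes or angle brackets so that trailing markup is not swallowed into the link.

```php
<?php
// Hypothetical variant: escape once, then wrap URLs without decoding them again.
function linkify_escaped(string $text): string {
    $escaped = htmlspecialchars($text, ENT_QUOTES, 'UTF-8');
    // Scheme, host, then an optional path that ends at whitespace
    // or at an escaped quote / angle bracket.
    $urlRegex = '~(?:https?|ftps?)://[a-zA-Z0-9.-]+(?:/(?:(?!&(?:quot|lt|gt|#0?39);)\S)*)?~';
    return preg_replace_callback($urlRegex, function (array $m): string {
        // $m[0] is already HTML-escaped, so it is used unchanged
        // both as the attribute value and as the link text.
        return '<a href="' . $m[0] . '" rel="nofollow">' . $m[0] . '</a>';
    }, $escaped);
}

$inp = "http://web.page/index.php?a1=hi&a2=hello\n<script src=\"http://bad-website.com/exploit.js\"></script>";
echo linkify_escaped($inp), "\n";
```

With the input from the question, the script tag stays escaped text. The URL inside its src attribute is still turned into a link, because the pattern matches any URL it finds in the escaped text. The href here carries &amp; instead of a raw ampersand, which differs from the output requested in the question but resolves to the same address in a browser.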