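I want to filter input text if it contains a URL.
By URL I mean anything that corresponds to a valid internet address, for example www.example.com, example.com, http://www.example.com, or http://example.com/foo/bar.
I assume I have to use a regular expression with the preg_match function, so I need the right regex pattern.
I would be very grateful if someone could give me one.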
Answer 0 (score: 2)
This article has a good regular expression for matching URLs: http://daringfireball.net/2010/07/improved_regex_for_matching_urls
For PHP, you need to escape the regex correctly, for example:
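$text = "here is some text that contains a link to www.example.com, and it will be matched.";
// The pattern is in a double-quoted string, so the inner double quote is escaped as \".
// The trailing u modifier treats pattern and subject as UTF-8, so the non-ASCII
// quote characters in the last character class are matched as whole characters.
preg_match("/(?i)\b((?:[a-z][\w-]+:(?:\/{1,3}|[a-z0-9%])|www\d{0,3}[.]|[a-z0-9.\-]+[.][a-z]{2,4}\/)(?:[^\s()<>]+|\(([^\s()<>]+|(\([^\s()<>]+\)))*\))+(?:\(([^\s()<>]+|(\([^\s()<>]+\)))*\)|[^\s`!()\[\]{};:'\".,<>?«»“”‘’]))/u", $text, $matches);
// $matches[0] and $matches[1] both hold the matched URL, here "www.example.com"
var_dump($matches);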
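Since the question is about filtering rather than extracting, the same pattern can be wrapped in a small yes/no check. The following is only a sketch: the names `URL_PATTERN` and `contains_url` are made up for illustration, and the pattern is stored in a nowdoc so that no PHP string escaping is needed. Note that this pattern recognises a bare domain only when a path follows it (`example.com/foo`), so a plain `example.com` from the question is not detected.

```php
<?php
// Pattern from the linked article, with the "u" modifier added so the
// non-ASCII quote characters are handled as UTF-8 (an addition, not part
// of the article's pattern).
const URL_PATTERN = <<<'REGEX'
/(?i)\b((?:[a-z][\w-]+:(?:\/{1,3}|[a-z0-9%])|www\d{0,3}[.]|[a-z0-9.\-]+[.][a-z]{2,4}\/)(?:[^\s()<>]+|\(([^\s()<>]+|(\([^\s()<>]+\)))*\))+(?:\(([^\s()<>]+|(\([^\s()<>]+\)))*\)|[^\s`!()\[\]{};:'".,<>?«»“”‘’]))/u
REGEX;

// Hypothetical helper: true when $text contains at least one URL-like token.
function contains_url(string $text): bool
{
    // preg_match returns 1 on a match, 0 on no match, false on error.
    return preg_match(URL_PATTERN, $text) === 1;
}

var_dump(contains_url('a link to www.example.com, and more')); // bool(true)
var_dump(contains_url('see http://example.com/foo/bar'));      // bool(true)
var_dump(contains_url('just example.com here'));               // bool(false)
```

If bare domains such as `example.com` must also be rejected, the pattern would need an extra alternative for them, at the cost of more false positives on ordinary text like file names.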
Answer 1 (score: 1)
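$html = "http://www.scroogle.org
http://www.scroogle.org/
http://www.scroogle.org/index.html
http://www.scroogle.org/index.html?source=library
You can surf the internet anonymously at https://ssl.scroogle.org/cgi-bin/nbbwssl.cgi.";
// The <file> and <parameters> groups must end on a non-punctuation character,
// so the period that closes the last sentence is not captured as part of the URL.
preg_match_all('/\b((?P<protocol>https?|ftp):\/\/(?P<domain>[-A-Z0-9.]+)(?P<file>\/(?:[-A-Z0-9+&@#\/%=~_|!:,.;]*[-A-Z0-9+&@#\/%=~_|])?)?(?P<parameters>\?(?:[A-Z0-9+&@#\/%=~_|!:,.;]*[A-Z0-9+&@#\/%=~_|])?)?)/i', $html, $urls, PREG_PATTERN_ORDER);
// $urls[1] holds every full URL. Keep $urls as an array so it can still be looped over.
$first = $urls[1][0];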
Matches (in the last line only the URL itself is matched, not the surrounding sentence):
http://www.scroogle.org
http://www.scroogle.org/
http://www.scroogle.org/index.html
http://www.scroogle.org/index.html?source=library
You can surf the internet anonymously at https://ssl.scroogle.org/cgi-bin/nbbwssl.cgi.
To loop over the results, you can use:
for ($i = 0; $i < count($urls[0]); $i++) {
    echo $urls[1][$i]."\n";
}
This will output:
http://www.scroogle.org
http://www.scroogle.org/
http://www.scroogle.org/index.html
http://www.scroogle.org/index.html?source=library
https://ssl.scroogle.org/cgi-bin/nbbwssl.cgi
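Because the pattern uses named groups, the individual parts of each URL can also be read by name. The sketch below assumes a helper called `extract_url_parts` (a made-up name) and uses `PREG_SET_ORDER`, which returns one array per match instead of one array per group. With that flag, trailing groups that did not participate in a match are left out of the array, hence the `??` fallbacks.

```php
<?php
// Hypothetical helper: returns one associative array per URL found in $text.
function extract_url_parts(string $text): array
{
    // Pattern with the same named groups as in the answer above.
    $pattern = '/\b((?P<protocol>https?|ftp):\/\/(?P<domain>[-A-Z0-9.]+)(?P<file>\/[-A-Z0-9+&@#\/%=~_|!:,.;]*)?(?P<parameters>\?[A-Z0-9+&@#\/%=~_|!:,.;]*)?)/i';
    preg_match_all($pattern, $text, $matches, PREG_SET_ORDER);

    $parts = [];
    foreach ($matches as $m) {
        $parts[] = [
            'url'        => $m[1],
            'protocol'   => $m['protocol'],
            'domain'     => $m['domain'],
            // Optional groups may be missing from the match array.
            'file'       => $m['file'] ?? '',
            'parameters' => $m['parameters'] ?? '',
        ];
    }
    return $parts;
}

print_r(extract_url_parts(
    'Docs at http://www.scroogle.org/index.html?source=library and ftp://files.example.org/pub'
));
```

For the sample string this yields two entries: one with protocol `http`, domain `www.scroogle.org`, file `/index.html` and parameters `?source=library`, and one with protocol `ftp`, domain `files.example.org`, file `/pub` and empty parameters.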
Answer 2 (score: 1)
Found here: http://zenverse.net/php-function-to-auto-convert-url-into-hyperlink/
It is a function from WordPress.
function _make_url_clickable_cb($matches) {
    $ret = '';
    $url = $matches[2];
    if ( empty($url) )
        return $matches[0];
    // remove a trailing [.,;:] from the URL
    if ( in_array(substr($url, -1), array('.', ',', ';', ':')) === true ) {
        $ret = substr($url, -1);
        $url = substr($url, 0, strlen($url)-1);
    }
    return $matches[1] . "<a href=\"$url\" rel=\"nofollow\">$url</a>" . $ret;
}
function _make_web_ftp_clickable_cb($matches) {
    $ret = '';
    $dest = $matches[2];
    // check for an empty match before adding the scheme; once 'http://' is
    // prepended, $dest can never be empty
    if ( empty($dest) )
        return $matches[0];
    $dest = 'http://' . $dest;
    // remove a trailing [.,;:] from the URL
    if ( in_array(substr($dest, -1), array('.', ',', ';', ':')) === true ) {
        $ret = substr($dest, -1);
        $dest = substr($dest, 0, strlen($dest)-1);
    }
    return $matches[1] . "<a href=\"$dest\" rel=\"nofollow\">$dest</a>" . $ret;
}
function _make_email_clickable_cb($matches) {
    $email = $matches[2] . '@' . $matches[3];
    return $matches[1] . "<a href=\"mailto:$email\">$email</a>";
}
function make_clickable($ret) {
    $ret = ' ' . $ret;
    // in testing, using arrays here was found to be faster
    $ret = preg_replace_callback('#([\s>])([\w]+?://[\w\\x80-\\xff\#$%&~/.\-;:=,?@\[\]+]*)#is', '_make_url_clickable_cb', $ret);
    $ret = preg_replace_callback('#([\s>])((www|ftp)\.[\w\\x80-\\xff\#$%&~/.\-;:=,?@\[\]+]*)#is', '_make_web_ftp_clickable_cb', $ret);
    $ret = preg_replace_callback('#([\s>])([.0-9a-z_+-]+)@(([0-9a-z-]+\.)+[0-9a-z]{2,})#i', '_make_email_clickable_cb', $ret);
    // this one is not in an array because we need it to run last, for cleanup of accidental links within links
    $ret = preg_replace("#(<a( [^>]+?>|>))<a [^>]+?>([^>]+?)</a></a>#i", "$1$3</a>", $ret);
    $ret = trim($ret);
    return $ret;
}
$string = 'I have some texts here and also links such as http://www.youtube.com , www.haha.com and lol@example.com. They are ready to be replaced.';
echo make_clickable($string);
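The callbacks above put the matched text into the HTML as-is, so characters such as `&` or `"` end up unescaped in the `href`. As a rough sketch of one way to tighten that (not the WordPress implementation), the version below handles only scheme-prefixed URLs, uses a closure instead of a named callback, and escapes the URL with `htmlspecialchars`. The name `linkify_urls` is made up for this example.

```php
<?php
// Hypothetical helper: turns scheme-prefixed URLs in $text into escaped links.
function linkify_urls(string $text): string
{
    // Same pattern as the first preg_replace_callback call in make_clickable().
    $pattern = '#([\s>])([\w]+?://[\w\\x80-\\xff\#$%&~/.\-;:=,?@\[\]+]*)#is';

    // A leading space lets the pattern match a URL at the very start of the text.
    $out = preg_replace_callback($pattern, function (array $m): string {
        $url = $m[2];
        $trail = '';
        // Move one trailing punctuation mark outside the link.
        if (in_array(substr($url, -1), ['.', ',', ';', ':'], true)) {
            $trail = substr($url, -1);
            $url = substr($url, 0, -1);
        }
        // Escape for use both as an attribute value and as link text.
        $safe = htmlspecialchars($url, ENT_QUOTES, 'UTF-8');
        return $m[1] . "<a href=\"$safe\" rel=\"nofollow\">$safe</a>" . $trail;
    }, ' ' . $text);

    // preg_replace_callback returns null on error; fall back to the input.
    return trim($out ?? $text);
}

echo linkify_urls('See http://example.com/?a=1&b=2.');
// See <a href="http://example.com/?a=1&amp;b=2" rel="nofollow">http://example.com/?a=1&amp;b=2</a>.
```

Escaping does not validate the scheme. If links to schemes other than http, https, or ftp should be refused, an allow-list check on the part before `://` would be needed as well.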