Replace links in text with PHP

Time: 2012-06-07 08:48:57

Tags: php regex

I use the following regex to turn links in text into clickable links:

preg_replace('/(http)+(s)?:(\/\/)((\w|\.)+)(\/)?(\S+)?/i', '<a href="\0" target="_blank" class="lgray">\0</a>',$message);

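I need a new one that recognizes links that start with just www as well as links that have http. Here is a list of the required URL types:

I tried to do this myself, but I am not good with regexes. Any help would be appreciated.

Thanks!

P.S.: Stack Overflow also does not recognize URLs that start with just www.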
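
To make the problem concrete, here is a minimal sketch of how the pattern from the question behaves on mixed input. The sample text is an assumption made up for illustration; only the `preg_replace` call comes from the question.

```php
<?php
// Sample input (hypothetical): one bare www link and one link with a scheme.
$message = 'see www.example.com and http://example.com/a';

// The pattern from the question, unchanged.
$result = preg_replace(
    '/(http)+(s)?:(\/\/)((\w|\.)+)(\/)?(\S+)?/i',
    '<a href="\0" target="_blank" class="lgray">\0</a>',
    $message
);

echo $result, PHP_EOL;
```

Only the `http://` URL is wrapped in an anchor. `www.example.com` stays plain text because the pattern requires a scheme, a colon, and two slashes.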

4 answers:

Answer 0: (score: 0)

In your regex, you have made the colon and the two slashes mandatory.

This line should remedy that:

preg_replace('/(http|https)?(:)?(\/\/)?((\w|\.)+)(\/)?(\S+)?/i', '<a href="\0" target="_blank" class="lgray">\0</a>',$domains);

For a better answer, have a look at Regular expression pattern to match url with or without http://www
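
Because the scheme, the colon, and the slashes are all optional in the pattern above, it is worth checking how it behaves on ordinary text. The following is a minimal sketch; the sample string is an assumption chosen for illustration.

```php
<?php
// Sample input (hypothetical): a plain word followed by a bare www link.
$domains = 'hello www.example.com';

// The pattern from this answer, unchanged.
$result = preg_replace(
    '/(http|https)?(:)?(\/\/)?((\w|\.)+)(\/)?(\S+)?/i',
    '<a href="\0" target="_blank" class="lgray">\0</a>',
    $domains
);

echo $result, PHP_EOL;
```

With this input, both `www.example.com` and the plain word `hello` end up wrapped in anchors, since nothing in the pattern forces a URL-like prefix. A stricter prefix (for example, requiring either a scheme or `www.`) is probably needed before applying it to free-form text.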

Answer 1: (score: 0)

Starting from Claus Witt's link and modifying it took only a little work. The preg_replace he gave did not work, though. Here is what I did:

$regex = "(((https?|ftp)\:\/\/)|(www))";//Scheme
$regex .= "([a-z0-9-.]*)\.([a-z]{2,4})";//Host or IP
$regex .= "(\:[0-9]{2,5})?";//Port
$regex .= "(\/([a-z0-9+\$_-]\.?)+)*\/?";//Path
$regex .= "(\?[a-z+&\$_.-][a-z0-9;:@&%=+\/\$_.-]*)?";//GET Query
$regex .= "(#[a-z_.-][a-z0-9+\$_.-]*)?";//Anchor
return str_replace
(
    array('href="','http://http://','http://https://','http:///'),
    array('href="http://','http://','https://','/'),
    preg_replace('/'.$regex.'/i','<a href="\0" target="_blank" class="lgray">\0</a>',$message)
);

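In my modification, I required either http or www, removed some unnecessary checks, and extended the domain extension from 3 characters to 4 characters (.info is a domain too).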
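
As a usage sketch, the snippet above can be wrapped in a function and called on a sample string. The function name `linkify` and the sample input are assumptions for illustration; the regex fragments and the `str_replace` arrays are taken from this answer as written.

```php
<?php
// Hypothetical wrapper name; the answer only showed the function body.
function linkify($message)
{
    $regex = "(((https?|ftp)\:\/\/)|(www))";                        // Scheme or www
    $regex .= "([a-z0-9-.]*)\.([a-z]{2,4})";                        // Host or IP
    $regex .= "(\:[0-9]{2,5})?";                                    // Port
    $regex .= "(\/([a-z0-9+\$_-]\.?)+)*\/?";                        // Path
    $regex .= "(\?[a-z+&\$_.-][a-z0-9;:@&%=+\/\$_.-]*)?";           // GET query
    $regex .= "(#[a-z_.-][a-z0-9+\$_.-]*)?";                        // Anchor

    // First pass: wrap each match in an anchor (href may still lack a scheme).
    $linked = preg_replace(
        '/' . $regex . '/i',
        '<a href="\0" target="_blank" class="lgray">\0</a>',
        $message
    );

    // Second pass: prefix every href with http://, then collapse doubled schemes.
    return str_replace(
        array('href="', 'http://http://', 'http://https://', 'http:///'),
        array('href="http://', 'http://', 'https://', '/'),
        $linked
    );
}

echo linkify('Visit www.example.com or http://example.org/path?x=1'), PHP_EOL;
```

The two-pass design means a bare `www.` link gets `http://` prepended in its `href` while the visible link text stays as typed. Note that the `str_replace` pass runs over the whole message, so any `href="` already present in the input would be rewritten as well.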

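Answer 2: (score: 0)

Disclaimer: These are very basic and do not check for valid TLDs or file extensions. Use at your own risk.

Assuming you do not need to account for directories or files, and only need to match basic URLs without subdomains, you can use the following regex: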

(?<=^|[\n\s])(?:https?:\/\/)?(?:www\.)?[a-zA-Z0-9-.]+\.com\/?(?=$|[\n\s])
#DESCRIPTION::
#  (?<=^|[\n\s])           Checks to see that what's preceding the URL is the beginning of the string, or a newline, or whitespace.
#  (?:https?:\/\/)?        Matches http(s) if it is there
#  (?:www\.)?              Matches www. if it is there
#  [a-zA-Z0-9-.]+          Matches "example" in "example.com" (as well as any other valid URL character; will also match subdomains)
#  \.com\/?                Matches .com(/)
#  (?=$|[\n\s])            Checks to see that what's following the URL is the end of the string, or a newline, or whitespace.

If you also need to match directories and files, the end of the regex needs to be modified and added to slightly:

(?<=^|[\n\s])(?:https?:\/\/)?(?:www\.)?[a-zA-Z0-9-.]+\.com(?:(?:\/[\w]+)+)?(?:\/|\.[\w]+)?(?=$|[\n\s])
#DESCRIPTION::
#  (?<=^|[\n\s])           Checks to see that what's preceding the URL is the beginning of the string, or a newline, or whitespace.
#  (?:https?:\/\/)?        Matches http(s) if it is there
#  (?:www\.)?              Matches www. if it is there
#  [a-zA-Z0-9-.]+          Matches "example" in "example.com" (as well as any other valid URL character; will also match subdomains)
#  \.com                   Matches .com
#  (?:                     Start of a group
#     (?:\/[\w]+)+         Attempts to find subdirectories by matching /, then word characters
#  )?                      Ends the previous group. This group can be skipped, if there are no subdirectories
#  (?:\/|\.[\w]+)?         Matches a file extension if it is there, or a / if it is there.
#  (?=$|[\n\s])            Checks to see that what's following the URL is the end of the string, or a newline, or whitespace.
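
To try the second pattern from PHP, a minimal sketch using `preg_match_all` is shown here. The `~` delimiters, the variable names, and the sample text are assumptions for illustration; the pattern itself is copied from this answer.

```php
<?php
// Second pattern from this answer, wrapped in ~ delimiters for the preg_* functions.
$pattern = '~(?<=^|[\n\s])(?:https?:\/\/)?(?:www\.)?[a-zA-Z0-9-.]+\.com(?:(?:\/[\w]+)+)?(?:\/|\.[\w]+)?(?=$|[\n\s])~';

// Sample input (hypothetical). Each URL must be bounded by whitespace or the string ends.
$text = 'go to www.example.com/docs/index.html or example.com';

// Collect every full match into $matches[0].
preg_match_all($pattern, $text, $matches);

print_r($matches[0]);
```

Because of the lookahead at the end, a URL followed directly by punctuation (such as a trailing comma or period) is not matched, and only `.com` hosts are recognized.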

Answer 3: (score: -1)

Try this:

$pattern = preg_replace("/((https:\/\/|http:\/\/|http:\/\/www\.|https:\/\/www\.|www\.)+([\w\/])+(\.com\/|\.com))/i","<a target=\"_blank\" href=\"$1\">$1</a>",$url);