Question

I use the following regex to turn links in a text into clickable links:

preg_replace('/(http)+(s)?:(\/\/)((\w|\.)+)(\/)?(\S+)?/i', '<a href="\0" target="_blank" class="lgray">\0</a>',$message);

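I need a new one that recognizes links starting with just www as well as links that have http. Here is a list of the URL types I need:

I tried to do this myself, but I am not good with regexes. Any help would be appreciated.

Thanks!

P.S.: Stack Overflow does not recognize URLs that start with just www either.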

Answer 1

Your regex requires the colon and the two slashes, so a URL without them can never match.

This line should fix that:

preg_replace('/(http|https)?(:)?(\/\/)?((\w|\.)+)(\/)?(\S+)?/i', '<a href="\0" target="_blank" class="lgray">\0</a>',$domains);

For a better answer, have a look at "Regular expression pattern to match url with or without http://www".
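One thing to watch for with the line above: because the scheme, the colon and the slashes are all optional, the only required part is `((\w|\.)+)`, so any plain word matches too. The sketch below shows that side effect and compares it with a stricter pattern that insists on either a scheme or a leading `www.`. The variable names `$loose` and `$strict` and the shortened replacement string are my own, for illustration only.

```php
<?php
// Pattern from the answer above: every prefix part is optional.
$loose = '/(http|https)?(:)?(\/\/)?((\w|\.)+)(\/)?(\S+)?/i';

// Stricter sketch (an assumption, not from the answer): require "http://", "https://" or "www.".
$strict = '/(?:https?:\/\/|www\.)\S+/i';

$message = 'hello www.example.com';

// The loose pattern also wraps the plain word "hello".
echo preg_replace($loose, '<a href="\0">\0</a>', $message), "\n";

// The strict pattern leaves "hello" alone and links only the URL.
echo preg_replace($strict, '<a href="\0">\0</a>', $message), "\n";
```

The strict variant still puts a scheme-less `www.example.com` into `href`, so the browser will treat it as a relative link unless a scheme is prepended afterwards.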

Answer 2

Using Claus Witt's link and modifying it took only a little work. The preg_replace he gave did not work, though. Here is what I came up with:

// Wrapper added so the snippet runs on its own; the function name is an assumption.
function linkify($message)
{
    $regex = "(((https?|ftp)\:\/\/)|(www))"; // Scheme
    $regex .= "([a-z0-9-.]*)\.([a-z]{2,4})"; // Host or IP
    $regex .= "(\:[0-9]{2,5})?"; // Port
    $regex .= "(\/([a-z0-9+\$_-]\.?)+)*\/?"; // Path
    $regex .= "(\?[a-z+&\$_.-][a-z0-9;:@&%=+\/\$_.-]*)?"; // GET Query
    $regex .= "(#[a-z_.-][a-z0-9+\$_.-]*)?"; // Anchor

    // Prefix every href with http://, then undo the doubled schemes that creates.
    return str_replace(
        array('href="', 'http://http://', 'http://https://', 'http:///'),
        array('href="http://', 'http://', 'https://', '/'),
        preg_replace('/' . $regex . '/i', '<a href="\0" target="_blank" class="lgray">\0</a>', $message)
    );
}

In my modification I require either http or www, removed some unnecessary checks, and extended the domain extension from 3 characters to 4 (.info is a domain too).
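If the str_replace clean-up pass feels fragile, one alternative is to decide per match whether a scheme has to be added, using preg_replace_callback. This is only a rough sketch under my own assumptions: the function name `linkify_with_callback` is made up, and the pattern is deliberately loose (it stops at whitespace, quotes and angle brackets, and will swallow trailing punctuation such as a full stop).

```php
<?php
// Hypothetical helper; the name and the pattern are assumptions, not part of the answer above.
function linkify_with_callback(string $message): string
{
    // Require a scheme or a leading "www."; stop at whitespace, quotes and angle brackets.
    $pattern = '~\b(?:(?:https?|ftp)://|www\.)[^\s<>"\']+~i';

    return preg_replace_callback($pattern, function (array $m): string {
        $url = $m[0];
        // Only scheme-less "www." matches get "http://" prepended in the href.
        $href = (stripos($url, 'www.') === 0) ? 'http://' . $url : $url;

        return '<a href="' . htmlspecialchars($href, ENT_QUOTES) . '" target="_blank" class="lgray">'
            . htmlspecialchars($url, ENT_QUOTES) . '</a>';
    }, $message);
}

echo linkify_with_callback('See www.example.com and https://example.org/a?b=1'), "\n";
```

The link text stays exactly as the user typed it, while the href always carries a scheme, so no second pass over the output is needed.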

Answer 3

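Disclaimer: These are very basic and make no attempt to check for valid TLDs or file extensions. Use at your own risk.

Assuming you don't need to account for directories or files and only want to match basic URLs, you can use the following regex (note that its character class also lets subdomains through):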

(?<=^|[\n\s])(?:https?:\/\/)?(?:www\.)?[a-zA-Z0-9-.]+\.com\/?(?=$|[\n\s])

#DESCRIPTION::
#  (?<=^|[\n\s])           Checks to see that what's preceding the URL is the beginning of the string, or a newline, or whitespace.
#  (?:https?:\/\/)?        Matches http(s) if it is there
#  (?:www\.)?              Matches www. if it is there
#  [a-zA-Z0-9-.]+          Matches "example" in "example.com" (as well as any other valid URL character; will also match subdomains)
#  \.com\/?                Matches .com(/)
#  (?=$|[\n\s])            Checks to see that what's following the URL is the end of the string, or a newline, or whitespace.

If you also need to match directories and files, the end of the regex needs to be modified and extended slightly:

(?<=^|[\n\s])(?:https?:\/\/)?(?:www\.)?[a-zA-Z0-9-.]+\.com(?:(?:\/[\w]+)+)?(?:\/|\.[\w]+)?(?=$|[\n\s])

#DESCRIPTION::
#  (?<=^|[\n\s])           Checks to see that what's preceding the URL is the beginning of the string, or a newline, or whitespace.
#  (?:https?:\/\/)?        Matches http(s) if it is there
#  (?:www\.)?              Matches www. if it is there
#  [a-zA-Z0-9-.]+          Matches "example" in "example.com" (as well as any other valid URL character; will also match subdomains)
#  \.com                   Matches .com
#  (?:                     Start of a group
#     (?:\/[\w]+)+         Attempts to find subdirectories by matching /, then word characters
#  )?                      Ends the previous group. This group can be skipped, if there are no subdirectories
#  (?:\/|\.[\w]+)?         Matches a file extension if it is there, or a / if it is there.
#  (?=$|[\n\s])            Checks to see that what's following the URL is the end of the string, or a newline, or whitespace.
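To try the second pattern from PHP, something like the following should do. The pattern is copied unchanged from above; the function name `find_com_urls` and the sample text are my own assumptions.

```php
<?php
// Pattern copied from the answer; single quotes keep the backslashes intact for PCRE.
$pattern = '/(?<=^|[\n\s])(?:https?:\/\/)?(?:www\.)?[a-zA-Z0-9-.]+\.com(?:(?:\/[\w]+)+)?(?:\/|\.[\w]+)?(?=$|[\n\s])/';

// Hypothetical helper: returns every full match of $pattern found in $text.
function find_com_urls(string $pattern, string $text): array
{
    preg_match_all($pattern, $text, $matches);
    return $matches[0];
}

$text = "Visit example.com or http://www.example.com/docs/page.html today";
print_r(find_com_urls($pattern, $text));
```

Because the lookahead insists on whitespace or the end of the string, a URL followed directly by punctuation (for example `example.com,`) is not matched at all.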

Answer 4

Try this:

$pattern = preg_replace("/((https:\/\/|http:\/\/|http:\/\/www\.|https:\/\/www\.|www\.)+([\w\/])+(\.com\/|\.com))/i","<a target=\"_blank\" href=\"$1\">$1</a>",$url);

Replacing links in text with PHP
