在HTML内查找纯文本链接

Question

我有一个简单的评论系统，人们可以在纯文本字段中提交超链接。当我将这些记录从数据库显示回网页时，我可以使用PHP中的RegExp将这些链接转换为HTML类型的锚链接吗？

我不希望算法使用任何其他类型的链接执行此操作，只需http和https。

Answer 1

这是另一个解决方案，这将捕获所有http / https / www并转换为可点击链接。

$url = '~(?:(https?)://([^\s<]+)|(www\.[^\s<]+?\.[^\s<]+))(?<![\.,:])~i'; 
$string = preg_replace($url, '<a href="$0" target="_blank" title="$0">$0</a>', $string);
echo $string;

或者只是为了捕获http / https，然后使用下面的代码。

$url = '/(http|https|ftp|ftps)\:\/\/[a-zA-Z0-9\-\.]+\.[a-zA-Z]{2,3}(\/\S*)?/';   
$string= preg_replace($url, '<a href="$0" target="_blank" title="$0">$0</a>', $string);
echo $string;

编辑：下面的脚本将捕获所有url类型并将其转换为可单击的链接。

$url = '@(http)?(s)?(://)?(([a-zA-Z])([-\w]+\.)+([^\s\.]+[^\s]*)+[^,.\s])@';
$string = preg_replace($url, '<a href="http$2://$4" target="_blank" title="$0">$0</a>', $string);
echo $string;

新的更新，如果你有字符串条带（s）然后使用下面的代码块，感谢@AndrewEllis指出这一点。

$url = '@(http(s)?)?(://)?(([a-zA-Z])([-\w]+\.)+([^\s\.]+[^\s]*)+[^,.\s])@';
$string = preg_replace($url, '<a href="http$2://$4" target="_blank" title="$0">$0</a>', $string);
echo $string;

这是一个非常简单的解决方案，无法正确显示网址。

$email = '<a href="mailto:email@email.com">email@email.com</a>';
$string = $email;
echo $string;

这是一个非常简单的修复方法，但您必须根据自己的目的对其进行修改。

Answer 2

嗯，Volomike的答案更接近。并且为了进一步推动它，这就是我为它做的事情，忽略超链接末尾的尾随期间。我也考虑过URI片段。

public static function makeClickableLinks($s) {
  return preg_replace('@(https?://([-\w\.]+[-\w])+(:\d+)?(/([\w/_\.#-]*(\?\S+)?[^\.\s])?)?)@', '<a href="$1" target="_blank">$1</a>', $s);
}

Answer 3

<?
function makeClickableLinks($text)
{

        $text = html_entity_decode($text);
        $text = " ".$text;
        $text = eregi_replace('(((f|ht){1}tp://)[-a-zA-Z0-9@:%_\+.~#?&//=]+)',
                '<a href="\\1" target=_blank>\\1</a>', $text);
        $text = eregi_replace('(((f|ht){1}tps://)[-a-zA-Z0-9@:%_\+.~#?&//=]+)',
                '<a href="\\1" target=_blank>\\1</a>', $text);
        $text = eregi_replace('([[:space:]()[{}])(www.[-a-zA-Z0-9@:%_\+.~#?&//=]+)',
        '\\1<a href="http://\\2" target=_blank>\\2</a>', $text);
        $text = eregi_replace('([_\.0-9a-z-]+@([0-9a-z][0-9a-z-]+\.)+[a-z]{2,3})',
        '<a href="mailto:\\1" target=_blank>\\1</a>', $text);
        return $text;
}

// Example Usage
echo makeClickableLinks("This is a test clickable link: http://www.websewak.com  You can also try using an email address like test@websewak.com");
?>

Answer 4

参考http://zenverse.net/php-function-to-auto-convert-url-into-hyperlink/。这就是wordpress如何解决它

function _make_url_clickable_cb($matches) {
    $ret = '';
    $url = $matches[2];

    if ( empty($url) )
        return $matches[0];
    // removed trailing [.,;:] from URL
    if ( in_array(substr($url, -1), array('.', ',', ';', ':')) === true ) {
        $ret = substr($url, -1);
        $url = substr($url, 0, strlen($url)-1);
    }
    return $matches[1] . "<a href=\"$url\" rel=\"nofollow\">$url</a>" . $ret;
}

function _make_web_ftp_clickable_cb($matches) {
    $ret = '';
    $dest = $matches[2];
    $dest = 'http://' . $dest;

    if ( empty($dest) )
        return $matches[0];
    // removed trailing [,;:] from URL
    if ( in_array(substr($dest, -1), array('.', ',', ';', ':')) === true ) {
        $ret = substr($dest, -1);
        $dest = substr($dest, 0, strlen($dest)-1);
    }
    return $matches[1] . "<a href=\"$dest\" rel=\"nofollow\">$dest</a>" . $ret;
}

function _make_email_clickable_cb($matches) {
    $email = $matches[2] . '@' . $matches[3];
    return $matches[1] . "<a href=\"mailto:$email\">$email</a>";
}

function make_clickable($ret) {
    $ret = ' ' . $ret;
    // in testing, using arrays here was found to be faster
    $ret = preg_replace_callback('#([\s>])([\w]+?://[\w\\x80-\\xff\#$%&~/.\-;:=,?@\[\]+]*)#is', '_make_url_clickable_cb', $ret);
    $ret = preg_replace_callback('#([\s>])((www|ftp)\.[\w\\x80-\\xff\#$%&~/.\-;:=,?@\[\]+]*)#is', '_make_web_ftp_clickable_cb', $ret);
    $ret = preg_replace_callback('#([\s>])([.0-9a-z_+-]+)@(([0-9a-z-]+\.)+[0-9a-z]{2,})#i', '_make_email_clickable_cb', $ret);

    // this one is not in an array because we need it to run last, for cleanup of accidental links within links
    $ret = preg_replace("#(<a( [^>]+?>|>))<a [^>]+?>([^>]+?)</a></a>#i", "$1$3</a>", $ret);
    $ret = trim($ret);
    return $ret;
}

Answer 5

对于我来说，评分最高的答案并没有完成这项工作，因为链接未被正确替换：

http://www.fifa.com/worldcup/matches/round255951/match=300186487/index.html#nosticky

经过一些谷歌搜索和一些测试，这就是我提出的：

public static function replaceLinks($s) {
    return preg_replace('@(https?://([-\w\.]+)+(:\d+)?(/([\w/_\.%-=#]*(\?\S+)?)?)?)@', '<a href="$1">$1</a>', $s);
}

我不是正则表达式的专家，实际上它让我很困惑：）

请随意评论并改进此解决方案。

Answer 6

public static function makeClickableLinks($s) {
    return preg_replace('@(https?://([-\w\.]+)+(:\d+)?(/([\w/_\.-]*(\?\S+)?)?)?)@', '<a href="$1">$1</a>', $s);
}

Answer 7

MkVal的答案有效但在我们已经有锚链接的情况下，它将以奇怪的格式呈现文本。

以下是适用于我的解决方案：

$s = preg_replace ( 
    "/(?<!a href=\")(?<!src=\")((http|ftp)+(s)?:\/\/[^<>\s]+)/i",
    "<a href=\"\\0\" target=\"blank\">\\0</a>",
    $s
);

Answer 8

这是我的代码，用于格式化文本中的所有链接，包括带有和不带协议的电子邮件，网址。

n.floatValue
Float(truncating: n)

请评论不起作用的网址。我会尝试更新正则表达式。

Answer 9

我建议不要像这样在飞行中做很多事情。我更喜欢使用像stackoverflow中使用的简单编辑器界面。它被称为Markdown。

Answer 10

我使用的是一个源自question2answer的函数，它在html中接受纯文本甚至纯文本链接：

// $html holds the string
$htmlunlinkeds = array_reverse(preg_split('|<[Aa]\s+[^>]+>.*</[Aa]\s*>|', $html, -1, PREG_SPLIT_OFFSET_CAPTURE)); // start from end so we substitute correctly
foreach ($htmlunlinkeds as $htmlunlinked)
{ // and that we don't detect links inside HTML, e.g. <img src="http://...">
    $thishtmluntaggeds = array_reverse(preg_split('/<[^>]*>/', $htmlunlinked[0], -1, PREG_SPLIT_OFFSET_CAPTURE)); // again, start from end
    foreach ($thishtmluntaggeds as $thishtmluntagged)
    {
        $innerhtml = $thishtmluntagged[0];
        if(is_numeric(strpos($innerhtml, '://'))) 
        { // quick test first
            $newhtml = qa_html_convert_urls($innerhtml, qa_opt('links_in_new_window'));
            $html = substr_replace($html, $newhtml, $htmlunlinked[1]+$thishtmluntagged[1], strlen($innerhtml));
        }
    }
}   
echo $html;

function qa_html_convert_urls($html, $newwindow = false)
/*
    Return $html with any URLs converted into links (with nofollow and in a new window if $newwindow).
    Closing parentheses/brackets are removed from the link if they don't have a matching opening one. This avoids creating
    incorrect URLs from (http://www.question2answer.org) but allow URLs such as http://www.wikipedia.org/Computers_(Software)
*/
{
    $uc = 'a-z\x{00a1}-\x{ffff}';
    $url_regex = '#\b((?:https?|ftp)://(?:[0-9'.$uc.'][0-9'.$uc.'-]*\.)+['.$uc.']{2,}(?::\d{2,5})?(?:/(?:[^\s<>]*[^\s<>\.])?)?)#iu';

    // get matches and their positions
    if (preg_match_all($url_regex, $html, $matches, PREG_OFFSET_CAPTURE)) {
        $brackets = array(
            ')' => '(',
            '}' => '{',
            ']' => '[',
        );

        // loop backwards so we substitute correctly
        for ($i = count($matches[1])-1; $i >= 0; $i--) {
            $match = $matches[1][$i];
            $text_url = $match[0];
            $removed = '';
            $lastch = substr($text_url, -1);

            // exclude bracket from link if no matching bracket
            while (array_key_exists($lastch, $brackets)) {
                $open_char = $brackets[$lastch];
                $num_open = substr_count($text_url, $open_char);
                $num_close = substr_count($text_url, $lastch);

                if ($num_close == $num_open + 1) {
                    $text_url = substr($text_url, 0, -1);
                    $removed = $lastch . $removed;
                    $lastch = substr($text_url, -1);
                }
                else
                    break;
            }

            $target = $newwindow ? ' target="_blank"' : '';
            $replace = '<a href="' . $text_url . '" rel="nofollow"' . $target . '>' . $text_url . '</a>' . $removed;
            $html = substr_replace($html, $replace, $match[1], strlen($match[0]));
        }
    }

    return $html;
}

由于接受包含括号和其他字符的链接，因此可能会有所帮助。

Answer 11

试试这个：

<a href="<?php echo Mage::getUrl('about-us'); ?>">About Us</a>

它会跳过现有的链接（如果我们已经有一个href，它将不会在href中添加一个href）。否则它会添加一个空白目标的href。

Answer 12

cl /std:c++14 so.cpp
Microsoft (R) C/C++ Optimizing Compiler Version 19.13.26131.1 for x64
Copyright (C) Microsoft Corporation.  All rights reserved.

so.cpp
C:\Program Files (x86)\Microsoft Visual Studio\2017\Enterprise\VC\Tools\MSVC\14.13.26128\include\xlocale(313): warning C4530: C++ exception handler used, but unwind semantics are not enabled. Specify /EHsc
so.cpp(7): error C2146: syntax error: missing ')' before identifier 'or'
so.cpp(7): error C2065: 'or': undeclared identifier
so.cpp(7): error C2146: syntax error: missing ';' before identifier 'i'
so.cpp(7): error C2059: syntax error: ')'
so.cpp(8): error C2143: syntax error: missing ';' before 'std::cout'
so.cpp(7): warning C4553: '==': operator has no effect; did you intend '='?

结果：

$string = 'example.com
www.example.com
http://example.com
https://example.com
http://www.example.com
https://www.example.com';

preg_match_all('#(\w*://|www\.)[a-z0-9]+(-+[a-z0-9]+)*(\.[a-z0-9]+(-+[a-z0-9]+)*)+(/([^\s()<>;]+\w)?/?)?#i', $string, $matches, PREG_OFFSET_CAPTURE | PREG_SET_ORDER);
foreach (array_reverse($matches) as $match) {
  $a = '<a href="'.(strpos($match[1][0], '/') ? '' : 'http://') . $match[0][0].'">' . $match[0][0] . '</a>';
  $string = substr_replace($string, $a, $match[0][1], strlen($match[0][0]));
}

echo $string;

此解决方案中我喜欢它还会将example.com <a href="http://www.example.com">www.example.com</a> <a href="http://example.com">http://example.com</a> <a href="https://example.com">https://example.com</a> <a href="http://www.example.com">http://www.example.com</a> <a href="https://www.example.com">https://www.example.com</a>转换为www.example.com，因为http://www.example.com无效（没有<a href="www.example.com"></a>协议指向http/https 1}}）。

Answer 13

在HTML内查找纯文本链接

我真的很喜欢this answer-但是我需要一种解决方案，以解决非常简单的HTML文本中可能存在的纯文本链接：

<p>I found a really cool site you might like:</p>
<p>www.stackoverflow.com</p>

这意味着我需要使用正则表达式模式来忽略html字符<和>

正则表达式调整

因此，我将部分模式更改为[^\s\>\<]而不是\S

\S-不是空格；匹配任何非空格字符（制表符，空格，换行符）
[^]-否定集；匹配集合中没有的任何字符

我从此答案得到的函数版本

除了HTML之外，我还需要另一种格式，因此我将正则表达式与替代品分开以适应这种情况。

我还添加了一种只将找到的链接/电子邮件返回到数组中的方法，这样我就可以将它们另存为关系（在以后为它们制作元卡的伟大之处……并且进行分析！）。

更新：连续时间段匹配

我正在与there...it之类的文本匹配-所以我想确保没有任何包含连续点的匹配。

注意：要解决此问题，我添加了一个额外的格式字符串以撤消与它们的匹配，以避免不得不重做这些本来可靠的url正则表达式。

/***
 * based on this answer: https://stackoverflow.com/a/49689245/2100636
 *
 * @var $text String
 * @var $format String - html (<a href=""...), short ([link:https://somewhere]), other (https://somewhere)
 */
public function formatLinksInString(
    $string,
    $format = 'html', 
    $returnMatches = false
) {
    $formatProtocol = $format == 'html'
        ? '<a href="$0" target="_blank" title="$0">$0</a>'
        : ($format == 'short' || $returnMatches ? '[link:$0]' : '$0');

    $formatSansProtocol = $format == 'html'
        ? '<a href="//$0" target="_blank" title="$0">$0</a>'
        : ($format == 'short' || $returnMatches ? '[link://$0]' : '$0');

    $formatMailto = $format == 'html'
        ? '<a href="mailto:$1" target="_blank" title="$1">$1</a>'
        : ($format == 'short' || $returnMatches ? '[mailto:$1]' : '$1');

    $regProtocol = '/(http|https|ftp|ftps)\:\/\/[a-zA-Z0-9\-\.]+\.[a-zA-Z]{2,}(\/[^\<\>\s]*)?/';
    $regSansProtocol = '/(?<=\s|\A|\>)([0-9a-zA-Z\-\.]+\.[a-zA-Z0-9\/]{2,})(?=\s|$|\,|\<)/';
    $regEmail = '/([^\s\>\<]+\@[^\s\>\<]+\.[^\s\>\<]+)\b/';
    $consecutiveDotsRegex = $format == 'html'
        ? '/<a[^\>]+[\.]{2,}[^\>]*?>([^\<]*?)<\/a>/'
        : '/\[link:.*?\/\/([^\]]+[\.]{2,}[^\]]*?)\]/';

    // Protocol links
    $formatString = preg_replace($regProtocol, $formatProtocol, $string);
    // Sans Protocol Links
    $formatString = preg_replace($regSansProtocol, $formatSansProtocol, $formatString); // use formatString from above
    // Email - Mailto - Links
    $formatString = preg_replace($regEmail, $formatMailto, $formatString); // use formatString from above
    // Prevent consecutive periods from getting captured
    $formatString = preg_replace($consecutiveDotsRegex, '$1', $formatString);

    if ($returnMatches) {
        // Find all [x:link] patterns
        preg_match_all('/\[.*?:(.*?)\]/', $formatString, $matches);

        current($matches); // to move pointer onto groups
        return next($matches); // return the groups
    }

    return $formatString;
}

Answer 14

如果我是对的，你想要做的是将普通文本转换为http链接。以下是我认为可以提供的帮助：

<?php

   $list = mysqli_query($con,"SELECT * FROM list WHERE name = 'table content'"); 
   while($row2 = mysqli_fetch_array($list)) {
echo "<a target='_blank' href='http://www." . $row2['content']. "'>" . $row2['content']. "</a>";

   }  
?>

在PHP中将纯文本URL转换为HTML超链接

14 个答案:

在HTML内查找纯文本链接

正则表达式调整

我从此答案得到的函数版本

更新：连续时间段匹配