Extract exact URLs from a string in PHP

Time: 2018-09-25 09:39:04

Tags: php

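I need to extract all URLs from a string using PHP. I followed the reference URL below, but I am not getting the exact result I want. The reference URL and my string are below: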

$string = "hi new image one http://xxx/images/c4ca4238a0b923820dcc509a6f75849b208754572.jpgand two arehttp://yyy/images/c1f1a611c1147c4054c399c01f8bad76686484492.jpgend";
$regex = '#\bhttps?://[^,\s()<>]+(?:\([\w\d]+\)|([^,[:punct:]\s]|/))#';
preg_match_all($regex, $string, $matches);
echo "<pre>";
print_r($matches[0]); 

The result I get is

Array
(
    [0] => http://xxx/images/c4ca4238a0b923820dcc509a6f75849b208754572.jpgand
)

It shows only one result, but the string contains two URLs. Is it possible to get the following result?

Array
    (
        [0] => http://xxx/images/c4ca4238a0b923820dcc509a6f75849b208754572.jpg
        [1] => http://yyy/images/c1f1a611c1147c4054c399c01f8bad76686484492.jpg
    )

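How can I strip the extra text attached to the start and end of each URL and extract the exact URLs from the string? Any help is appreciated.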

3 answers:

Answer 0 (score: 1)

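The problem is that you are requiring a word boundary (\b) before http. In your string the second URL is glued to the preceding word ("arehttp://"), so there is no boundary between "e" and "h" and that URL never matches.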

$regex = '#\bhttps?://[^,\s()<>]+(?:\([\w\d]+\)|([^,[:punct:]\s]|/))#';
//         ^^ note this

Omitting the boundary gives you the full list of URLs in your string

$regex = '#https?://[^,\s()<>]+(?:\([\w\d]+\)|([^,[:punct:]\s]|/))#';

This will output:

Array (
    [0] => http://xxx/images/c4ca4238a0b923820dcc509a6f75849b208754572.jpgand
    [1] => http://yyy/images/c1f1a611c1147c4054c399c01f8bad76686484492.jpgend
)
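
To see the boundary effect in isolation, here is a minimal sketch. The two short test strings are made up for illustration and are not taken from the question; the point is only that \b needs a word character on one side and a non-word character on the other.

```php
<?php
// Illustrative strings (assumptions, not from the question).
$glued  = 'arehttp://yyy/a.jpg';   // "e" and "h" are both word characters: no boundary between them
$spaced = 'are http://yyy/a.jpg';  // a space followed by "h": a boundary exists

$withBoundary    = '#\bhttps?://#';
$withoutBoundary = '#https?://#';

// preg_match returns 1 when the pattern matches and 0 when it does not.
$gluedWithBoundary    = preg_match($withBoundary, $glued);
$spacedWithBoundary   = preg_match($withBoundary, $spaced);
$gluedWithoutBoundary = preg_match($withoutBoundary, $glued);

var_dump($gluedWithBoundary, $spacedWithBoundary, $gluedWithoutBoundary);
```

Only the glued string combined with \b fails to match, which is the situation of the second URL in the question.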

To cut off the trailing text, you should also match some fixed suffix at the end of the URL.

I will assume you want to match jpg, jpeg and png images, so your pattern could look like this:

$regex = '#https?://[^,\s()<>]+(?:\([\w\d]+\)|([^,[:punct:]\s]|/)\.(jpg|jpeg|png))#';

Live example: https://3v4l.org/WACo1
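
For a self-contained version of the same idea, the sketch below runs against the string from the question. Note that it uses a simpler pattern of my own (a lazy \S+? followed by the extension) rather than the pattern above, and it assumes every URL of interest ends in .jpg, .jpeg or .png and contains no earlier occurrence of such an extension.

```php
<?php
$string = "hi new image one http://xxx/images/c4ca4238a0b923820dcc509a6f75849b208754572.jpgand two arehttp://yyy/images/c1f1a611c1147c4054c399c01f8bad76686484492.jpgend";

// Assumption: each URL ends in .jpg, .jpeg or .png.
// \S+? is lazy, so each match stops at the first such extension it reaches.
$regex = '#https?://\S+?\.(?:jpe?g|png)#i';

preg_match_all($regex, $string, $matches);
$urls = $matches[0]; // index 0 holds the full matches
print_r($urls);
```

Because there is no \b in front and the match ends at the extension, both the glued "are" prefix and the "and"/"end" suffixes are left out.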

Answer 1 (score: 0)

You can use a for loop that runs over the size of the $matches array and prints each result.

<?php
$string = "hi new image one http://xxx/images/c4ca4238a0b923820dcc509a6f75849b208754572.jpgand two are http://yyy/images/c1f1a611c1147c4054c399c01f8bad76686484492.jpgend";
$regex = '#\bhttps?://[^,\s()<>]+(?:\([\w\d]+\)|([^,[:punct:]\s]|/))#';

preg_match_all($regex, $string, $matches);
echo "<pre>";

for($i=0;$i<sizeof($matches);$i++){
    print_r($matches[$i]); 
}

Try it and let me know whether it meets your needs.
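
One thing to keep in mind when looping: with the default flags, preg_match_all groups its results by pattern part, so $matches[0] holds the full matches and $matches[1] holds capture group 1. A sketch that walks only the full matches, using the same string and pattern as above, might look like this. The pattern is unchanged, so the trailing "and"/"end" text is still attached to each URL.

```php
<?php
$string = "hi new image one http://xxx/images/c4ca4238a0b923820dcc509a6f75849b208754572.jpgand two are http://yyy/images/c1f1a611c1147c4054c399c01f8bad76686484492.jpgend";
$regex = '#\bhttps?://[^,\s()<>]+(?:\([\w\d]+\)|([^,[:punct:]\s]|/))#';

preg_match_all($regex, $string, $matches);

// $matches[0] is the list of full matches; iterate over it, not over $matches itself.
foreach ($matches[0] as $index => $url) {
    echo $index, ' => ', $url, PHP_EOL;
}
```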

Answer 2 (score: 0)

Here is an answer to your question

    $string = "hi new image one http://xxx/images/c4ca4238a0b923820dcc509a6f75849b208754572.jpg  and two are http://yyy/images/c1f1a611c1147c4054c399c01f8bad76686484492.jpg end";
    $strArray = explode(' ', $string);
    $newString = "";
    $url = array();
    foreach($strArray as $word)
    {
      if (substr($word, 0, 7) == "http://" || substr($word, 0, 8) == "https://")
      {
        $url[] = $word;
      } else {
        if ($newString != '')
          $newString .= ' ';
        $newString .= $word;
      }
    }

     print_r($url);
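
A more compact sketch of the same word-by-word approach is shown below. It assumes, as the string above does, that each URL is separated from the surrounding text by whitespace, so it would not separate the glued text in the original question. The variable names $words and $urls are my own.

```php
<?php
$string = "hi new image one http://xxx/images/c4ca4238a0b923820dcc509a6f75849b208754572.jpg  and two are http://yyy/images/c1f1a611c1147c4054c399c01f8bad76686484492.jpg end";

// Split on runs of whitespace so that double spaces do not produce empty words.
$words = preg_split('/\s+/', $string, -1, PREG_SPLIT_NO_EMPTY);

// Keep only the words that start with http:// or https://, then reindex from 0.
$urls = array_values(array_filter($words, function ($word) {
    return preg_match('#^https?://#', $word) === 1;
}));

print_r($urls);
```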