How to get a specific domain's URL from a page

Asked: 2013-10-02 01:08:50

Tags: php html preg-match

I am trying to extract one exact URL, for a specific domain, from an HTML page.

I want only this URL to be returned from v.html:

https://picasaweb.google.com/114948445121686813006/DropBox?authkey=Gv1sRgCMLjxpef1rHJ3QE#5929911272604125650

But my PHP function prints every URL in the file.

v.html contains HTML code and links.

Here is my PHP code:

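<?php

// Read the whole page into a string.
$string = file_get_contents("v.html");

// Return every http:// or https:// URL found in $string.
function getUrls($string)
{
    // A URL runs until the next double quote or space.
    $regex = '/https?:\/\/[^" ]+/i';
    preg_match_all($regex, $string, $matches);
    return $matches[0];
}

$urls = getUrls($string);

foreach ($urls as $url) {
    echo $url . '<br />';
}

?>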

Output:

http://www.w3.org/2007/app
http://schemas.google.com/photos/2007
http://www.w3.org/2005/Atom
http://purl.org/atom/app#
http://www.w3.org/2007/app
http://schemas.google.com/photos/2007
http://www.w3.org/2005/Atom
http://purl.org/atom/app#
http://www.w3.org/2007/app
http://www.w3.org/2005/Atom
http://purl.org/atom/app#
http://www.w3.org/2007/app
https://picasaweb.google.com/114948445121686813006/DropBox?authkey=Gv1sRgCMLjxpef1rHJ3QE#5929911272604125650

2 answers:

Answer 0: (score: 0)

Use the PHP function strpos() on the output. With it you can search for "https" and find your URL.

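// $output is assumed to hold the text to search, for example the contents of v.html
$findme = 'https://';
$pos = strpos($output, $findme);

// Note the use of ===. A plain == would not work as expected,
// because a match at position 0 also compares equal to false.
if ($pos === false) {
    // not found in the output
} else {
    // found in the output, starting at offset $pos
}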
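strpos() only reports where the match starts, so the URL itself still has to be cut out of the string. The following is a minimal sketch of one way to do that, not the answerer's own code: firstHttpsUrl() is a hypothetical helper, the sample $output string is made up for illustration, and it assumes (as the question's regex does) that a URL ends at the next double quote or whitespace character. It also only helps when the wanted link is the first https:// URL in the text, which happens to be true for the output shown in the question.

```php
<?php
// Hypothetical helper (not from the original post):
// return the first https:// URL found in $text, or null if there is none.
function firstHttpsUrl(string $text): ?string
{
    $pos = strpos($text, 'https://');
    if ($pos === false) {
        return null;
    }
    // Count the characters from $pos up to the first double quote or whitespace.
    $length = strcspn($text, "\" \t\r\n", $pos);
    return substr($text, $pos, $length);
}

// Made-up sample input standing in for the contents of v.html.
$output = '<a href="http://www.w3.org/2005/Atom">a</a> '
        . '<a href="https://picasaweb.google.com/x">b</a>';

echo firstHttpsUrl($output), "\n";
```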

Or try this:

foreach($urls as $url) { 
   if (strstr($url,'picasaweb.google.com')) { 
      echo $url; 
   } 
}
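A substring test like the one above also matches URLs that merely contain the domain somewhere in their path or query string. As a stricter variant, here is a rough sketch that compares only the host part using PHP's parse_url(). The function name filterUrlsByHost() and the shortened $urls list are illustrative assumptions, and the scalar type declarations require PHP 7 or later.

```php
<?php
// Hypothetical helper (not from the original post):
// keep only the URLs whose host equals $host, ignoring case.
function filterUrlsByHost(array $urls, string $host): array
{
    $matched = [];
    foreach ($urls as $url) {
        // parse_url() returns null or false when no host can be extracted.
        $urlHost = parse_url($url, PHP_URL_HOST);
        if (is_string($urlHost) && strcasecmp($urlHost, $host) === 0) {
            $matched[] = $url;
        }
    }
    return $matched;
}

// A shortened copy of the URL list from the question's output.
$urls = [
    'http://www.w3.org/2007/app',
    'http://schemas.google.com/photos/2007',
    'https://picasaweb.google.com/114948445121686813006/DropBox?authkey=Gv1sRgCMLjxpef1rHJ3QE#5929911272604125650',
];

foreach (filterUrlsByHost($urls, 'picasaweb.google.com') as $url) {
    echo $url, "\n";
}
```

Because schemas.google.com is a different host, it is filtered out even though it shares the google.com suffix.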

Answer 1: (score: -1)

With JavaScript, you can get it using

this.document.location.href