Question

Is it possible to get a page's description from its URL without slowing down the page load, using JavaScript, PHP, or any other language?

For example, I would send this input:

http://www.facebook.com

and get this output:

Facebook is a social utility that connects people with friends and others who work, study and live around them. People use Facebook to keep up with friends, ...

How can I do this?

Answer 1

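You need the file_get_contents($url) function. For more help, see the PHP manual: http://php.net/manual/en/function.file-get-contents.php. If the URL contains spaces, you may need to encode those parts with urlencode. As for the parsing part, I found some code online.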

Code:

<?php
// Extract the text of the <title> element from an HTML string.
// Returns false when no title is found.
function getMetaTitle($content)
{
    $pattern = "|<\s*title\s*>([^<]+)<\s*/\s*title\s*>|Ui";
    if (preg_match($pattern, $content, $match)) {
        return $match[1];
    }
    return false;
}

$url = "your url here";

// file_get_contents returns false if the page cannot be fetched.
$str = file_get_contents($url);

if ($str !== false) {
    $title1 = getMetaTitle($str);
    if ($title1 !== false) {
        echo $title1;
    }
}
?>
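
Since the question is about not slowing the page down, it may help to bound how long the fetch itself can take. The following is only a sketch under assumptions: `fetchPage` and its 5-second default are hypothetical names and values, not part of the answer above. It relies on the `timeout` option of PHP's HTTP stream context and escapes only literal spaces, because running urlencode over a whole URL would also escape the ":" and "/" characters.

```php
<?php
// Hypothetical helper (not from the original answer): fetch a page body,
// giving up once the read has taken longer than $timeoutSeconds.
function fetchPage(string $url, float $timeoutSeconds = 5.0)
{
    // Escape literal spaces only; urlencode() on the full URL would
    // also escape ":" and "/" and break it.
    $url = str_replace(' ', '%20', $url);

    // The "timeout" HTTP context option is the read timeout in seconds.
    $context = stream_context_create([
        'http' => ['timeout' => $timeoutSeconds],
    ]);

    // @ hides the warning; failure is signalled by a false return value.
    return @file_get_contents($url, false, $context);
}
```

A call such as fetchPage('http://www.facebook.com') returns the HTML as a string, or false if the request fails or times out, so the caller can skip the description instead of waiting.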

Answer 2

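I wanted similar functionality to build a Facebook-like feature that fetches the title, description, and image. I used DOMDocument, so you could try DOMDocument to parse the page as well. It is very useful for parsing an HTML page by tag or attribute.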

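By combining this with Ajax (and keeping your PHP script on your own domain), you can pass the URL to a PHP script (something like the one below), which in turn fetches the required details from the site.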

Sample code:

$url = ''; // this will be your URL
$arrDetails = array();
$doc = new DOMDocument();
// @ suppresses the warnings that malformed HTML would otherwise raise
@$doc->loadHTMLFile($url);

foreach ($doc->getElementsByTagName('title') as $title) {
    $arrDetails['title'] = $title->nodeValue;
}
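
The sample above only reads the title. A possible extension to the description and image, sketched under assumptions: `extractPageDetails` is a hypothetical helper, and it assumes the page exposes a `<meta name="description">` tag or Open Graph `og:description` / `og:image` tags, which many pages do but none are required to. It takes the HTML as a string so that fetching and parsing stay separate.

```php
<?php
// Hypothetical helper: pull title, description and preview image
// out of an HTML string using DOMDocument.
function extractPageDetails(string $html): array
{
    $details = ['title' => null, 'description' => null, 'image' => null];
    if (trim($html) === '') {
        return $details; // loadHTML() rejects empty input
    }

    // Collect libxml warnings internally instead of using @.
    $previous = libxml_use_internal_errors(true);
    $doc = new DOMDocument();
    $doc->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $titles = $doc->getElementsByTagName('title');
    if ($titles->length > 0) {
        $details['title'] = trim($titles->item(0)->nodeValue);
    }

    foreach ($doc->getElementsByTagName('meta') as $meta) {
        $name = strtolower($meta->getAttribute('name'));
        $property = strtolower($meta->getAttribute('property'));
        $content = trim($meta->getAttribute('content'));
        if ($content === '') {
            continue;
        }
        // Keep the first description found, standard or Open Graph.
        if ($details['description'] === null
            && ($name === 'description' || $property === 'og:description')) {
            $details['description'] = $content;
        }
        if ($details['image'] === null && $property === 'og:image') {
            $details['image'] = $content;
        }
    }
    return $details;
}
```

The script called through Ajax could then return json_encode(extractPageDetails($html)) to the browser; fields that the page does not provide stay null.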

Answer 3

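Use file_get_contents($url), then parse the meta tag (or whatever holds the description). Then save the URL-to-description pairs in a local cache so that you do not request the page over and over.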
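
The caching idea could look roughly like this. It is a minimal sketch, not a definitive implementation: `cachedDescription`, the one-file-per-URL layout, and the one-day default lifetime are all assumptions made for illustration, and `$lookup` stands for whatever fetch-and-parse code you already have.

```php
<?php
// Hypothetical file cache: one small text file per URL inside $cacheDir.
// $lookup is any callable that takes a URL and returns its description.
function cachedDescription(string $url, callable $lookup, string $cacheDir, int $ttlSeconds = 86400)
{
    $file = rtrim($cacheDir, '/') . '/' . md5($url) . '.txt';

    // Fast path: reuse the stored description while it is younger than the TTL.
    if (is_file($file) && (time() - filemtime($file)) < $ttlSeconds) {
        return file_get_contents($file);
    }

    // Slow path: fetch and parse the page, then remember the result.
    $description = $lookup($url);
    if (is_string($description)) {
        file_put_contents($file, $description, LOCK_EX);
    }
    return $description;
}
```

Only the first request for a given URL pays the cost of the remote fetch; later requests within the lifetime read a local file. Failed lookups (anything that is not a string) are not stored, so they are retried on the next request.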

Get a page's description from its URL without slowing down page load
