Question

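The following regular expression works for most URLs. For a small number of URLs, however, it returns no title even though the page source contains one.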

$data = file_get_contents($url);
$title = get_title($data);
echo $title;
function get_title($html)
{
    return preg_match('!<title>(.*?)</title>!i', $html, $matches) ? $matches[1] : '';
}

Here is a demo: DEMO

Answer 1

As a solution to your problem, try the following code snippet:

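<?php
$url = 'http://www.indianic.com';

// Fetch the page with cURL instead of file_get_contents().
$ch = curl_init();
curl_setopt($ch, CURLOPT_URL, $url);
curl_setopt($ch, CURLOPT_HEADER, false);         // leave response headers out of the output
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);  // return the body as a string instead of printing it
$result = curl_exec($ch);
curl_close($ch);
// echo $result;

// curl_exec() returns false on failure, so fall back to an empty string.
$title = get_title($result === false ? '' : $result);
echo $title;

// Return the text between <title> and </title>, or '' if there is no match.
function get_title($html)
{
    return preg_match('!<title>(.*?)</title>!i', $html, $matches) ? $matches[1] : '';
}
?>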

Answer 2

Try this:

return preg_match('/<title[^>]*>(.*?)<\/title>/ims', $html, $matches) ? $matches[1] : '';

Tested and working:

$url = 'http://www.ndtv.com/';
$data = file_get_contents($url);
$title = preg_match('/<title[^>]*>(.*?)<\/title>/ims', $data, $matches) ? $matches[1] : '';
echo $title;

Output: NDTV.com: India, Business, Bollywood, Cricket, Video and Latest News
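
The two patterns in this thread differ in two ways that would explain the missing titles: the question's pattern only matches a bare <title> tag and its "." stops at newlines, while the pattern above accepts attributes via [^>]* and uses the "s" flag so "." also matches newlines. The sketch below compares them on made-up sample pages. The sample HTML, the function names get_title_strict and get_title_lenient, and the trim() call are assumptions added for illustration, not part of either answer.

```php
<?php
// Hypothetical sample pages; they stand in for HTML fetched from a real site.
$samples = [
    'plain'      => '<html><head><title>Plain</title></head></html>',
    'attribute'  => '<html><head><title id="t">With attribute</title></head></html>',
    'multi-line' => "<html><head><title>\n  Spans lines\n</title></head></html>",
];

// Pattern from the question: needs a bare <title> tag, and "." does not match newlines.
function get_title_strict(string $html): string
{
    return preg_match('!<title>(.*?)</title>!i', $html, $matches) ? $matches[1] : '';
}

// Pattern from Answer 2: [^>]* tolerates attributes, and the "s" flag lets "." cross newlines.
// trim() is an addition here; it drops the whitespace around a multi-line title.
function get_title_lenient(string $html): string
{
    return preg_match('/<title[^>]*>(.*?)<\/title>/ims', $html, $matches) ? trim($matches[1]) : '';
}

// Print both results side by side for each sample page.
foreach ($samples as $name => $html) {
    printf("%-10s strict=[%s] lenient=[%s]\n", $name, get_title_strict($html), get_title_lenient($html));
}
```

The "m" flag in the pattern only changes how ^ and $ behave, so it has no effect here; the "s" flag and [^>]* are what make the difference.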

Regular expression to extract the title from a web page's source
