Question

Hello, I'm writing a PHP script to extract video URLs from YouTube search results. This is what I have:

<?php
    error_reporting(1);

    function conseguir_codigo_url($url) {
        $dwnld = curl_init();
        curl_setopt($dwnld, CURLOPT_URL, $url);
        curl_setopt($dwnld, CURLOPT_HEADER, 0);
        // Define the User-Agent before passing it to cURL; it was previously undefined.
        $userAgent = 'Mozilla/4.0 (compatible; MSIE 6.01; Windows NT 6.0)';
        curl_setopt($dwnld, CURLOPT_USERAGENT, $userAgent);
        curl_setopt($dwnld, CURLOPT_RETURNTRANSFER, true);

        $fuente_url = curl_exec($dwnld);
        curl_close($dwnld);
        return $fuente_url;
    }

    function extraer_atributo_elemento($fuente) {
        $file = new DOMDocument;

        if($file->loadHTML($fuente) and $file->validate()){

            echo "DOCUMENTO";

            $file->getElementById("search-results");

        }
    }

    $codigo_url = conseguir_codigo_url("http://www.youtube.com/results?search_sort=video_date_uploaded&uni=3&search_type=videos&search_query=humor");
    extraer_atributo_elemento($codigo_url);
?>

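The problem is that I can't get getElementById to work; I think it may be because the page is HTML5. Do you have any suggestions for solving this? I need to parse the page source, and I don't know regular expressions, so DOMDocument is my only option.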

Answer 1

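Why are you using $file->validate()? If you only want to extract an element by its ID, you don't need to call it. Also, setting DOMDocument::recover to true before calling loadHTML may help when parsing broken HTML from the web.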
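
Here is a minimal sketch of that suggestion, not a tested fix for the live YouTube page. It assumes the failure comes from validate(), which checks the document against a DTD and is unlikely to pass for a page with an HTML5 doctype, so the `and` condition never becomes true. The helper name buscar_elemento_por_id and the sample markup are made up for illustration; the real page may not contain an element with the id "search-results" at all.

```php
<?php
// Hypothetical helper (not from the original post): parse possibly malformed
// HTML and return the element with the given id, or null if it is not found.
function buscar_elemento_por_id(string $fuente, string $id): ?DOMElement {
    // loadHTML() rejects an empty source, so bail out early.
    if ($fuente === '') {
        return null;
    }

    // Collect libxml parse warnings internally instead of printing them.
    $previo = libxml_use_internal_errors(true);

    $doc = new DOMDocument();
    $doc->recover = true;                // try to keep parsing broken markup
    $cargado = $doc->loadHTML($fuente);  // note: validate() is not called

    // Discard the collected warnings and restore the previous setting.
    libxml_clear_errors();
    libxml_use_internal_errors($previo);

    if (!$cargado) {
        return null;
    }
    return $doc->getElementById($id);
}

// Sample markup standing in for the downloaded page (an assumption).
$html = '<!DOCTYPE html><html><body><div id="search-results">'
      . '<a href="/watch?v=abc">Video</a></div></body></html>';

$elemento = buscar_elemento_por_id($html, 'search-results');
if ($elemento !== null) {
    // Print the href of every link inside the matched element.
    foreach ($elemento->getElementsByTagName('a') as $enlace) {
        echo $enlace->getAttribute('href'), "\n";
    }
}
```

In the original script, the string returned by conseguir_codigo_url() would be passed in place of $html. If the result is null, it is worth dumping the downloaded source to check whether the expected id is really present.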
