Google Drive API - Get document outline

Time: 2016-10-05 07:15:06

Tags: google-drive-api google-docs-api

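In Google Docs you can view and navigate a document's outline. I am trying to access this outline through the Google Drive API, but I can't find any documentation for it. This is my code so far: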

    //authenticate
    $this->authenticate();

    $Service = new Google_Service_Drive($this->Client);
    $File = $Service->files->get($FileID);

    return $File;

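I get the document object, but I can't find any function that returns the outline. I need the outline links so that my application can link to specific sections of the document. Any ideas on how to achieve this?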

2 Answers:

Answer 0: (score: 1)

File.get returns a file resource. A file resource is only the file's metadata: it holds information about the file as stored on Google Drive.

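You would need to load the file in some document application to find any outline links. The metadata does not contain anything about the data stored inside the file.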

Answer 1: (score: 0)

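I finally solved this; DaImTo pointed me in the right direction. After getting the file resource, I used it to get the export link for the HTML version of my document, and then used that link to retrieve the document's HTML content via Google_Http_Request. (This part is from the Google documentation.)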

public function retrive_file_outline($FileID) {
    //authenticate
    $this->authenticate();

    $Service = new Google_Service_Drive($this->Client);
    $File = $Service->files->get($FileID);

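    // The export links may be missing or lack an HTML entry; guard the lookup.
    $ExportLinks = $File->getExportLinks();
    $DownloadUrl = isset($ExportLinks["text/html"]) ? $ExportLinks["text/html"] : null;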

    if ($DownloadUrl) {
        $Request = new Google_Http_Request($DownloadUrl, 'GET', null, null);
        $HttpRequest = $Service->getClient()->getAuth()->authenticatedRequest($Request);
        if ($HttpRequest->getResponseHttpCode() == 200) {
            return array($File, $HttpRequest->getResponseBody());
        } else {
            // An error occurred.
            return null;
        }
    } else {
        // The file doesn't have any content stored on Drive.
        return null;
    }
}

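After that, I parsed the HTML content with DOMDocument. All headings have an id attribute that serves as an anchor link. I retrieved the ids of all headings (h1 to h6) and concatenated them with my document's edit URL. That gave me all the outline links. Here is the parsing and concatenation part: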

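public function test($FileID) {
    $File = $this->model_google->retrive_file_outline($FileID);
    if ($File === null) {
        // The export failed or the file has no HTML export link.
        return;
    }

    $DOM = new DOMDocument;
    // The exported HTML may not be well-formed; suppress parser warnings.
    @$DOM->loadHTML($File[1]);

    $TagNames = ["h1", "h2", "h3", "h4", "h5", "h6"];
    foreach($TagNames as $TagName) {
        $Items = $DOM->getElementsByTagName($TagName);
        foreach($Items as $Item) {
            $ID = $Item->attributes->getNamedItem("id");
            if ($ID === null) {
                // A heading without an id has no anchor to link to.
                continue;
            }
            // Escape both the URL and the title before echoing them as HTML.
            $Href = htmlspecialchars($File[0]->alternateLink . "#heading=" . $ID->nodeValue, ENT_QUOTES);
            echo "<a target='_blank' href='" . $Href . "'>" . htmlspecialchars($Item->nodeValue) . "</a><br />";
        }
    }
}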

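Edit: I merged the functions retrive_file_outline and test into retrive_file_outline, which gives me a single function that returns an array of the document's headings with their links and ids: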

public function retrive_file_outline($FileID) {
    //authenticate
    $this->authenticate();

    $Service = new Google_Service_Drive($this->Client);
    $File = $Service->files->get($FileID);

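    // The export links may be missing or lack an HTML entry; guard the lookup.
    $ExportLinks = $File->getExportLinks();
    $DownloadUrl = isset($ExportLinks["text/html"]) ? $ExportLinks["text/html"] : null;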

    if ($DownloadUrl) {
        $Request = new Google_Http_Request($DownloadUrl, 'GET', null, null);
        $HttpRequest = $Service->getClient()->getAuth()->authenticatedRequest($Request);
        if ($HttpRequest->getResponseHttpCode() == 200) {
            $DOM = new DOMDocument;
            // The exported HTML may not be well-formed; suppress parser warnings.
            @$DOM->loadHTML($HttpRequest->getResponseBody());

            $TagNames = ["h1", "h2", "h3", "h4", "h5", "h6"];
            $Headings = array();
            foreach($TagNames as $TagName) {
                $Items = $DOM->getElementsByTagName($TagName);
                foreach($Items as $Item) {
                    $ID = $Item->attributes->getNamedItem("id");
                    if ($ID === null) {
                        // A heading without an id has no anchor to link to.
                        continue;
                    }
                    $Heading = array(
                        "link" => $File->alternateLink . "#heading=" . $ID->nodeValue,
                        "heading_id" => $ID->nodeValue,
                        "title" => $Item->nodeValue
                    );

                    array_push($Headings, $Heading);
                }
            }

            return $Headings;
        } else {
            // An error occurred.
            return null;
        }
    } else {
        // The file doesn't have any content stored on Drive.
        return null;
    }
}
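
One caveat with looping over the tag names one at a time: the result is grouped by level (all h1 first, then all h2, and so on), not in the order the headings appear in the document. If an outline in reading order is wanted, a single XPath query can probably do it. The following is only a sketch that runs without the Drive API: the function name extract_headings_in_order, the "level" key, the sample HTML, the sample heading ids, and the FILE_ID placeholder URL are all made up for illustration, and the real exported HTML may look different.

```php
<?php
// Hypothetical helper (not part of the Google client library): collect the
// headings of an exported HTML document in document order.
function extract_headings_in_order(string $Html, string $BaseUrl): array {
    // Collect parser warnings internally instead of printing them.
    $Previous = libxml_use_internal_errors(true);
    $DOM = new DOMDocument;
    $DOM->loadHTML($Html);
    libxml_clear_errors();
    libxml_use_internal_errors($Previous);

    // One location path over all heading levels, so nodes come back in
    // document order; headings without an id are filtered out by [@id].
    $XPath = new DOMXPath($DOM);
    $Nodes = $XPath->query(
        '//*[self::h1 or self::h2 or self::h3 or self::h4 or self::h5 or self::h6][@id]'
    );

    $Headings = array();
    foreach ($Nodes as $Node) {
        $ID = $Node->getAttribute('id');
        $Headings[] = array(
            'level' => (int) substr($Node->nodeName, 1), // "h2" becomes 2
            'link' => $BaseUrl . '#heading=' . $ID,
            'heading_id' => $ID,
            'title' => trim($Node->textContent),
        );
    }
    return $Headings;
}

// Made-up sample standing in for the exported document body.
$SampleHtml = '<html><body>'
    . '<h1 id="h.intro">Intro</h1><p>text</p>'
    . '<h2 id="h.setup">Setup</h2>'
    . '<h1 id="h.usage">Usage</h1>'
    . '<h2>No anchor</h2>'
    . '</body></html>';

$Headings = extract_headings_in_order(
    $SampleHtml,
    'https://docs.google.com/document/d/FILE_ID/edit' // placeholder edit URL
);

// Print an indented outline, one heading per line.
foreach ($Headings as $Heading) {
    echo str_repeat('  ', $Heading['level'] - 1)
        . $Heading['title'] . ' -> ' . $Heading['link'] . "\n";
}
```

In the merged function, the same query could replace the nested foreach loops, with $File->alternateLink passed in as the base URL. The extra "level" value makes it possible to rebuild the nesting of the outline on the application side.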