Question

I am following this repository, https://github.com/DataSnaek/Trending-YouTube-Scraper, to scrape trending videos on YouTube.

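I have configured the country codes and the API key correctly. However, I want to add the video duration to my data file. I searched the YouTube API documentation and tried the following code (I added some code related to contentDetails and duration):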

...

# Used to identify columns, currently hardcoded order
header = ["video_id"] + snippet_features + ["trending_date", "tags", "length", "view_count", "likes", "dislikes",
                                            "comment_count", "thumbnail_link", "comments_disabled",
                                            "ratings_disabled", "description"]

...

def api_request(page_token, country_code):
    # Builds the URL and requests the JSON from it
    request_url = f"https://www.googleapis.com/youtube/v3/videos?part=id,statistics,contentDetails,snippet{page_token}chart=mostPopular&hl=vi&regionCode={country_code}&maxResults=50&key={api_key}"
    request = requests.get(request_url)
    if request.status_code == 429:
        print("Temp-Banned due to excess requests, please wait and continue later")
        sys.exit()
    return request.json()

...
        # Snippet, statistics and contentDetails are sub-dicts of video, containing the most useful info
        snippet = video['snippet']
        statistics = video['statistics']
        contentdetails = video['contentDetails']

...
        # The following are special case features which require unique processing, or are not within the snippet dict
        description = snippet.get("description", "")
        thumbnail_link = snippet.get("thumbnails", dict()).get("default", dict()).get("url", "")
        length = contentdetails.get("duration", "")
        trending_date = time.strftime("%y.%d.%m")
        tags = get_tags(snippet.get("tags", ["[none]"]))
        view_count = statistics.get("viewCount", 0)
...
        if 'duration' in contentdetails:
            length = contentdetails['duration']
        else:
            length = "0"

        # Compiles all of the various bits of info into one consistently formatted line
        line = [video_id] + features + [prepare_feature(x) for x in
                                        [trending_date, tags, length, view_count, likes, dislikes,
                                         comment_count, thumbnail_link, comments_disabled,
                                         ratings_disabled, description]]
        lines.append(",".join(line))
    return lines

...
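For context, the YouTube Data API returns `contentDetails.duration` as an ISO 8601 duration string such as `PT4M13S`, so the `length` column would hold that raw string. A minimal sketch of converting it to seconds is shown here; `duration_to_seconds` is a hypothetical helper that is not part of the repository, and it assumes only day, hour, minute and second components appear.

```python
import re

# Matches ISO 8601 durations of the form P[nD][T[nH][nM][nS]], e.g. "PT4M13S".
# Assumption: week, month and year components do not occur.
_DURATION_RE = re.compile(
    r"^P(?:(?P<days>\d+)D)?"
    r"(?:T(?:(?P<hours>\d+)H)?(?:(?P<minutes>\d+)M)?(?:(?P<seconds>\d+)S)?)?$"
)


def duration_to_seconds(duration):
    """Convert an ISO 8601 duration string to whole seconds (hypothetical helper)."""
    match = _DURATION_RE.match(duration or "")
    if match is None:
        return 0  # missing or unrecognised duration
    # Components that are absent from the string come back as None, so treat them as 0.
    parts = {name: int(value or 0) for name, value in match.groupdict().items()}
    return (parts["days"] * 86400 + parts["hours"] * 3600
            + parts["minutes"] * 60 + parts["seconds"])
```

It could be applied to the value read from `contentdetails` before the line is compiled, for example `length = duration_to_seconds(contentdetails.get("duration", ""))`.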

But the actual output is only the header:

video_id,title,publishedAt,channelId,channelTitle,categoryId,trending_date,tags,length,view_count,likes,dislikes,comment_count,thumbnail_link,comments_disabled,ratings_disabled,description
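A header-only file suggests that no video rows were produced, which can happen when the response contains no `items` (for instance, when the API returns an error object instead). One way to narrow this down is to inspect the decoded JSON before it is parsed. The sketch below is only an assumption about how that check might look; `describe_response` is a hypothetical helper, and it relies on the API's usual error shape with a top-level `error` key holding `code` and `message`.

```python
def describe_response(payload):
    """Summarise a decoded videos.list response (hypothetical helper)."""
    if "error" in payload:
        # Error responses carry a numeric code and a human-readable message.
        error = payload["error"]
        return f"API error {error.get('code')}: {error.get('message')}"
    # Successful responses list the videos under "items".
    items = payload.get("items", [])
    return f"{len(items)} items returned"
```

For example, printing `describe_response(api_request("&", country_code))` for one country would show whether the request itself is failing or simply returning no videos.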

Thank you very much!

How do I scrape the duration of YouTube video data?

0 answers: