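I am following this repository, https://github.com/DataSnaek/Trending-YouTube-Scraper, to scrape trending videos on YouTube.
I have configured the country codes and API key correctly. However, I want to add the video duration to my data file. I searched the YouTube API documentation and tried the following code (I added the parts related to contentDetails and duration):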
...
# Used to identify columns, currently hardcoded order
header = ["video_id"] + snippet_features + ["trending_date", "tags", "length", "view_count", "likes", "dislikes",
"comment_count", "thumbnail_link", "comments_disabled",
"ratings_disabled", "description"]
...
def api_request(page_token, country_code):
    # Builds the URL and requests the JSON from it
    request_url = f"https://www.googleapis.com/youtube/v3/videos?part=id,statistics,contentDetails,snippet{page_token}chart=mostPopular&hl=vi&regionCode={country_code}&maxResults=50&key={api_key}"
    request = requests.get(request_url)
    if request.status_code == 429:
        print("Temp-Banned due to excess requests, please wait and continue later")
        sys.exit()
    return request.json()
...
# Snippet, statistics and contentDetails are sub-dicts of video, containing the most useful info
snippet = video['snippet']
statistics = video['statistics']
contentdetails = video['contentDetails']
...
# The following are special case features which require unique processing, or are not within the snippet dict
description = snippet.get("description", "")
thumbnail_link = snippet.get("thumbnails", dict()).get("default", dict()).get("url", "")
length = contentdetails.get("duration", "")
trending_date = time.strftime("%y.%d.%m")
tags = get_tags(snippet.get("tags", ["[none]"]))
view_count = statistics.get("viewCount", 0)
...
if 'duration' in contentdetails:
    length = contentdetails['duration']
else:
    length = "0"
# Compiles all of the various bits of info into one consistently formatted line
line = [video_id] + features + [prepare_feature(x) for x in
                                [trending_date, tags, length, view_count, likes, dislikes,
                                 comment_count, thumbnail_link, comments_disabled,
                                 ratings_disabled, description]]
lines.append(",".join(line))
return lines
...
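For reference, the `duration` field in `contentDetails` is an ISO 8601 duration string (for example `PT4M13S`), so the `length` column would hold that raw string rather than a number. Below is a minimal sketch of how it could be converted to seconds. `duration_to_seconds` is a hypothetical helper that is not part of the repository, and it assumes only day, hour, minute and second components appear:

```python
import re

# Matches ISO 8601 durations of the form P#DT#H#M#S (e.g. "PT4M13S").
# Assumption: week, month and year components are not handled in this sketch.
_DURATION_RE = re.compile(
    r"^P(?:(?P<days>\d+)D)?"
    r"(?:T(?:(?P<hours>\d+)H)?(?:(?P<minutes>\d+)M)?(?:(?P<seconds>\d+)S)?)?$"
)


def duration_to_seconds(duration):
    """Convert an ISO 8601 duration string to whole seconds (hypothetical helper)."""
    match = _DURATION_RE.match(duration or "")
    if match is None:
        return 0  # missing or unrecognized value; mirrors the "0" fallback above
    # Components that are absent from the string come back as None, so default to 0.
    parts = {name: int(value or 0) for name, value in match.groupdict().items()}
    return (parts["days"] * 86400 + parts["hours"] * 3600
            + parts["minutes"] * 60 + parts["seconds"])
```

If used, it could be applied to `length` before the value is passed to `prepare_feature`, for example `length = duration_to_seconds(contentdetails.get("duration", ""))`.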
However, the actual output is only the header row:
video_id,title,publishedAt,channelId,channelTitle,categoryId,trending_date,tags,length,view_count,likes,dislikes,comment_count,thumbnail_link,comments_disabled,ratings_disabled,description
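Since only the header is written, no video rows are being produced, which would be consistent with the request returning an error or an empty `items` list. Below is a minimal sketch for checking this. It assumes the JSON returned by `api_request` follows the usual Google API error shape (`{"error": {"code": ..., "message": ...}}`), and `summarize_response` is a hypothetical helper, not something from the repository:

```python
def summarize_response(response_json):
    """Describe why an API response might yield no rows (hypothetical helper)."""
    # Assumption: failures are reported as {"error": {"code": ..., "message": ...}}.
    error = response_json.get("error")
    if error:
        return f"API error {error.get('code')}: {error.get('message')}"
    items = response_json.get("items", [])
    if not items:
        return "no items returned"
    # Collect the ids of videos whose contentDetails part is missing.
    missing = [video.get("id") for video in items if "contentDetails" not in video]
    if missing:
        return f"{len(missing)} of {len(items)} videos lack contentDetails"
    return f"{len(items)} videos with contentDetails"
```

Calling `print(summarize_response(api_request("&", country_code)))` once before the main loop might show whether the problem is in the request or in the later processing.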
Thank you very much!