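I'm using a Python script to scan a YouTube video, find the users who have commented on it, and write their usernames to a file.
I'm using the YouTube API, and when I print the entire response for comment_entry I can see the comment author.
Is there a way to isolate the username?
For example, entering 9bZkp7q19f0 (Gangnam Style) as the video_id produces (in the set of values for the first comment):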
<?xml version='1.0' encoding='UTF-8'?>
<ns0:entry xmlns:ns0="http://www.w3.org/2005/Atom" xmlns:ns1="http://gdata.youtube.com/schemas/2007"><ns0:category scheme="http://schemas.google.com/g/2005#kind" term="http://gdata.youtube.com/schemas/2007#comment" /><ns0:id>http://gdata.youtube.com/feeds/api/videos/9bZkp7q19f0/comments/LZQPQhLyRh9VQaVtT18UUKqLpyWBdytJ7B-JRTu0cf8</ns0:id><ns0:author><ns0:name>THAsweatyGamer</ns0:name><ns0:uri>https://gdata.youtube.com/feeds/api/users/THAsweatyGamer</ns0:uri></ns0:author><ns0:content type="text">sometimes but not always</ns0:content><ns0:updated>2013-05-17T12:30:27.000Z</ns0:updated><ns0:published>2013-05-17T12:30:27.000Z</ns0:published><ns0:title type="text">sometimes but not ...</ns0:title><ns0:link href="https://gdata.youtube.com/feeds/api/videos/9bZkp7q19f0?client=TJNP_YT_BOT" rel="related" type="application/atom+xml" /><ns0:link href="https://www.youtube.com/watch?v=9bZkp7q19f0" rel="alternate" type="text/html" /><ns0:link href="https://gdata.youtube.com/feeds/api/videos/9bZkp7q19f0/comments/LZQPQhLyRh-b_np-G6TRfbDU8xlXaRcR_qXeRfla_vo?client=TJNP_YT_BOT" rel="http://gdata.youtube.com/schemas/2007#in-reply-to" type="application/atom+xml" /><ns0:link href="https://gdata.youtube.com/feeds/api/videos/9bZkp7q19f0/comments/LZQPQhLyRh9VQaVtT18UUKqLpyWBdytJ7B-JRTu0cf8?client=TJNP_YT_BOT" rel="self" type="application/atom+xml" /><ns1:videoid>9bZkp7q19f0</ns1:videoid></ns0:entry>
I want to isolate <ns0:author><ns0:name>THAsweatyGamer</ns0:name><ns0:uri>https://gdata.youtube.com/feeds/api/users/THAsweatyGamer</ns0:uri></ns0:author>
so that I can write the username to a file. Using comment_entry.author gives:
[<atom.Author object at 0x02CE5B50>]
[<atom.Author object at 0x02CE5EB0>]
[<atom.Author object at 0x02CED230>]
[<atom.Author object at 0x02CED5B0>]
[<atom.Author object at 0x02CED910>]
[<atom.Author object at 0x02CEDCD0>]
[<atom.Author object at 0x02CF6070>]
[<atom.Author object at 0x02CF63D0>]
[<atom.Author object at 0x02CF6750>]
[<atom.Author object at 0x02CF6B10>]
[<atom.Author object at 0x02CF6E90>]
[<atom.Author object at 0x03591210>]
[<atom.Author object at 0x03591590>]
[<atom.Author object at 0x03591950>]
[<atom.Author object at 0x03591CD0>]
[<atom.Author object at 0x0359B050>]
[<atom.Author object at 0x0359B3D0>]
[<atom.Author object at 0x0359B750>]
[<atom.Author object at 0x0359BAD0>]
[<atom.Author object at 0x0359BE50>]
[<atom.Author object at 0x035A31D0>]
[<atom.Author object at 0x035A3530>]
[<atom.Author object at 0x035A3890>]
[<atom.Author object at 0x035A3BF0>]
My script (so far) is:
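import gdata.youtube
import gdata.youtube.service

# Set up the YouTube service client (Python 2, gdata library).
yt_service = gdata.youtube.service.YouTubeService()
yt_service.ssl = True
yt_service.developer_key = "mykey"  # placeholder for your developer key
yt_service.client_id = "myclientid"  # placeholder for your client ID
yt_service.source = "myclientid"  # placeholder for your client ID

video_id = raw_input("Enter the video's ID: ")
comment_feed = yt_service.GetYouTubeVideoCommentFeed(video_id=video_id)

# Print the author list of every comment in the feed.
for comment_entry in comment_feed.entry:
    print comment_entry.author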
Answer 0 (score: 0)
You need to use an XML parser to extract the data you're looking for. Here is a simple example using Python's ElementTree XML API:
import xml.etree.ElementTree as ET
tree = ET.parse('youtube.xml')
root = tree.getroot()
print root[2][0].text
print root[2][1].text
This gives you the following output:
THAsweatyGamer
https://gdata.youtube.com/feeds/api/users/THAsweatyGamer
Note: youtube.xml in the sample code above is a file containing the YouTube XML output you included in your question.
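The positional lookups above (root[2][0]) depend on the author element always being the third child of the entry. A slightly more defensive sketch is to look the elements up by their namespace-qualified tag names instead. The function name extract_author and the trimmed-down SAMPLE_ENTRY string below are my own illustrations, not part of any API; the namespace URI is the one shown in the XML from the question.

```python
import xml.etree.ElementTree as ET

# Atom namespace taken from the entry in the question (bound to ns0 there).
ATOM_NS = {"atom": "http://www.w3.org/2005/Atom"}


def extract_author(entry_xml):
    """Return (name, uri) for the first author of an Atom entry string.

    Hypothetical helper for illustration; returns (None, None) when the
    entry has no author element.
    """
    root = ET.fromstring(entry_xml)
    author = root.find("atom:author", ATOM_NS)
    if author is None:
        return None, None
    name = author.findtext("atom:name", default=None, namespaces=ATOM_NS)
    uri = author.findtext("atom:uri", default=None, namespaces=ATOM_NS)
    return name, uri


# Shortened copy of the entry from the question, kept only for the demo.
SAMPLE_ENTRY = (
    '<ns0:entry xmlns:ns0="http://www.w3.org/2005/Atom">'
    '<ns0:id>http://gdata.youtube.com/feeds/api/videos/9bZkp7q19f0/comments/'
    'LZQPQhLyRh9VQaVtT18UUKqLpyWBdytJ7B-JRTu0cf8</ns0:id>'
    '<ns0:author><ns0:name>THAsweatyGamer</ns0:name>'
    '<ns0:uri>https://gdata.youtube.com/feeds/api/users/THAsweatyGamer</ns0:uri>'
    '</ns0:author>'
    '<ns0:content type="text">sometimes but not always</ns0:content>'
    '</ns0:entry>'
)

name, uri = extract_author(SAMPLE_ENTRY)
print(name)
print(uri)
```

Because the lookup is by tag name, it keeps working if the order of the child elements changes. If you would rather skip XML parsing altogether, the gdata objects may expose the same data directly, for example something like comment_entry.author[0].name.text, but I have not verified that against your version of the library, so treat it as an assumption.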