Downloading YouTube video comments using Python

Time: 2018-06-10 21:37:58

Tags: python

"Using a JSON file"

Fetching YouTube video comments
import simplejson as json
from urllib.request import urlopen
import sys
import time
import csv
import os
import io
os.chdir(r'C:\Users\adity\Desktop\data science')
csvFile =open('test1.csv',"w")
#csvFile =open('test.tsv',"w")
#writer = csv.writer(csvFile,delimiter=',')
#writer.writerow('Comments')
csvFile.write("comments\n")
STAGGER_TIME = 1 

# open the url and the screen name 
# (The screen name is the screen name of the user for whom to return results for)
url = "https://www.googleapis.com/youtube/v3/commentThreads?key=AIzaSyCYkTUjKgFGcKDnkNQMgSBbb4obnqIzUEM&textFormat=plainText&part=snippet&videoId=Ye8mB6VsUHw&maxResults=100"

"This takes a Python object and dumps it to a JSON string representation of that object"

url1=urlopen(url)
#data = json.load(urllib2.urlopen(url))
result = json.load(url1)


# print the result
itemList= result.get("items")
length=len(itemList)

for i in range(0,length):
    results= (result["items"][i].get('snippet').get("topLevelComment").get('snippet').get("textDisplay")).encode("utf-8")
    print(results)
    results=results.replace(",", "")
    #print (result["items"][i].get('snippet').get("topLevelComment").get('snippet').get("textDisplay")).encode("utf-8")
    #writer.writerow((result["items"][i].get('snippet').get("topLevelComment").get('snippet').get("textDisplay")).encode("utf-8"))
    csvFile.write(results)
    csvFile.write('\n')
    time.sleep(STAGGER_TIME)

csvFile.close()

"Getting error: TypeError: a bytes-like object is required, not 'str'"

TypeError                                 Traceback (most recent call last)
<ipython-input-112-a5225431e178> in <module>()
     32         results= (result["items"][i].get('snippet').get("topLevelComment").get('snippet').get("textDisplay")).encode("utf-8")
     33         print(results)
---> 34         results=results.replace(",", "")
     35         #print (result["items"][i].get('snippet').get("topLevelComment").get('snippet').get("textDisplay")).encode("utf-8")
     36         #writer.writerow((result["items"][i].get('snippet').get("topLevelComment").get('snippet').get("textDisplay")).encode("utf-8"))

TypeError: a bytes-like object is required, not 'str'

1 Answer:

Answer 0 (score: 0)

results= (result["items"][i].get('snippet').get("topLevelComment").get('snippet').get("textDisplay")).encode("utf-8")

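The culprit here is the last part, .encode("utf-8"), which converts the string to bytes. That is fine until you call replace on the result with ordinary str arguments, because bytes.replace only accepts bytes-like arguments. Suggestions (pick whichever suits you best):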

Option 1: if you can, simply remove that part from the line:
results = result["items"][i].get('snippet').get("topLevelComment").get('snippet').get("textDisplay")

Option 2: add a decode before attempting the replace:

results = results.decode().replace(",", "")

Option 3: call replace with proper bytes arguments:

results = results.replace(b",", b"")
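The three options can be compared side by side in a small sketch. The sample comment text is made up for illustration and is not taken from the API response in the question.

```python
# Hypothetical sample comment; in the question this value comes from
# result["items"][i]["snippet"]["topLevelComment"]["snippet"]["textDisplay"].
raw = "Great video, thanks!"
encoded = raw.encode("utf-8")  # bytes, as in the question's code

# Reproduce the error: bytes.replace() rejects str arguments.
try:
    encoded.replace(",", "")
    error_message = None
except TypeError as exc:
    error_message = str(exc)

# Option 1: never encode, work on the str directly.
option1 = raw.replace(",", "")

# Option 2: decode the bytes back to str, then replace.
option2 = encoded.decode("utf-8").replace(",", "")

# Option 3: stay in bytes and pass bytes arguments to replace.
option3 = encoded.replace(b",", b"")

print(error_message)
print(option1, option2, option3)
```

Options 1 and 2 yield a str, while option 3 yields bytes, so with option 3 the later csvFile.write(results) call on a text-mode file would still raise a TypeError.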

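Option 1 is the ideal choice, since it is simpler and more compatible with the rest of the code (there is no need to convert to bytes in the first place; it serves no purpose that I can see).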
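As a rough sketch of option 1 applied to the whole loop, the comments can be kept as str and written with the csv module, which quotes commas itself so they do not have to be stripped. The helper names extract_comments and write_comments, and the sample dictionary, are hypothetical; the dictionary only mimics the nesting that the question's code reads from the commentThreads response.

```python
import csv
import io


def extract_comments(result):
    """Collect top-level comment texts (as str) from a parsed response dict."""
    comments = []
    for item in result.get("items", []):
        text = item["snippet"]["topLevelComment"]["snippet"]["textDisplay"]
        comments.append(text)  # no .encode(): keep the text as str
    return comments


def write_comments(comments, file_obj):
    """Write one comment per row; csv.writer quotes embedded commas."""
    writer = csv.writer(file_obj)
    writer.writerow(["comments"])
    for text in comments:
        writer.writerow([text])


# Hypothetical stand-in for json.load(urlopen(url)) from the question.
sample = {
    "items": [
        {"snippet": {"topLevelComment": {"snippet": {"textDisplay": "Great video, thanks!"}}}},
        {"snippet": {"topLevelComment": {"snippet": {"textDisplay": "First"}}}},
    ]
}

buffer = io.StringIO()  # stands in for a file opened in text mode
write_comments(extract_comments(sample), buffer)
output = buffer.getvalue()
print(output)
```

With a real file, opening it as open('test1.csv', 'w', newline='', encoding='utf-8') lets the csv module control line endings and handles the UTF-8 encoding at the file level, which is presumably what the original .encode("utf-8") was meant to achieve.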