The following code works until the last line:
writer.writerow([Title, Views, Likes, Dislikes, datetime.now()])
I get this error:
NameError                                 Traceback (most recent call last)
<ipython-input-45-1666dd7f773b> in <module>()
----> 1 writer.writerow([Title, Views, Likes, Dislikes, datetime.now()])
NameError: name 'Title' is not defined
Can anyone tell me what is going on? (I am using Python 2 with Jupyter.)
import csv
from datetime import datetime
from urllib2 import urlopen
from bs4 import BeautifulSoup
import pandas as pd
import re
import requests
myurl = 'https://www.youtube.com/results?search_query=sports'
page = requests.get("https://www.youtube.com/results?search_query=sports")
page.status_code
soup = BeautifulSoup(page.content, 'html.parser')
soup
link_list = []
for link in soup.select('div.yt-lockup-content a[href^=/watch]'):
    newLink = link.attrs.get('href')
    link_list.append(newLink)
print(link_list[0:6])
index = range(len(link_list))
columns = ['Links', 'Title', 'Views', 'Likes', 'Dislikes']
df = pd.DataFrame(index=index, columns=columns)
df['Links'] = link_list
df['Links'] = 'https://www.youtube.com/' + df['Links']
for i in range(len(link_list)):
    if i<5 or i>len(link_list)-5:
        print('{} out of {}'.format(i, len(link_list)))
    html = urlopen(df['Links'][i]).read()
    soup = BeautifulSoup(html.decode('utf-8', 'ignore'))
    df['Title'][i] = soup.title.get_text()
    df['Views'][i] = int(re.sub('[^0-9]', '', soup.select('.watch-view-count')[0].get_text().split()[0]))
    a = str(soup.find_all('button',
                          attrs={'title': 'I like this'})).replace(",", "")
    df['Likes'][i] = float(re.findall(r'\d+', a)[0])
    a = str(soup.find_all('button',
                          attrs={'title': 'I dislike this'})).replace(",", "")
    df['Dislikes'][i] = float(re.findall(r'\d+', a)[0])
print(df.sort_values('Views', ascending = False).head())
print(df.sort_values('Views', ascending = False))
with open('youtube.csv', 'a') as csv_file:
    writer = csv.writer(csv_file)
    writer.writerow([Title, Views, Likes, Dislikes, datetime.now()])
Answer 0 (score: 0)
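In the line
writer.writerow([Title, Views, Likes, Dislikes, datetime.now()])
you are using Title rather than "Title". Python reads the bare name as "go find the variable Title and put its value here", when what you actually want is to pass the string value "Title". No variable with that name was ever assigned, hence the NameError. You need to put Title, Views, Likes, and Dislikes all in quotes.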
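
To illustrate the difference, here is a minimal sketch. It is written to run on Python 3 rather than the Python 2 used in the question, and the sample row, the 'Timestamp' header name, and the in-memory buffer are all made up for illustration. It shows that the bare names raise NameError, that the quoted names are written as a literal header row, and, on the assumption that the scraped values are what should end up in the file, that the data rows have to be read from wherever the values are stored (in the question, the DataFrame df).

```python
import csv
import io
from datetime import datetime

# Hypothetical stand-in for the scraped rows; in the question these values
# live in the DataFrame columns df['Title'], df['Views'], and so on.
rows = [
    {'Title': 'Example video', 'Views': 1200, 'Likes': 30.0, 'Dislikes': 2.0},
]

buffer = io.StringIO()  # in-memory file so the sketch leaves nothing on disk
writer = csv.writer(buffer)

# A bare name is looked up as a variable, and no such variable was assigned,
# so the list cannot even be built and nothing is written.
try:
    writer.writerow([Title, Views, Likes, Dislikes, datetime.now()])
    bare_name_error = None
except NameError as exc:
    bare_name_error = str(exc)

# Quoted names are plain strings, so this writes a header row.
# 'Timestamp' is an assumed column name for the datetime value.
writer.writerow(['Title', 'Views', 'Likes', 'Dislikes', 'Timestamp'])

# The data rows come from the stored values, not from the column names.
for row in rows:
    writer.writerow([row['Title'], row['Views'], row['Likes'],
                     row['Dislikes'], datetime.now().isoformat()])

lines = buffer.getvalue().splitlines()
print(bare_name_error)
print(lines[0])
```

If the goal is simply to save the whole table, calling df.to_csv('youtube.csv') on the DataFrame from the question may be a shorter route than writing rows by hand.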