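I am using BeautifulSoup to scrape reviews. I have the scraping part working and am ready to write the results to a CSV file. Even after looking at many examples online, I still don't understand how to write to a CSV file. My scraping code is: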
import requests
from bs4 import BeautifulSoup

for i in range(0, 200, 5):
    url = "https://www.tripadvisor.com/Hotel_Review-g39143-d92240-Reviews-or" + str(i) + "-Hawthorn_Suites_by_Wyndham_Wichita_East-Wichita_Kansas"
    headers = {'User-Agent': 'Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/42.0.2311.90 Safari/537.36'}
    response = requests.get(url, headers=headers, verify=False).text
    soup = BeautifulSoup(response, "lxml")
    reviews = soup.find_all('div', 'reviewSelector')
    for r in reviews:
        print("Rating: ", int(r.find('span','ui_bubble_rating')['class'][1].split('_')[1])/10)
        print("Review snippet: ", r.p.text)
To write to CSV, I tried wrapping the code in a csv.writer:
import csv

with open('TA-reviews.csv', 'w', newline='') as csvfile:
    writer = csv.writer(csvfile, delimiter=',', quotechar='"')
    for i in range(0, 200, 5):
        url = "https://www.tripadvisor.com/Hotel_Review-g39143-d92240-Reviews-or" + str(i) + "-Hawthorn_Suites_by_Wyndham_Wichita_East-Wichita_Kansas"
        headers = {'User-Agent': 'Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/42.0.2311.90 Safari/537.36'}
        response = requests.get(url, headers=headers, verify=False).text
        soup = BeautifulSoup(response, "lxml")
        reviews = soup.find_all('div', 'reviewSelector')
        for r in reviews:
            print("Rating: ", int(r.find('span','ui_bubble_rating')['class'][1].split('_')[1])/10)
            print("Review snippet: ", r.p.text)
            writer.writerow((rating, review))
This returns an error that rating is not defined, but it does print out one rating.
Answer 0 (score: 1)
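"returns an error that rating is not defined"
Of course rating is not defined. Where in your code is there a statement that binds anything to the name rating?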
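"but prints out one rating"
What you printed is the value of the expression int(r.find('span','ui_bubble_rating')['class'][1].split('_')[1])/10. Printing it does not define any rating variable, so the exception is raised when writer.writerow((rating, review)) runs for the first review, right after that first rating has been printed. The name review is never bound either.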
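To illustrate the difference, here is a minimal sketch that needs no network access. It assumes the span's class list looks like `["ui_bubble_rating", "bubble_45"]`; `sample_classes`, `print_only`, and `bind_then_print` are hypothetical names made up for this example, not part of the question's code.

```python
# Hypothetical stand-in for r.find('span', 'ui_bubble_rating')['class'].
sample_classes = ["ui_bubble_rating", "bubble_45"]


def print_only(classes):
    # Printing evaluates the expression, but no name is bound to the result.
    print("Rating: ", int(classes[1].split("_")[1]) / 10)
    try:
        return rating  # never assigned anywhere, so this raises NameError
    except NameError as exc:
        return "NameError: " + str(exc)


def bind_then_print(classes):
    # Bind the value to a name first; it can then be printed and reused.
    rating = int(classes[1].split("_")[1]) / 10
    print("Rating: ", rating)
    return rating


error_message = print_only(sample_classes)
bound_rating = bind_then_print(sample_classes)
```

The first function reproduces the situation in the question: the rating is printed, and the later use of the name fails.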
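You want something like this inside the for r in reviews loop:
rating = int(r.find('span','ui_bubble_rating')['class'][1].split('_')[1])/10
review = r.p.text
writer.writerow((rating, review))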
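Putting it together, the following is a rough sketch of the bind-then-write pattern, separated from the scraping so it can be run offline. The data in `sample_reviews` is invented for illustration, `write_reviews` is a hypothetical helper, and the header row is an optional addition that the question did not ask for. In the real scraper, the class list and text would come from `r.find('span','ui_bubble_rating')['class']` and `r.p.text`.

```python
import csv
import io

# Invented sample data: (class list of the rating span, review text).
sample_reviews = [
    (["ui_bubble_rating", "bubble_45"], "Clean rooms, friendly staff."),
    (["ui_bubble_rating", "bubble_20"], 'Noisy, and the "suite" was small.'),
]


def write_reviews(rows, csvfile):
    """Write one (rating, review) row per review to an open text file."""
    writer = csv.writer(csvfile, delimiter=",", quotechar='"')
    writer.writerow(("rating", "review"))  # optional header row
    for classes, text in rows:
        # Bind both values to names before passing them to writerow.
        rating = int(classes[1].split("_")[1]) / 10
        review = text
        writer.writerow((rating, review))


# An in-memory buffer stands in for the file here.
buffer = io.StringIO(newline="")
write_reviews(sample_reviews, buffer)
csv_text = buffer.getvalue()
```

To write to disk instead, the same function should work with the file object from the question: `with open('TA-reviews.csv', 'w', newline='') as csvfile: write_reviews(sample_reviews, csvfile)`. The csv module takes care of quoting the commas and double quotes inside the review text.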