Scraping and saving TripAdvisor reviews - using BeautifulSoup, get_text() on Python 3

Time: 2016-03-13 15:06:11

Tags: python beautifulsoup

Happy Sunday!

Disclaimer: I'm very new to Python (I've only spent two weeks learning the ropes).

Right, so I used BeautifulSoup to build a small piece of code that scrapes the review content from a TripAdvisor page and saves it to a text file.

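My problem is that when I print the results, all the reviews are shown. However, when I try to save them to a local text file, only the first review is saved.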

Here is the code I have so far:

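#imports the snippet relies on
import requests
from bs4 import BeautifulSoup
from unidecode import unidecode

#prompt for URL of the page to scrape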
print "                                Paste url here"
importurl = raw_input()
print "                                Import of:"
print "                                %s " %importurl
#convert the page into a soup
r = requests.get(importurl)
soup = BeautifulSoup(r.content, "lxml")
#look for the partial entry of the review
resultsoup = soup.find_all("p", {"class" : "partial_entry"})
#save the reviews to a test text file locally
for review in resultsoup:
    review_list = review.get_text()
    print review_list
    with open('testreview.txt', 'w') as fid:
        fid.write(unidecode(review_list))

1 Answer:

Answer 0: (score: 0)

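You are reopening the file in 'w' mode on every iteration of the loop, which truncates it each time, so every review overwrites the one before it. Move the context manager before the loop so the file is opened only once: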

with open('testreview.txt', 'w') as fid: 
    for review in resultsoup:
        review_list = review.get_text()
        fid.write(unidecode(review_list))
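
To see the difference without hitting the network, here is a minimal Python 3 sketch that uses only the standard library. The `reviews` list, the two helper functions, and the file names are hypothetical stand-ins for the strings returned by `review.get_text()`; they are not part of the original script. It also writes a newline after each review, which is an assumption about the desired output format.

```python
import os
import tempfile

# Hypothetical stand-ins for the strings returned by review.get_text()
reviews = ["Great location.", "Room was noisy.", "Would stay again."]


def write_inside_loop(path, items):
    # Reopening with mode 'w' truncates the file on every iteration,
    # so only the most recently written item survives.
    for item in items:
        with open(path, "w", encoding="utf-8") as fid:
            fid.write(item + "\n")


def write_outside_loop(path, items):
    # Open the file once, then write every item through the same handle.
    with open(path, "w", encoding="utf-8") as fid:
        for item in items:
            fid.write(item + "\n")


def read_lines(path):
    # Read the file back as a list of lines without trailing newlines.
    with open(path, encoding="utf-8") as fid:
        return fid.read().splitlines()


tmpdir = tempfile.mkdtemp()
buggy_path = os.path.join(tmpdir, "buggy.txt")
fixed_path = os.path.join(tmpdir, "fixed.txt")

write_inside_loop(buggy_path, reviews)
write_outside_loop(fixed_path, reviews)

buggy_lines = read_lines(buggy_path)
fixed_lines = read_lines(fixed_path)
print(buggy_lines)  # a single review
print(fixed_lines)  # all three reviews
```

Opening the file with mode 'a' (append) inside the loop would also keep every review, at the cost of reopening the file on each iteration and of leftover content accumulating across runs.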