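I am using this function to search a website to see whether an item I am interested in is on sale. It first fetches the HTML from the page, then searches for the item I am interested in. When it finds the item, it adds the lines that follow it (the number of lines is given by rangenum) to the variable 'endresult'. It then searches endresult for the keyword ("sale"), and at that point I want to be told whether the keyword is present.
When I print endresult, the output contains the keyword, yet the if statement at the end of the function always reports "keyword is missing", and I cannot work out why.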
    def bargainscraper(self, website, item, keyword,rangenum):
        request = urllib.request.Request(website)
        response = urllib.request.urlopen(request)
        data = response.read()
        html = str(data)
        data1 = html2text.html2text(html)
        fw = open('result1.txt', 'w')
        fw.write(str(data1))
        fw.close()
        with open('result1.txt', 'r') as f:
            for line in f:
                if item in line:
                    for x in range(rangenum):
                        endresult = str(f.readline())
                        print (endresult)
                    if keyword in endresult:
                        print("keyword is present")
                    else:
                        print("keyword is missing")
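Answer 0 (score: 0)
You may need to concatenate onto endresult rather than overwriting it on every pass:
    endresult += str(f.readline())
Note the "+" before the "=". For this to work, endresult has to be initialized to an empty string before the loop.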
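To illustrate the difference, here is a minimal sketch of the accumulating version. The helper name collect_following_lines and the sample text are made up for this example, and io.StringIO stands in for the result1.txt file so the snippet runs on its own.

```python
import io


def collect_following_lines(f, item, rangenum):
    """Return the rangenum lines that follow the first line containing item."""
    endresult = ""  # start empty so that += has something to extend
    for line in f:
        if item in line:
            for _ in range(rangenum):
                endresult += f.readline()  # append; a plain = keeps only the last line
            break
    return endresult


# Hypothetical sample text standing in for the contents of result1.txt
sample = io.StringIO("header\nblue widget\nnow on sale\nprice 10\nfooter\n")
endresult = collect_following_lines(sample, "widget", 2)

if "sale" in endresult:
    print("keyword is present")
else:
    print("keyword is missing")
```

With a plain = in place of +=, endresult would hold only the last line read ("price 10"), which is why the check in the question fails even though the keyword was printed along the way.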
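Answer 1 (score: 0)
I found that writing endresult to a file inside the for loop, and then searching that file for the keyword outside the for loop, was the answer I was looking for: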
    def bargainscraper(self, website, item, keyword,rangenum):
        request = urllib.request.Request(website)
        response = urllib.request.urlopen(request)
        data = response.read()
        html = str(data)
        data1 = html2text.html2text(html)
        fw = open('result1.txt', 'w')
        fw.write(str(data1))
        fw.close()
        with open('result1.txt', 'r') as f:
            for line in f:
                if item in line:
                    for x in range(rangenum):
                        endresult = str(f.readline())
                        # the 'a' switch is used to append
                        with open('result2.txt', 'a') as n:
                            n.write(endresult)
                    # This is outside of the for loop as otherwise it will iterate for each line of the rangenum
                    if keyword in open('result2.txt').read():
                        print ("keyword is present")
                    else:
                        print ("keyword is missing")
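One caveat with this approach: because result2.txt is opened in 'a' mode and never cleared, lines written on earlier runs stay in the file, so a later run could report the keyword as present when it only appeared previously. The sketch below is one possible way around that, not part of the original answer: it writes the collected lines with 'w', which truncates the file first. The helper name keyword_in_saved_lines and the sample lines are hypothetical, and a temporary directory is used so the example leaves no files behind.

```python
import os
import tempfile


def keyword_in_saved_lines(lines, keyword, path):
    """Save lines to path, replacing any old content, then search the file."""
    # 'w' truncates the file, so each call starts from an empty result file
    with open(path, "w") as out:
        out.writelines(lines)
    with open(path) as saved:
        return keyword in saved.read()


with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "result2.txt")
    # First run: the collected lines mention the keyword
    first = keyword_in_saved_lines(["now on sale\n", "price 10\n"], "sale", path)
    # Second run: they do not, and the old content must not leak through
    second = keyword_in_saved_lines(["full price\n", "price 20\n"], "sale", path)

print(first, second)
```

Had the file been opened with 'a' in both calls, the second check would also find "sale", left over from the first run.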