I'm having trouble printing the items of one list that are not in a second list.
Here is the code:
import requests
from bs4 import BeautifulSoup

def getInitialList():  # Build the initial list with requests and BeautifulSoup; returns a list
    getInHtml = requests.get("http://127.0.0.1")
    parseInHtml = BeautifulSoup(getInHtml.content, "html.parser")
    processInHtml = parseInHtml.find_all("div", class_="inner-article")
    firstList = []
    for items in processInHtml:
        firstList.append(items)
    return firstList

def getSecList():  # Build the second list with requests and BeautifulSoup; returns a list
    getHtml = requests.get("http://127.0.0.1")
    parseHtml = BeautifulSoup(getHtml.content, "html.parser")
    processHtml = parseHtml.find_all("div", class_="inner-article")
    secList = []
    for items in processHtml:
        secList.append(items)
    return secList

def catch_new_item():
    initList = getInitialList()
    while True:
        if initList == getSecList():
            print("No new items")
        else:
            print("New items found")
            break
    secList = getSecList()
    return set(secList) - set(initList)
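The last function (catch_new_item()) should return the items that are in secList but not in initList, but when I run it, it returns an empty set.
The address 127.0.0.1 is a local web server that I run to test the difference between the two lists. All I do is edit the HTML and add one element to it.
Any ideas?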
Answer 0 (score: 0)
To debug it, I modified the code like this:
def getInitialList():  # Hard-coded stand-in for the fetched initial list
    firstList = ['1', '2', '3']
    return firstList

def getSecList():  # Hard-coded stand-in for the fetched second list
    secList = ['a', 'b', '3', '1']
    return secList

def catch_new_item():
    initList = getInitialList()
    while True:
        if initList == getSecList():
            print("No new items")
        else:
            print("New items found")
            break
    secList = getSecList()
    return set(secList) - set(initList)

print(catch_new_item())
It prints (the order of the elements inside the set may vary):
New items found
{'a', 'b'}
So the item-detection logic itself is fine.
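Since the set difference works on plain strings, it can help to keep the comparison separate from the fetching so it can be checked with fixed data. The following is only a minimal sketch under some assumptions: `find_new_items` and `wait_for_new_items` are hypothetical helper names that do not appear in the question, and `fetch` stands for any callable that returns a list of strings (for example, a wrapper around `getSecList()` that converts each tag to `str`). Unlike the original loop, it fetches only once per poll and reuses that result for the difference, and it pauses between polls instead of spinning.

```python
import time
from typing import Callable, Iterable, List


def find_new_items(old: Iterable[str], new: Iterable[str]) -> List[str]:
    """Return the items of `new` that are absent from `old`, keeping their order."""
    seen = set(old)
    return [item for item in new if item not in seen]


def wait_for_new_items(fetch: Callable[[], List[str]],
                       interval: float = 1.0,
                       max_polls: int = 10) -> List[str]:
    """Poll `fetch` until it returns something missing from the first snapshot.

    Returns an empty list if nothing new shows up within `max_polls` polls.
    """
    baseline = fetch()
    for _ in range(max_polls):
        current = fetch()  # fetched once, used for both the check and the diff
        added = find_new_items(baseline, current)
        if added:
            return added
        time.sleep(interval)
    return []


# Usage with stand-in data instead of a real web server (hypothetical values).
snapshots = iter([["1", "2", "3"], ["1", "2", "3"], ["a", "b", "3", "1"]])
print(wait_for_new_items(lambda: next(snapshots), interval=0))
```

With the stand-in snapshots above, the second poll sees the changed list and the call returns the two added items. Whether this resolves the empty result in the question depends on what the server actually returns between the two requests, which the sketch does not address.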