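I'm writing a script that scrapes a page and finds the names of adoptable dogs. I'm able to extract the names and append them to a list. However, I can't get the code to run continuously so that new names are appended to a new list and removed from the old list. I was wondering if someone could help me with this.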
import requests
from bs4 import BeautifulSoup
import re
import time
from twilio.rest import Client
url = 'http://petharbor.com/results.asp?searchtype=ADOPT&start=3%20&friends=1&samaritans=1&nosuccess=0&rows=25&imght=200&imgres=thumb&tWidth=200&view=sysadm.v_chmp&bgcolor=b7b7b7&text=ffffff&link=ffffff&alink=4400ff&vlink=ffffff&fontface=arial&fontsize=12&col_hdr_bg=000066&col_hdr_fg=ffffff&SBG=000066&zip=61802&miles=10&shelterlist=%27CHMP%27&atype=&where=type_DOG&PAGE=1'
response = requests.get(url)
html = response.content
account_sid = ("XXXXXXXXXXXXXXXXXXXXXXXXXXX")
auth_token = ("XXXXXXXXXXXXXXXXXXXXXXXXXXXX")
client = Client(account_sid, auth_token)
soup = BeautifulSoup(html, 'html.parser')
names = soup.find_all(text=re.compile("My name is(.*)"))
def check():
    old = []
    new = []
    newest = []
    for name in names:
        name = name.title()
        if name not in old:
            old.append(name[11:-2])
        if name in old:
            continue
    for name in names:
        name = name.title()
        if name in old:
            continue
        if name not in new and name not in old:
            new.append(name[11:-2])
        if name not in new and name in old:
            new.append(name[11:-2])
            old.remove(name)
        if name in new and name in old:
            old.remove(name)
            new.remove(name)
    for name in names:
        name = name.title()
        if name in old or name in new:
            continue
        if name not in old and name not in new:
            newest.append(name[11:-2])
    num_old = len(old)
    num_new = len(new)
    num_newest = len(newest)
    print("Old List: " + str(old))
    print("Number of dogs in the old list: " + str(num_old))
    print("New List: " + str(new))
    print("Number of new dogs: " + str(num_new))
    print("Newest List: " + str(newest))
    print("Number of newest dogs: " + str(num_newest))
    #client.api.account.messages.create(to = "+XXXXXXXXXX",
    #from_= "+XXXXXXXXXX",
    #body = "Here are some new dogs:" + str(new))
    #client.api.account.messages.create(to="+XXXXXXXXXX",
    #from_="+XXXXXXXXXX",
    #body=("There are " + str(num_newest) + " new puppies"), media_url = 'http://petharbor.com/results.asp?searchtype=ADOPT&start=3%20&friends=1&samaritans=1&nosuccess=0&rows=25&imght=200&imgres=thumb&tWidth=200&view=sysadm.v_chmp&bgcolor=b7b7b7&text=ffffff&link=ffffff&alink=4400ff&vlink=ffffff&fontface=arial&fontsize=12&col_hdr_bg=000066&col_hdr_fg=ffffff&SBG=000066&zip=61802&miles=10&shelterlist=%27CHMP%27&atype=&where=type_DOG&PAGE=1')
    #client.api.account.messages.create(to = "+XXXXXXXXXX",
    #from_= "+XXXXXXXXXX",
    #body = "Here are some new names:" + str(newest))

while True:
    check()
    time.sleep(20)
This is the current output:
Old List: ['Pretty', 'Celia', 'Khloe', 'Duke', 'Evangeline', 'Thelma', 'Clara', 'Carly', 'Camille', 'Maxine', 'Jupiter', 'Pixie', 'Smiley', 'Mia', 'Pogo', 'Rosco', 'Clark', 'Ellie', 'Marcy', 'Jimmy', 'Willie', 'Layla']
Number of dogs in the old list: 22
New List: ['Pretty', 'Celia', 'Khloe', 'Duke', 'Evangeline', 'Thelma', 'Clara', 'Carly', 'Camille', 'Maxine', 'Jupiter', 'Pixie', 'Smiley', 'Mia', 'Pogo', 'Rosco', 'Clark', 'Ellie', 'Marcy', 'Jimmy', 'Willie', 'Layla']
Number of new dogs: 22
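A likely reason both lists come out identical: each membership test compares the full title-cased string, while the lists only ever hold the sliced name, so `name in old` is never true and every name gets appended to every list. The sketch below reproduces that with two made-up sample strings (the exact text on the page is an assumption):

```python
# Hypothetical sample strings; the real page text may differ.
names = ["My name is Pretty. ", "My name is Celia. "]

old, new = [], []

for raw in names:
    full = raw.title()              # e.g. "My Name Is Pretty. "
    if full not in old:             # always true: old holds "Pretty", not the full string
        old.append(full[11:-2])

for raw in names:
    full = raw.title()
    if full not in new and full not in old:   # also always true, for the same reason
        new.append(full[11:-2])

print(old)
print(new)
```

Both prints show the same list, which matches the output above.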
I am trying to get it to update to:
Old List: ['Pretty', 'Celia', 'Khloe', 'Duke', 'Evangeline', 'Thelma', 'Clara', 'Carly', 'Camille', 'Maxine', 'Jupiter', 'Pixie', 'Smiley', 'Mia', 'Pogo', 'Rosco', 'Clark', 'Ellie', 'Marcy', 'Jimmy', 'Willie', 'Layla']
Number of dogs in the old list: 22
New List: ['Max', 'Charlie']
Number of new dogs: 2
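One possible way to get this kind of update is to keep the set of already-seen names between polls and take a set difference on each round. This is only a sketch under assumptions: `extract_names`, `diff_names`, and `NAME_RE` are hypothetical helpers, the sample strings are made up, and the pattern assumes each name is a single word.

```python
import re

# Assumed pattern: a single-word name following "My name is".
NAME_RE = re.compile(r"My name is\s+(\w+)", re.IGNORECASE)

def extract_names(strings):
    """Return the set of dog names found in the given text snippets."""
    found = set()
    for s in strings:
        match = NAME_RE.search(s)
        if match:
            found.add(match.group(1).title())
    return found

def diff_names(known, current):
    """Return (new_names, updated_known) for one polling round."""
    new_names = sorted(current - known)   # names not seen in earlier rounds
    return new_names, known | current

# Two simulated polls with made-up page text.
first_poll = ["My name is Pretty. ", "My name is Celia. "]
second_poll = first_poll + ["My name is Max. ", "My name is Charlie. "]

known = extract_names(first_poll)
new_names, known = diff_names(known, extract_names(second_poll))
print(new_names)  # ['Charlie', 'Max']
```

In the actual script, the `requests.get(url)` and `soup.find_all(...)` calls would need to run inside the loop so that each round sees fresh HTML, and `known` would have to live outside `check()` so it is not reset on every call.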
I'll try to restate what I want to do: