在一些在线评论(Bucky)的帮助下,我设法编写了一个简单的网络刮刀,只检查网页上是否有某些文字。但是,我想要做的是让代码每小时运行一次。我假设我还需要托管代码,所以它这样做?我已经完成了一些研究,但似乎无法找到每小时运行它的正确方法。 这是我到目前为止的代码:
import requests
from bs4 import BeautifulSoup
def odeon_spider(max_pages):
page = 1
while page <= max_pages:
url = "http://www.odeon.co.uk/films/rogue_one_a_star_wars_story/16038/" + str(page) #stores url in variable
source_code = requests.get(url) #gets url and sets it as source_code variable
plain_text = source_code.text #stores plain text in plain_text variable
soup = BeautifulSoup(plain_text, "lxml") #create beautifulsoup object
div_content = soup.findAll("div", {"class": "textComponent"}) #finds all divs with specific class
for x in div_content:
find_para = str(x.find('p').text) #finds all paragraphs and stores them in variable
text_to_search = "Register to be notified" #set text to search to variable
if text_to_search in find_para: #checks if text is in find_para
print("No tickets")
else:
print("Tickets")
page += 1
odeon_spider(1)
谢谢!
答案 0 :(得分:4)
最简单的方法是:
import time
while True:
call_your_function()
time.sleep(3600)
如果您想在Linux上执行此操作,只需键入
即可nohup python -u your_script_name &
然后你的脚本将作为一个进程运行。(如果你没有杀死它,它只是在没有挂断的情况下继续运行。)