I'm trying to make this function run again if it doesn't find the information on the page. I thought the code below would be a solution, but it doesn't work, and I'm not sure how to implement a retry loop for scraping with a simple function. I tried the retry module, but I ran into problems installing it, so a hardcoded solution would be ideal.
My code is below:

import time, requests, webbrowser, sys, os, re, json
from bs4 import BeautifulSoup
from colorama import Fore, Back, Style, init
import subprocess as s

url = "http://notimportant.com"
r = requests.get(url)
soup = BeautifulSoup(r.content, "html.parser")
scripts = soup.find_all("script")  # 'scripts' was never defined in the original

def getIds():
    global product_id
    for script in scripts:
        if 'spConfig =' in script.getText():
            regex = re.compile(r'var spConfig = new Product.Config\((.*?)\);')
            match = regex.search(script.getText())
            spConfig = json.loads(match.groups()[0])
            for key, attribute in spConfig['attributes'].items():  # iteritems() is Python 2 only
                for option in attribute['options']:
                    if option['label_uk'] == size:  # 'size' is defined elsewhere
                        label = option['label_uk'].strip()
                        for product_id in option['products']:
                            print(Fore.CYAN + "Size Found!")
                            print(product_id, "-", label)
        else:
            print(Fore.RED + "Sizes not live yet")
            print("Retrying in 10 seconds . . .")
            time.sleep(10)
            print("Trying again. . .")
            getIds()
Answer 0 (score: 0)
Iteration would be the preferred approach here, something like:
url = "http://notimportant.com"
size_alive = False  # lowercase 'false' in the original is a NameError in Python
while not size_alive:
    # the function should return True when it finds 'spConfig =' in script.getText()
    size_alive = do_the_scraping_function()
    if not size_alive:
        print("retrying in 10 seconds")
        time.sleep(10)
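To make the loop idea above concrete, here is a minimal, self-contained sketch. `find_size`, `retry_until_found`, and the attempt cap are all illustrative names, not from the original code; the stand-in scraper simply pretends the size goes live on the third attempt. A loop like this also avoids Python's recursion limit (about 1000 frames by default), which the recursive `getIds()` would eventually hit if the sizes never go live.

```python
import time

def find_size(attempt):
    # Stand-in for the real scraping step: pretend the size goes live on the 3rd try.
    # In the real code this would fetch the page and search for 'spConfig ='.
    return "12345" if attempt >= 3 else None

def retry_until_found(max_attempts=5, delay=0):
    """Retry a scraping step in a loop instead of recursing."""
    for attempt in range(1, max_attempts + 1):
        product_id = find_size(attempt)
        if product_id is not None:
            return product_id
        print("retrying in 10 seconds")
        time.sleep(delay)  # use delay=10 for real scraping
    return None  # gave up after max_attempts

product_id = retry_until_found(max_attempts=5, delay=0)
print(product_id)  # 12345
```

Capping the number of attempts is optional but keeps the script from looping forever if the page never changes; drop the cap (or raise it) if you genuinely want to poll indefinitely.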