Question

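I have a text file with one number in it. I'm trying to append that number to a URL, fetch the information, then move on to the next number in sequence, fetch its information, and so on. If a number returns a blank page, the sequence should end and the collected information should be sent by email. I don't get any errors. The program runs to completion, but nothing comes back, and the number in the text file doesn't change. I'm curious whether this part of the program is correct or what I'm missing.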

Here's what I've got.

URL_FORMAT = 'http://www.wvlabor.com/new_searches/contractor_RESULTS.cfm?wvnumber=WV{}&contractor_name=&dba=&city_name=&County=&Submit3=Search+Contractors'

import requests
from bs4 import BeautifulSoup as bs

# Loads the page for a given license number
def get_page(license_number):
    url = URL_FORMAT.format(license_number)
    r = requests.get(url)
    return bs(r.text, 'lxml')

# A blank (no-license) page has no <td class="style3"> cell
def license_exists(soup):
    return soup.find('td', class_='style3') is not None

# Reads the current license number from the LICENSE_NUMBER_FILE text file
def get_current_license_number():
    with open(LICENSE_NUMBER_FILE, 'r') as f:
        return int(f.read())

# Adds license numbers to URLs
def get_new_license_pages(curr_license_num):
    new_pages = []
    more = True
    curr_license_num += 1

    return new_pages
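
As posted, get_new_license_pages never calls get_page or license_exists, so it always returns an empty list, and nothing ever writes the number back to the file. A minimal sketch of the loop described above might look like the following. The names collect_new_license_pages, save_license_number, and the max_pages safety cap are hypothetical and not part of the original program; the fetch and check functions are passed in as parameters so the sketch can be tried without hitting the site.

```python
def collect_new_license_pages(curr_license_num, get_page, license_exists,
                              max_pages=1000):
    """Fetch consecutive license pages until one comes back blank.

    get_page and license_exists are the question's helpers, passed in as
    callables. max_pages is an assumed safety cap, not in the original.
    Returns (pages, last_license_number_that_had_a_page).
    """
    new_pages = []
    last_found = curr_license_num
    while len(new_pages) < max_pages:
        soup = get_page(last_found + 1)   # try the next number in sequence
        if not license_exists(soup):      # blank page: end the sequence
            break
        new_pages.append(soup)
        last_found += 1
    return new_pages, last_found


def save_license_number(path, number):
    """Write the last license number found back to the text file."""
    with open(path, 'w') as f:
        f.write(str(number))
```

In the program above this could be called as pages, last = collect_new_license_pages(get_current_license_number(), get_page, license_exists), followed by save_license_number(LICENSE_NUMBER_FILE, last), with the email step working from pages.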

Building sequential URLs with BeautifulSoup

0 answers: