Question

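I'm working on a project that involves web scraping in Python, using the BeautifulSoup and Selenium libraries. In my script, I collect all the anchor elements that have a given link text.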

link_click = driver.find_elements_by_link_text(
    'Fall/Winter 2017-2018 Course Schedule')

Then I check how many links are actually in the list I created.

len(link_click)

Here is the function I need to run later.

import requests
from bs4 import BeautifulSoup

def get_course_info():
    # Re-fetch the page Selenium is currently on and parse it
    url = driver.current_url
    response = requests.get(url, headers=headers)
    soup = BeautifulSoup(response.content, 'html.parser')
    course = soup.find_all('p')
    # print() returns None, so store the values first and print them afterwards
    course_code = course[0].text[3:][:9]
    course_cat = course[0].text[3:][:4]
    course_name = course[0].text[22:]
    course_desc = course[3].text
    print(course_code, course_cat, course_name, course_desc, sep='\n')
    results = soup.find_all("td", {"valign": "TOP", "width": "15%"})[1::2]
    list_of_inner_text = [x.text for x in results]
    final = list(set(list_of_inner_text))  # drop duplicate instructor names
    instructors = ', '.join(final)
    print(instructors)

I use this code to do what I need with a link.

course_select = link_click[1].click();
get_course_info()

My problem is that I currently pass only index 1 of link_click. I want to loop over the list so that all 33 links go through course_select and the get_course_info function.

Answer 1

If all you want is a loop, then this is it:

import time

links = driver.find_elements_by_link_text('Fall/Winter 2017-2018 Course Schedule')

if links:
    for link in links:
        link.click()
        time.sleep(2)  # wait for the page to load
        get_course_info()
else:
    print("No links found")
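
One caveat with the loop above: if clicking a link navigates away in the same tab, the remaining elements in the list usually go stale and the next click() fails. Here is a rough sketch of one way around that, under the assumption that driver.back() returns you to the listing page. The helper name visit_each_link and its parameters are made up for illustration, and it uses the generic driver.find_elements(by, value) call because the find_elements_by_* helpers are gone in newer Selenium 4 releases.

```python
import time

# Same string that selenium.webdriver.common.by.By.LINK_TEXT holds
LINK_TEXT = "link text"


def visit_each_link(driver, link_text, handle_page, pause=2.0):
    """Click every link with the given text, calling handle_page() after each click.

    Hypothetical helper: assumes each click navigates in the same tab and
    that driver.back() brings the listing page back.
    """
    total = len(driver.find_elements(LINK_TEXT, link_text))
    visited = 0
    for index in range(total):
        # Look the links up again on every pass: elements located before
        # a navigation are stale once the page has changed
        links = driver.find_elements(LINK_TEXT, link_text)
        if index >= len(links):
            break  # the page now shows fewer links than on the first pass
        links[index].click()
        time.sleep(pause)  # crude wait for the course page to load
        handle_page()
        driver.back()
        time.sleep(pause)  # crude wait for the listing page to reload
        visited += 1
    return visited
```

With the names from the question, the call would look like visit_each_link(driver, 'Fall/Winter 2017-2018 Course Schedule', get_course_info). The fixed sleeps are kept only to match the answer; an explicit wait is usually more reliable.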
