Question

This is a continuation of my previous question, "Web scrapping using selenium and beautifulsoup.. trouble in parsing and selecting button". I was able to solve that problem, but I am now stuck on the one below.

I obtained the links earlier and stored them in an array.

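I am then trying to visit every link stored in a list named StartupLink. The information I need to scrape and store in an array sits in the div class=content tag. For some links, that div contains a div hidden_more with a JavaScript-enabled click event, so I am handling the exception. The loop runs and visits the links, but after the first two links it outputs NA even though the div content tag is present, and it shows no error (which is not acceptable).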

The array contains 400 links to visit, each with a similar div content element. Where am I going wrong?

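from time import sleep

from selenium import webdriver
from selenium.webdriver.support import expected_conditions as EC
from selenium.webdriver.support.ui import WebDriverWait

Description = []
driver = webdriver.Chrome()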

for link in StartupLink:

    try:
        driver.get(link)
        sleep(5)
        more = driver.find_element_by_xpath('//a[@class="hidden_more"]')
        element = WebDriverWait(driver, 10).until(EC.visibility_of(more))
        sleep(5)
        element.click()
        sleep(5)
        page = driver.find_element_by_xpath('//div[@class="content"]').text
        sleep(5)
    except Exception as e:# NoSuchElementException:
        driver.start_session()
        sleep(5)
        page = driver.find_element_by_xpath('//div[@class="content"]').text
        sleep(5)
        print(str(e))
    if page == '':
        page = "NA"
    Description.append(page)
    print(page)
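
For comparison, here is a minimal sketch of one way the loop body could be restructured so that the optional "more" link needs no try/except at all. It is not a confirmed fix for the NA output. It relies on `find_elements`, which returns an empty list when nothing matches, and passes the locator strategy as the plain string "xpath". The function name `get_description`, the `pause` parameter and the two XPath constants are hypothetical names introduced for illustration; the fixed sleeps are a crude stand-in for explicit waits.

```python
from time import sleep

# XPath expressions taken from the loop above.
CONTENT_XPATH = '//div[@class="content"]'
MORE_XPATH = '//a[@class="hidden_more"]'


def get_description(driver, link, pause=2.0):
    """Return the text of div.content on `link`, or "NA" if it is empty or missing."""
    driver.get(link)
    sleep(pause)  # crude wait for the page to load

    # find_elements returns an empty list instead of raising when nothing
    # matches, so pages without a "more" link simply skip this loop.
    for more in driver.find_elements("xpath", MORE_XPATH):
        if more.is_displayed():
            more.click()
            sleep(pause)  # give the click handler time to expand the text

    blocks = driver.find_elements("xpath", CONTENT_XPATH)
    text = blocks[0].text.strip() if blocks else ""
    return text or "NA"


# Usage (assumes selenium and a Chrome driver are installed):
# from selenium import webdriver
# driver = webdriver.Chrome()
# Description = [get_description(driver, link) for link in StartupLink]
# driver.quit()
```

Because the same session is reused for every link, nothing in this sketch calls driver.start_session(); whether that call is what produces the NA output in the original loop is an assumption that would need checking.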

Unable to get text with the Selenium WebDriver when iterating over multiple links stored in an array

0 answers: