How do I set a time limit on a "try" for a web scraper?

Time: 2019-03-14 21:36:31

Tags: selenium

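I have a web scraper that uses Selenium.

I have hundreds of thousands of links that the scraper opens in turn to extract certain data. Some of the links, however, contain no data. In those cases the scraper keeps trying for a long time to find the data before it gives up and moves on to the next link. I would like to shorten how long it searches before moving on to the next iteration. Here is my code so far.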

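import time

# `driver` (a Selenium WebDriver) and `links` are defined earlier in the script.
for link in links:
    try:
        driver.get(link)
        locator = ('//tr[@data-bid="18"]'
                   '//span[@class="table-main__detail-odds--hasarchive"]')
        pins = driver.find_elements_by_xpath(locator)
        match = driver.find_elements_by_xpath('//span[@class="list-breadcrumb__item__in"]')[0].text
        date = driver.find_elements_by_xpath('//p[@class="list-details__item__date"]')[0].text
        score = driver.find_elements_by_xpath('//p[@class="list-details__item__score"]')[0].text
    except Exception:
        # The page failed to load or an expected element is missing: skip this link.
        continue

    for pin in pins:
        try:
            pin.click()
            time.sleep(3)  # wait for the page to update after the click
            bold_cells = driver.find_elements_by_xpath('//td[@class="bold"]')
            date_cells = driver.find_elements_by_xpath('//td[@class="date"]')
            with open("t14.csv", "a") as out:
                # One row per clicked element, starting with the match details.
                out.write("\n")
                out.write(match + "," + date + "," + score + ",")
                for date_cell in date_cells:
                    for bold_cell in bold_cells:
                        out.write(bold_cell.text + "," + date_cell.text + ",")
        except Exception:
            # The click or a lookup failed: move on to the next element.
            pass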
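
One possible way to cap how long each lookup may take is a small polling helper with a deadline. This is only a sketch, not a tested fix for the script above: the name `find_within` and its `timeout` and `poll` parameters are hypothetical, and it assumes the delay comes from waiting for elements rather than from the page load itself.

```python
import time


def find_within(find, timeout=2.0, poll=0.2):
    """Call find() repeatedly until it returns a non-empty result
    or `timeout` seconds have passed; return [] on timeout.

    `find` is any zero-argument callable that returns a list,
    for example a lambda wrapping a Selenium element lookup.
    """
    deadline = time.monotonic() + timeout
    while True:
        found = find()
        if found:
            return found
        remaining = deadline - time.monotonic()
        if remaining <= 0:
            return []
        # Never sleep past the deadline.
        time.sleep(min(poll, remaining))


# Hypothetical usage with the `driver` and `locator` from the question:
#
#     pins = find_within(lambda: driver.find_elements_by_xpath(locator), timeout=2.0)
#     if not pins:
#         continue  # no data on this page, go to the next link
```

If the time is actually spent inside `driver.get(...)`, Selenium's `driver.set_page_load_timeout(seconds)` limits the page load instead, and `driver.implicitly_wait(0)` makes sure no implicit wait adds to every lookup. Which of these applies depends on where the scraper is really stalling.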

0 answers:

No answers yet