Iterate over the rows of a website table and fetch the data

Time: 2017-07-03 00:09:30

Tags: python selenium for-loop

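How can I iterate over and scrape every existing row of the table on this website: https://icostats.com/

Is it possible to do this with something like the code below?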

rows = []
for row in rows(0, 20):
    row += 1
    get_css_sel("#app > div > div.container-0-16 > div.table-0-20 > div.tbody-0-21 > div:nth-child({})").format(row)

Full code:

from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.support import expected_conditions as EC
from selenium.webdriver.support.ui import WebDriverWait as wait

def get_css_sel(selector):
    posts = browser.find_elements_by_css_selector(selector)
    for post in posts:
        print(post.text)

browser = webdriver.Chrome(executable_path=r'C:\Users\alph1\Scrapers\chromedriver.exe')
browser.get("https://icostats.com")
wait(browser, 40).until(EC.presence_of_element_located((By.CSS_SELECTOR, "#app > div > div.container-0-16 > div.table-0-20 > div.tbody-0-21 > div:nth-child(2) > div:nth-child(8)")))

browser.execute_script('''
    var element = document.getElementsByClassName("buyNow-0-81"), index;
    for (index = element.length - 1; index >= 0; index--) {
        element[index].parentNode.removeChild(element[index]);
    }
''')

get_css_sel("#app > div > div.container-0-16 > div.table-0-20 > div.tableheader-0-50")  # fetch header of table

rows = []
for row in rows(0, 20):
    row += 1
    get_css_sel("#app > div > div.container-0-16 > div.table-0-20 > div.tbody-0-21 > div:nth-child({})").format(row)
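
As written, the loop cannot run: `rows` is an empty list, so `rows(0, 20)` raises `TypeError` (a list is not callable), and `.format(row)` is applied to the return value of `get_css_sel` (which is `None`) instead of to the selector string. A minimal sketch of building the per-row selectors correctly, assuming 20 rows as in the attempt above; `ROW_TEMPLATE` and `row_selectors` are hypothetical names used only for illustration:

```python
# Selector template taken from the question; {} is filled with the row number.
ROW_TEMPLATE = ("#app > div > div.container-0-16 > div.table-0-20 "
                "> div.tbody-0-21 > div:nth-child({})")


def row_selectors(count):
    """Build one selector per row. CSS :nth-child is 1-based."""
    # Format the string first, then hand the finished selector to the caller.
    return [ROW_TEMPLATE.format(n) for n in range(1, count + 1)]


# Show the first three selectors that would be generated.
for selector in row_selectors(3):
    print(selector)
```

Each generated selector could then be passed to `get_css_sel(selector)`, at the cost of one lookup per row.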

1 answer:

Answer 0: (score: 1)

Forget the loop; just use:

get_css_sel("#app > div > div.container-0-16 > div.table-0-20 > div.tbody-0-21 > div")
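
This works because a selector ending in `> div` with no `:nth-child(...)` matches every direct child `div` of the table body, so a single lookup returns all rows no matter how many there are. If the row text is needed as data rather than printed, something along these lines might do; this is a sketch, and `ROW_SELECTOR` and `collect_rows` are hypothetical names. It calls `find_elements(by, value)` because, as far as I know, the `find_elements_by_css_selector` helper used in the question was removed in Selenium 4.3.

```python
# Selector from the answer: every direct child div of the table body.
ROW_SELECTOR = ("#app > div > div.container-0-16 > div.table-0-20 "
                "> div.tbody-0-21 > div")


def collect_rows(driver, selector=ROW_SELECTOR):
    """Return the visible text of every element matching the selector."""
    # "css selector" is the string value of selenium's By.CSS_SELECTOR,
    # so this function does not need to import selenium itself.
    elements = driver.find_elements("css selector", selector)
    return [element.text for element in elements]
```

With the `browser` object from the question this would be called as `rows = collect_rows(browser)`, after the `wait(...)` call has confirmed that the table is present.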