Trimming down a Python + Selenium script

Time: 2017-08-06 06:47:42

Tags: python python-3.x selenium selenium-webdriver web-scraping

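I have written a script in Python, in combination with Selenium, to parse the different track names from a web page. The script runs fine. I wrapped the lookup in a try/except block before the print statement to get the desired result. Is there a one-liner print statement that avoids the try/except block? Also, I use the same selector twice in my script: once to wait for the element to be located and once to select the same elements. Is it possible to get rid of this redundancy?

Here is the script: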

from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC

driver = webdriver.Chrome()
driver.get("https://www.shazam.com/charts/top-100/united-states")
wait = WebDriverWait(driver, 10)

wait.until(EC.presence_of_element_located((By.CSS_SELECTOR, "div.grid-vert-center"))) # This selector and the following one are identical so I wished to shake off the verbosity by using once
for item in driver.find_elements_by_css_selector("div.grid-vert-center"):
    try:
        track = item.find_element_by_css_selector('div.title a.ellip').text
    except Exception:
        track = ""
    print(track)  

driver.quit()

1 answer:

Answer 0 (score: 1)

To reduce the number of lines, you can replace

wait.until(EC.presence_of_element_located((By.CSS_SELECTOR, "div.grid-vert-center"))) # This selector and the following one are identical so I wished to shake off the verbosity by using once
for item in driver.find_elements_by_css_selector("div.grid-vert-center"):

with

for item in wait.until(EC.presence_of_all_elements_located((By.CSS_SELECTOR, "div.grid-vert-center"))):

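since EC.presence_of_all_elements_located returns a list of the matching elements, and wait.until passes that list back to the caller, so you can iterate over it directly.

As for the try/except, you can use the following instead: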
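To see why the result of the wait can be used directly in a for loop, here is a minimal sketch of the polling idea. It is a simplified stand-in written for illustration, not Selenium's actual implementation: the name wait_until and the fake attempts data are hypothetical, and the real WebDriverWait.until also passes the driver to the condition.

```python
import time


def wait_until(condition, timeout=10.0, poll=0.5):
    """Hypothetical, simplified stand-in for WebDriverWait.until.

    Calls condition() repeatedly and returns its value as soon as it is
    truthy; raises TimeoutError once the deadline has passed.
    """
    deadline = time.monotonic() + timeout
    while True:
        value = condition()
        if value:  # an empty list is falsy, so polling continues
            return value
        if time.monotonic() >= deadline:
            raise TimeoutError("condition not met in time")
        time.sleep(poll)


# Fake condition: nothing is found on the first two polls, then two "elements".
attempts = iter([[], [], ["track 1", "track 2"]])
for item in wait_until(lambda: next(attempts), timeout=2.0, poll=0.01):
    print(item)
```

Because the value returned by the condition is handed back to the caller, waiting and selecting collapse into a single expression.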

tracks = item.find_elements_by_css_selector('div.title a.ellip')  # empty list if nothing matches
if tracks:
    print(tracks[0].text)
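If you want a true one-liner that still prints an empty string when the title is missing, as the original try/except does, one option is a small helper. This is a sketch under assumptions: first_text is a hypothetical name, and the SimpleNamespace objects below only stand in for Selenium WebElement objects so the example runs without a browser.

```python
from types import SimpleNamespace


def first_text(elements, default=""):
    """Return the .text of the first element, or default if the list is empty."""
    return elements[0].text if elements else default


# Stand-ins for the WebElement lists that find_elements_* would return.
found = [SimpleNamespace(text="Track A"), SimpleNamespace(text="Track B")]
missing = []

print(first_text(found))    # text of the first match
print(first_text(missing))  # falls back to the empty default
```

Inside the loop this would become print(first_text(item.find_elements_by_css_selector('div.title a.ellip'))). In Selenium 4 the equivalent lookup is item.find_elements(By.CSS_SELECTOR, 'div.title a.ellip').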

P.S. Note that if your code already works and you are only looking for improvements, you should post it on Code Review instead of StackOverflow.