How to get the symbol text in a tag using Selenium and Python?

Time: 2015-05-08 12:45:28

Tags: python selenium selenium-webdriver web-scraping

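I want to get each user's rating from this link.

The attribute data-iconr="ù" is what renders the rating, 4.0 or whatever it happens to be.

You can see it in the div tag: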

<div class="left bold zdhl2 tooltip icon-font-level-7" data-iconr="ù">Rated</div>

Is there a way to get each user's rating?

Also, how do I get the XPath of the div tag, given that its class changes for each div?

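1 Answer:

Answer 0: (score: 0)

This may not be bulletproof, but you can rely on the element's class name, which changes with the rating: for example, 5.0 corresponds to the icon-font-level-9 class, 4.5 to icon-font-level-8, and so on.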
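As a rough illustration of that idea, the level number can be pulled out of the class attribute and converted to a rating. This is only a sketch: the function name `rating_from_class` is hypothetical, and the formula `(level + 1) / 2` is an assumption inferred from the few examples in this thread (level 9 for 5.0, 8 for 4.5, 7 for 4.0, 6 for 3.5). It has not been checked against every level the site uses.

```python
import re

# Assumption: rating == (level + 1) / 2, inferred only from the examples
# mentioned in this thread; other levels are unverified.
LEVEL_PATTERN = re.compile(r"\bicon-font-level-(\d+)\b")


def rating_from_class(class_attr):
    """Return the rating encoded in a class attribute, or None if absent."""
    match = LEVEL_PATTERN.search(class_attr or "")
    if match is None:
        return None
    return (int(match.group(1)) + 1) / 2.0


if __name__ == "__main__":
    # Class string taken from the div shown in the question.
    print(rating_from_class("left bold zdhl2 tooltip icon-font-level-7"))
```

An explicit dictionary is safer if the levels turn out not to follow this pattern; the formula just avoids listing every class by hand.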

Implementation:

import re

from selenium import webdriver
from selenium.webdriver.common.by import By


driver = webdriver.Firefox()
driver.get("https://www.zomato.com/ncr/salad-days-dlf-cyber-city-gurgaon")

# Map the rating-specific class name to the numeric rating.
mapping = {
    "icon-font-level-9": 5.0,
    "icon-font-level-8": 4.5,
    "icon-font-level-6": 3.5
    # ... TODO
}

pattern = re.compile(r"icon-font-level-\d+")
for review in driver.find_elements(By.CSS_SELECTOR, "div[itemprop=review]"):
    author = review.find_element(By.CSS_SELECTOR, "div[itemprop=author] div[itemprop=name] a").text

    # Locate the rating div by its text, since its class differs per review.
    rating_class = review.find_element(By.XPATH, ".//div[. = 'Rated']").get_attribute("class")
    match = pattern.search(rating_class)
    # rating is None when the class is missing or not yet in the mapping.
    rating = mapping.get(match.group(0)) if match else None
    print(author, rating)

driver.quit()

Prints:

Vandhna Babu 4.5
Mohit Yadav 3.5
Pulkit1283 4.5
Grub Society 5.0
Joel George 5.0
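On the XPath part of the question: besides matching the div by its 'Rated' text as above, one option is to match on the stable part of the class name with `contains()`, and to read the glyph directly from `data-iconr`. The sketch below assumes every rating div carries some `icon-font-level-N` class; the helper name `read_rating_div` is hypothetical, and it has not been run against the live page. It uses the plain locator string "xpath", which is the value of Selenium's `By.XPATH`, so the helper itself does not need to import Selenium.

```python
# Locator strategy string; this is the value of By.XPATH in Selenium.
XPATH = "xpath"

# Assumption: every rating div has a class containing "icon-font-level-".
RATING_XPATH = ".//div[contains(@class, 'icon-font-level-')]"


def read_rating_div(review):
    """Return (class attribute, data-iconr glyph) of a review's rating div.

    `review` is expected to be a Selenium WebElement, or any object that
    exposes find_element(by, value) returning something with get_attribute().
    """
    div = review.find_element(XPATH, RATING_XPATH)
    return div.get_attribute("class"), div.get_attribute("data-iconr")
```

Inside the loop over reviews this would be called as `read_rating_div(review)`. The glyph in `data-iconr` is an icon-font character, so on its own it only identifies the rating if you know which glyph the site's font assigns to which score; the class name is usually easier to interpret.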