Get non-text values using Selenium

Time: 2015-07-27 14:37:17

Tags: python selenium web-scraping

I am trying to get some data from the page https://www.facebook.com/public/nitin-solanki. I can get all the values except the three labels

Studied at, Lives in, From

I can get the values that go with these labels using:
# Assumes `driver` is an already-created Selenium WebDriver instance.
from selenium.webdriver.common.by import By
from selenium.webdriver.support import expected_conditions as EC
from selenium.webdriver.support.ui import WebDriverWait

driver.get("https://www.facebook.com/public/nitin-solanki")
wait = WebDriverWait(driver, 10)
wait.until(EC.visibility_of_element_located((By.CLASS_NAME, "mbm")))
for s in driver.find_elements_by_css_selector('.mbm.detailedsearch_result'):
    result = {}
    v = s.find_element_by_css_selector('.fsm.fwn.fcg')
    x = v.find_elements_by_class_name('fbProfileBylineLabel')
    for y in x:
        # print(y.text)  # this should give me a label like "Lives in" or "Studied at", but does not
        z = y.find_elements_by_tag_name('a')
        for a in z:
            print(a.text)  # I want to get the label for this value along with it

What I want to do is create a dictionary

{'Studied_at' : 'Gujarat University', 'Lives_in': 'Ahmedabad, India', 'From' : 'Ahmedabad, India'}

from these three values.

1 Answer:

Answer 0: (score: 1)

You can use the JavaScript childNodes property to get the nodeValue of the element's text node:
// Try in the browser console
document.getElementsByClassName("fbProfileBylineLabel")[0].childNodes[0].nodeValue; // Studied at
document.getElementsByClassName("fbProfileBylineLabel")[1].childNodes[0].nodeValue; // Lives in
document.getElementsByClassName("fbProfileBylineLabel")[2].childNodes[0].nodeValue; // From

Pseudocode

v = s.find_element_by_css_selector('.fsm.fwn.fcg')
x = v.find_elements_by_class_name('fbProfileBylineLabel')
for y in x:
    # Run the JavaScript against the single element y (not the list x);
    # the first child node is the text node holding the label, e.g. "Studied at"
    label = driver.execute_script("return arguments[0].childNodes[0].nodeValue;", y)
    z = y.find_elements_by_tag_name('a')
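
To end up with the dictionary the question asks for, the label read from the text node can be paired with the link text. The following is only a minimal sketch under some assumptions: the helper names (`to_key`, `build_profile`, `scrape_result`) and the `JS_LABEL_PAIRS` script are hypothetical, the label is assumed to be the first child node of each `fbProfileBylineLabel` element, and the value is assumed to be the text of its first link. The page markup may differ, so treat it as a starting point rather than a tested solution for that page.

```python
# Hypothetical JavaScript snippet: for every label element inside the given
# container, return [first text node value, text of the first link].
JS_LABEL_PAIRS = """
var out = [];
var labels = arguments[0].getElementsByClassName('fbProfileBylineLabel');
for (var i = 0; i < labels.length; i++) {
    var first = labels[i].childNodes[0];
    var link = labels[i].getElementsByTagName('a')[0];
    out.push([first ? first.nodeValue : null, link ? link.textContent : null]);
}
return out;
"""


def to_key(label):
    """Turn a label such as 'Studied at ' into a key such as 'Studied_at'."""
    return "_".join(label.split())


def build_profile(pairs):
    """Build a dict from (label, value) pairs, skipping incomplete pairs."""
    result = {}
    for label, value in pairs:
        if label and label.strip() and value:
            result[to_key(label)] = value.strip()
    return result


def scrape_result(driver, result_element):
    """Collect label/value pairs for one search result element.

    `driver` is assumed to be a Selenium WebDriver and `result_element`
    a WebElement such as `v` in the snippet above.
    """
    pairs = driver.execute_script(JS_LABEL_PAIRS, result_element)
    return build_profile(pairs)
```

Inside the existing loop this could be called as `result = scrape_result(driver, v)`. Doing the pairing in one script call keeps each label next to its value, so the two cannot drift out of step.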

Hope this helps. If you have any questions, please reply.