Unable to scrape some text from a hidden container

Time: 2019-02-19 08:55:57

Tags: python python-3.x selenium selenium-webdriver web-scraping

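I've written a script in Python to scrape a piece of text located in the element with class floorplan, which sits inside right-column, which in turn sits inside modal-body. However, when I run the script, it prints a blank output.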

link to that site

Elements before clicking (the floorplan and swing classes are empty):

<div class="right-column">
    <div class="field" ng-show="selectedLot.Name !== ''">
        <div class="label">Home Design:</div>
        <div class="floorplan value ng-binding"></div>
        <hr>
    </div>
    <div class="field" ng-show="selectedLot.ShortDescription !== ''">
        <div class="label">Elevation:</div>
        <div class="swing value ng-binding"></div>
        <hr>
    </div>
    <div class="field" ng-show="selectedLot.Swing !== ''">
        <div class="label">Swing:</div>
        <div class="swing value ng-binding"></div>
        <hr>
    </div>
</div>

After clicking (the floorplan and swing classes now have values):

<div class="right-column">
    <div class="field" ng-show="selectedLot.Name !== ''">
        <div class="label">Home Design:</div>
        <div class="floorplan value ng-binding">Delaware</div>
        <hr>
    </div>
    <div class="field" ng-show="selectedLot.ShortDescription !== ''">
        <div class="label">Elevation:</div>
        <div class="swing value ng-binding">TRA</div>
        <hr>
    </div>
    <div class="field" ng-show="selectedLot.Swing !== ''">
        <div class="label">Swing:</div>
        <div class="swing value ng-binding">Garage Right</div>
        <hr>
    </div>
</div>

What I've tried so far (I can't make my script click on that image to reveal the data I'm after):

from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.common.keys import Keys
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC

def collect_links(link):
    driver.get(link)
    wait.until(EC.invisibility_of_element_located((By.CSS_SELECTOR,"path#ip-loader-circle")))
    item = wait.until(EC.presence_of_element_located((By.CSS_SELECTOR,".modal-body .right-column .floorplan")))
    print(item.get_attribute("innerHTML"))

if __name__ == '__main__':
    url = "https://khovsecure.ml3ds-cloud.com/index.html?_ga=2.181197287.1174152084.1550480313-902396065.1550480313#/lotmap/43935"
    driver = webdriver.Chrome()
    wait = WebDriverWait(driver,20)
    collect_links(url)
    driver.quit()

Expected output:

Delaware

This is how the information pops up in the box when I click on the map:

[Screenshot: the info box that pops up after clicking on the map]

  

How can I click on that map to scrape the desired text from the popup container?
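
A note on the blank output in the script above: presence_of_element_located returns as soon as the element exists in the DOM, and the markup shows the floorplan div exists but is empty until something on the map is clicked. Once a click succeeds, a wait that only passes when the text is non-empty may be more useful. The sketch below is one way to write such a condition; text_is_present and FLOORPLAN are hypothetical names introduced here, not Selenium built-ins, and the click itself is not covered.

```python
# Sketch of a custom wait condition: it succeeds only once the element has text.
# text_is_present is a hypothetical helper, not part of Selenium.

def text_is_present(locator):
    """Return a callable suitable for WebDriverWait(driver, n).until(...)."""
    def _condition(driver):
        element = driver.find_element(*locator)
        # Read textContent rather than .text so that visually hidden
        # elements still report their content.
        text = (element.get_attribute("textContent") or "").strip()
        # WebDriverWait keeps polling while the return value is falsy.
        return text if text else False
    return _condition


# "css selector" is the string value of selenium's By.CSS_SELECTOR.
FLOORPLAN = ("css selector", ".modal-body .right-column .floorplan")

# Usage inside the script above, after the click on the map has happened:
#   name = wait.until(text_is_present(FLOORPLAN))
#   print(name)
```

Returning the text itself (instead of True) lets until() hand back the value directly, so no second lookup is needed.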

1 answer:

Answer 0 (score: 1)

The code below gives you all of the data in JSON format:

import requests

if __name__ == '__main__':
    headers = {
        'fullurl': 'https://khovsecure.ml3ds-cloud.com/index.html?_ga=2.181197287.1174152084.1550480313-902396065.1550480313#/lotmap/43935',
    }
    response = requests.get('https://khovsecure.ml3ds-cloud.com/resources/data/CommunityData/khovsecure.ml3ds-cloud.com', headers=headers)
    print(response.json())
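
The structure of the returned JSON is not shown here, so pulling out a single value such as Delaware needs some exploration. As a rough sketch, the helper below walks any nested JSON structure and yields every object that has a given set of keys. The key names Name, ShortDescription and Swing are taken from the ng-show expressions in the question's markup; that the JSON uses the same names is an assumption, and iter_lots and the sample data are made up for illustration.

```python
def iter_lots(node, required_keys=("Name", "ShortDescription", "Swing")):
    """Yield every dict in a nested JSON structure that has all required_keys."""
    if isinstance(node, dict):
        if all(key in node for key in required_keys):
            yield node
        # Keep descending: matching objects may be nested at any depth.
        for value in node.values():
            yield from iter_lots(value, required_keys)
    elif isinstance(node, list):
        for item in node:
            yield from iter_lots(item, required_keys)


# Made-up sample that mirrors the values shown in the question's markup;
# the layout of the real payload is not known here.
sample = {
    "Community": {
        "Lots": [
            {"Name": "Delaware", "ShortDescription": "TRA", "Swing": "Garage Right"},
            {"Name": "", "ShortDescription": "", "Swing": ""},
        ]
    }
}

# Collect the non-empty names, as the popup only shows fields with a value.
names = [lot["Name"] for lot in iter_lots(sample) if lot["Name"]]
print(names)
```

With the real response, the same call would be iter_lots(response.json()); if nothing is yielded, the key names differ and need to be adjusted after inspecting the payload.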