Question

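I'm new to Python and I want to scrape data from the following website: https://www.telerad.be/Html5Viewer/index.html?viewer=telerad_fr

My problem is that the data is generated dynamically. I have read about a few possible ways around this, but none of them was satisfactory. With Selenium, I need a name or an XPath to click a button, but here there is none.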

import requests
from lxml import html

page = requests.get('https://www.telerad.be/Html5Viewer/index.html?viewer=telerad_fr')
tree = html.fromstring(page.content)

cities = tree.xpath('//*[@id="map-container"]/div[6]/div[2]/div/div[2]/div/div/div[1]/div/p[1]/text()[2]')


print('Cities: ', cities)

Answer 1

There actually is an XPath you can use to click the buttons:

//*[@id='0_layer']/*[@fill]
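
To see what that expression selects, here is a small sketch using only the standard library. The SVG fragment is a hypothetical, simplified stand-in for what the map viewer renders (the real markup will differ), and `xml.etree.ElementTree` only supports a subset of XPath, so the path needs a leading `.`. The expression matches every direct child of the element with id `0_layer` that has a `fill` attribute:

```python
import xml.etree.ElementTree as ET

# Hypothetical, simplified stand-in for the SVG the map viewer renders.
SVG = """
<svg>
  <g id="0_layer">
    <circle fill="orange" cx="10" cy="10" r="4"/>
    <circle fill="orange" cx="30" cy="20" r="4"/>
    <path d="M0 0 L5 5"/>
  </g>
  <g id="1_layer">
    <circle fill="blue" cx="50" cy="50" r="4"/>
  </g>
</svg>
"""

root = ET.fromstring(SVG)

# ElementTree needs a leading "." to search descendants of the root.
dots = root.findall(".//*[@id='0_layer']/*[@fill]")

# Only the two circles in 0_layer match: the <path> has no fill attribute,
# and the blue circle sits in a different layer.
print([(d.tag, d.get("cx"), d.get("cy")) for d in dots])
```

In Selenium the same expression is evaluated by the browser against the live DOM, so no leading `.` is needed there.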

Here, try this (Selenium):

from selenium import webdriver
from selenium.webdriver.common.by import By

driver = webdriver.Chrome()  # assumes Chrome and a matching driver are available
driver.get("https://www.telerad.be/Html5Viewer/index.html?viewer=telerad_fr")

dotList = driver.find_elements(By.XPATH, "//*[@id='0_layer']/*[@fill]")
for dot in dotList:
    dot.click()
    cities = driver.find_element(By.XPATH, "//div[@data-region-name='NavigationMapRegion']//p[1]")
    print("Cities: ", cities.text)
    # The modal can intercept clicks on some dots, so close it after extracting the info we need.
    closeBtn = driver.find_element(By.XPATH, "//*[@class='panel-header-button right close-16']")
    closeBtn.click()

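This code clicks (or at least tries to, provided no StaleElementReferenceException occurs) every orange dot on the map and prints the "Cities" content (based on the XPath).

If anyone finds an error in the code, please edit this answer; I wrote it in Notepad++.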
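
If stale-element errors do show up, one common workaround is to look the dots up again before each click and retry when the reference has gone stale. The following is only a sketch of that idea: `visit_each`, `find_all`, and `handle` are hypothetical names introduced here, not part of Selenium, and the helper is written against plain callables so it can be tried without a browser.

```python
import time


def visit_each(find_all, handle, retry_on, attempts=3, delay=0.5):
    """Visit every element by index, re-locating the list before each visit.

    find_all: callable returning the current list of elements
    handle:   callable taking one element and returning a result
    retry_on: tuple of exception types that trigger a fresh lookup and a retry
    """
    results = []
    index = 0
    while index < len(find_all()):
        for attempt in range(1, attempts + 1):
            try:
                # Look the element up again so a re-rendered page cannot
                # leave us holding a stale reference.
                element = find_all()[index]
                results.append(handle(element))
                break
            except retry_on:
                if attempt == attempts:
                    raise  # give up after the last attempt
                time.sleep(delay)
        index += 1
    return results


# Possible use with Selenium 4 (assumes `driver` already has the map loaded):
#
#   from selenium.common.exceptions import StaleElementReferenceException
#   from selenium.webdriver.common.by import By
#
#   def read_city(dot):
#       dot.click()
#       text = driver.find_element(
#           By.XPATH, "//div[@data-region-name='NavigationMapRegion']//p[1]").text
#       driver.find_element(
#           By.XPATH, "//*[@class='panel-header-button right close-16']").click()
#       return text
#
#   cities = visit_each(
#       lambda: driver.find_elements(By.XPATH, "//*[@id='0_layer']/*[@fill]"),
#       read_city,
#       retry_on=(StaleElementReferenceException,),
#   )
```

Re-locating by index assumes the dots come back in the same order on every lookup, which may not hold if the map redraws its layers.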

How to scrape live, JS-generated data on a map
