I have a website whose markup looks like this:
<div class="d-row js-wrapper" id="row-1"><div class="d-cell js-activator" data-label="type">Residential</div><div class="d-cell d-cell--break" data-label="Company">J Smith</div><div class="d-cell js-target" data-label="Location">UK</div><div class="d-cell js-target" data-label="ID">62144</div><div class="d-cell js-target" data-label="Ask
">730000</div><div class="d-cell js-target" data-label="email">None</div><div class="d-cell js-target" data-label="Contact time (GMT)">
8:00 am to 4:30 pm
</div> </div>
<div class="d-row js-wrapper" id="row-2"><div class="d-cell js-activator" data-label="type">Commercial</div><div class="d-cell d-cell--break" data-label="Company">JBloggs ltd</div><div class="d-cell js-target" data-label="Location">FR</div><div class="d-cell js-target" data-label="ID">55324</div><div class="d-cell js-target" data-label="Ask
">670000</div><div class="d-cell js-target" data-label="email">None</div><div class="d-cell js-target" data-label="Contact time (GMT)">
9:00 am to 5:30 pm
</div> </div>
I would like to scrape it into a pandas DataFrame. So far I have tried the following with Selenium:
info = driver.find_element_by_class_name(".d-row")
print(info[0].text)
But this gives:
Residential J Smith UK 62144 730000 None 8:00 am to 4:30 pm
Can anyone help?
Thanks!
Answer 0 (score: 1)
How about finding every element whose class contains d-cell and then reading its data-label attribute:
list_elements = driver.find_elements_by_xpath('//div[contains(@class, "d-cell")]')
for element in list_elements:
    print(element.get_attribute("data-label"))
Answer 1 (score: 1)
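It is missing an s: it should be find_element[s]_by_class_name. Also, .d-row is not a valid value in that context. The leading dot belongs in a CSS selector, whereas a class-name lookup expects the bare name d-row. Use get_attribute() to read an element's attributes: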
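# Walk each row, then each cell inside it
for row in driver.find_elements_by_css_selector(".d-row"):
    for cell in row.find_elements_by_css_selector('.d-cell'):
        # Some data-label values carry a stray newline (e.g. "Ask\n"), so strip them
        key = cell.get_attribute('data-label').strip()
        value = cell.text.strip()
        print("{}: {}".format(key, value))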
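Since the goal was a pandas DataFrame, the same loop can collect one dict per row instead of printing. This is only a rough sketch: rows_to_records and rows_to_dataframe are names made up for illustration, driver is assumed to be the already-open Selenium driver from the question, and the locator is passed in the generic find_elements(by, value) form, where "css selector" is the string behind By.CSS_SELECTOR.

```python
def rows_to_records(driver):
    """Return one dict per .d-row, keyed by each cell's data-label (hypothetical helper)."""
    records = []
    for row in driver.find_elements("css selector", ".d-row"):
        record = {}
        for cell in row.find_elements("css selector", ".d-cell"):
            # Labels such as "Ask\n" carry stray whitespace, so strip both label and text.
            key = (cell.get_attribute("data-label") or "").strip()
            record[key] = cell.text.strip()
        records.append(record)
    return records


def rows_to_dataframe(driver):
    """Build a DataFrame with one row per .d-row and one column per data-label."""
    # Imported here so rows_to_records can be used without pandas installed.
    import pandas as pd

    return pd.DataFrame(rows_to_records(driver))
```

Calling rows_to_dataframe(driver) on the page above should give columns named after the labels (type, Company, Location, ID, Ask, email, Contact time (GMT)). Every value comes back as a string, so numeric columns such as ID and Ask would still need converting if you want to do arithmetic on them.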