Selenium or Beautiful Soup: getting data from a dynamic table cell

Time: 2019-10-19 11:26:12

Tags: python selenium beautifulsoup

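I am trying to get the text of a field in a web table. The data field sits inside a table cell and its content is populated dynamically. I am using a Python script for this task. I have tried the following: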


1. Reading the element's attributes via XPath (innerText, innerHTML, textContent, value): each returns either None or raw HTML.
2. Beautiful Soup: returns the HTML, and with lxml it returns None.

HTML:

<td>
<div class="fieldsbox" id="xfe54" style="visibility: visible;">
<input readonly="" isoutputcontrol="true" xformstype="output" id="policy_number"
 xql="tns:CHDRNUM" databoundelement="true" __parent="tblResults" class="input output"
absolutexpath="tns:CHDRNUM" doebivalidate="false" title="Value for Policy No." style=""
ref="tns:CHDRNUM" _intable="true"></div>
</td>

Beautiful Soup:

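import time

from bs4 import BeautifulSoup as bs
from selenium import webdriver

crom_driver = webdriver.Chrome()  # assumption: driver setup was not shown in the question
url = "https://cms.bharti-axagi.co.in/home/CMS/com/bagi/cms/Loginforms/CMS_LoginScreen.caf"
crom_driver.get(url)
time.sleep(5)  # fixed pause to give the page time to render
content = crom_driver.page_source
soup = bs(content, "html.parser")
data = soup.find_all("table", {"id": "CMS_CLAIMS_DETAILSTable"})

print(data)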
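
One likely reason the parsed page never shows the field's text: the markup for the input carries no value attribute at all, so there is nothing for a parser to return. When a script fills in a field, the value may exist only as a live property in the browser and not appear in page_source. The sketch below uses a trimmed copy of the input from the question (the trimming is an assumption made for brevity) to show what Beautiful Soup sees:

```python
from bs4 import BeautifulSoup

# Trimmed copy of the cell from the question; note there is no value="..." attribute.
html = """
<td>
<div class="fieldsbox" id="xfe54" style="visibility: visible;">
<input readonly="" id="policy_number" class="input output" title="Value for Policy No.">
</div>
</td>
"""

soup = BeautifulSoup(html, "html.parser")
field = soup.find("input", {"id": "policy_number"})

# .get() returns None when the attribute is absent from the markup.
value = field.get("value")
print(value)
```

If this is what is happening on the real page, the value has to be read from the live element in the browser, not from the saved HTML.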

1 answer:

Answer 0: (score: 0)

I tried the full XPath, and reading the element with get_attribute('value') solved the problem.

xpath_string = '/html/body/div[2]/div[3]/div/div[2]/fieldset/div/div/div[1]/div/div[2]/div/div/table/tbody/tr[2]/td[2]/div/input'