I am using the following Python code to extract the text from a text box:
def check():
    with open("LP_input.txt") as f:
        for line in f:
            url = line.strip()
            driver.get(url)
            driver.wait = WebDriverWait(driver, 10)
            time.sleep(10)
            PC = driver.find_elements_by_xpath("//div[@id='wwctrl_landingPageDataForm_attributeMap_STRUCTURE_DATA_REQUIRED']")
            for x in PC:
                print(x)
My HTML (the web page I am extracting the text from):
<div id="wwctrl_landingPageDataForm_attributeMap_STRUCTURE_DATA_REQUIRED" class="wwctrl">
<input id="landingPageDataForm_attributeMap_STRUCTURE_DATA_REQUIRED" class="text medium" name="attributeMap.STRUCTURE_DATA_REQUIRED" maxlength="1000" value="TRUE" style="" type="text"
But instead of the text, I get this output:
<selenium.webdriver.remote.webelement.WebElement (session="9f5789eaeb8dbd5cc005dc63e3d4f9f2", element="0.6714808439487934-1")>
The text box will actually contain TRUE or FALSE. I want to extract this value from thousands of pages.
Answer 0 (score: 0)
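Considering landingPageDataForm_attributeMap_STRUCTURE_DATA_REQUIRED as the id and attributeMap.STRUCTURE_DATA_REQUIRED as the name, it seems both attributes are dynamically generated. So we need to construct a dynamic xpath or css locator to first get all the WebElements. We need to store the WebElements in a List and then iterate through the List to retrieve the value attribute of each WebElement, which will be TRUE or FALSE.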
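The approach described in this answer can be sketched roughly as follows. This is a minimal sketch, not the answerer's original code: the helper name `read_values` and the `contains(@id, ...)` match are assumptions made for illustration, and `driver` is assumed to be a Selenium WebDriver that has already loaded the page.

```python
# Locate every <input> whose id contains the stable suffix (assumed pattern).
STRUCTURE_XPATH = "//input[contains(@id, 'STRUCTURE_DATA_REQUIRED')]"


def read_values(driver):
    """Return the value attribute of every matching <input> on the current page."""
    # "xpath" is the string behind selenium's By.XPATH, so no import is needed here.
    elements = driver.find_elements("xpath", STRUCTURE_XPATH)  # list of WebElements
    # An <input> keeps its content in the value attribute, not in its text.
    return [element.get_attribute("value") for element in elements]
```

Printing a WebElement directly only shows its session and element ids, which is why the value has to be read through `get_attribute("value")`.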
Answer 1 (score: 0)
I added an extra input and changed the code to:
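import time
from selenium.webdriver.support.ui import WebDriverWait

# `driver` is assumed to be a WebDriver instance created elsewhere.
def check():
    with open("LP_input.txt") as f:
        for line in f:
            url = line.strip()
            driver.get(url)
            driver.wait = WebDriverWait(driver, 10)
            time.sleep(10)
            # Target the <input> inside the div; its content lives in the value attribute.
            PC = driver.find_elements_by_xpath("//div[@id='wwctrl_landingPageDataForm_attributeMap_STRUCTURE_DATA_REQUIRED']/input")
            for x in PC:
                # .text is a property (not a method) and is empty for an <input>.
                print(x.get_attribute("value"))
    # Extra input, to prevent the script from closing.
    input("Press any key to exit!!")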
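Since the question mentions thousands of pages, it may help to collect the results instead of printing them. The following is a rough sketch under stated assumptions: `collect_values` is a hypothetical helper, `driver` is an existing Selenium WebDriver, and any waiting needed after `driver.get` is left out for brevity.

```python
# XPath for the <input> inside the div shown in the question.
FIELD_XPATH = (
    "//div[@id='wwctrl_landingPageDataForm_attributeMap_STRUCTURE_DATA_REQUIRED']"
    "/input"
)


def collect_values(driver, lines):
    """Map each non-empty URL in `lines` to the field's value, or None if absent."""
    results = {}
    for line in lines:
        url = line.strip()
        if not url:
            continue  # skip blank lines in the input
        driver.get(url)
        # "xpath" is the string behind selenium's By.XPATH.
        fields = driver.find_elements("xpath", FIELD_XPATH)
        # Record None when the page does not contain the field at all.
        results[url] = fields[0].get_attribute("value") if fields else None
    return results
```

With a real driver this could be called on the open file object, for example `collect_values(driver, f)` inside the `with open("LP_input.txt") as f:` block, and the returned dictionary written out afterwards.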