Question

I'm interested in code and have tried many approaches, but I haven't found a suitable answer yet. This is the HTML code:

<div class="description">
    <p>Text1</p>
    <p>Text1</p>
    <div class="excluding-class">
        <ul>
            <li>list1</li>
            <li>list2</li>
        </ul>
    </div>
</div>

I'm using Selenium and have to extract some data from <div class="description">. The child <div class="excluding-class"> is causing me problems, so I want to exclude it, whether by calling driver.get_element_by_class_name, driver.get_element_by_xpath, or something else.

The working code should export only the <p> elements, without the excluded child.

Is there a way to do this?

Answer 1

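With XPath 1.0 alone (the version most commonly available in Selenium WebDrivers), you cannot get a parent element whose inner HTML excludes a specified child element, because XPath only selects nodes and never modifies them. However, if removing the child element from the DOM is acceptable, you can do the following: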

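from selenium.webdriver.common.by import By
# getElementsByClassName returns a collection that has no remove(), so each match is removed individually
driver.execute_script("document.querySelectorAll('.description .excluding-class').forEach(el => el.remove())")
inner_html = driver.find_element(By.CLASS_NAME, "description").get_attribute("innerHTML")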
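If changing the live DOM is not acceptable, one possible alternative is to read the parent's innerHTML unchanged and strip the unwanted child from the string afterwards. The sketch below does this with Python's standard html.parser module. The names ChildExcluder and inner_html_without are hypothetical helpers made up for this illustration, not part of Selenium, and the code assumes balanced markup, which is what a browser normally produces when it serializes innerHTML.

```python
from html.parser import HTMLParser

# Elements that never have a closing tag, so they must not affect nesting depth.
VOID_TAGS = {"area", "base", "br", "col", "embed", "hr", "img",
             "input", "link", "meta", "source", "track", "wbr"}


class ChildExcluder(HTMLParser):
    """Re-emit HTML while dropping every element that carries a given class."""

    def __init__(self, excluded_class):
        # Keep entities such as &amp; untouched so they can be re-emitted verbatim.
        super().__init__(convert_charrefs=False)
        self.excluded_class = excluded_class
        self.skip_depth = 0  # greater than 0 while inside an excluded element
        self.parts = []

    def _is_excluded(self, attrs):
        classes = (dict(attrs).get("class") or "").split()
        return self.excluded_class in classes

    def handle_starttag(self, tag, attrs):
        if self.skip_depth:
            if tag not in VOID_TAGS:
                self.skip_depth += 1
            return
        if self._is_excluded(attrs):
            if tag not in VOID_TAGS:
                self.skip_depth = 1
            return
        self.parts.append(self.get_starttag_text())

    def handle_startendtag(self, tag, attrs):
        # Self-closing tags such as <br/> never change the nesting depth.
        if not self.skip_depth and not self._is_excluded(attrs):
            self.parts.append(self.get_starttag_text())

    def handle_endtag(self, tag):
        if self.skip_depth:
            if tag not in VOID_TAGS:
                self.skip_depth -= 1
            return
        self.parts.append(f"</{tag}>")

    def handle_data(self, data):
        if not self.skip_depth:
            self.parts.append(data)

    def handle_entityref(self, name):
        if not self.skip_depth:
            self.parts.append(f"&{name};")

    def handle_charref(self, name):
        if not self.skip_depth:
            self.parts.append(f"&#{name};")


def inner_html_without(inner_html, excluded_class):
    """Return inner_html with all elements of excluded_class removed."""
    parser = ChildExcluder(excluded_class)
    parser.feed(inner_html)
    parser.close()
    return "".join(parser.parts).strip()


sample = ('<p>Text1</p> <p>Text1</p> <div class="excluding-class"> '
          '<ul> <li>list1</li> <li>list2</li> </ul> </div>')
print(inner_html_without(sample, "excluding-class"))
```

With Selenium, the string passed in would be the value returned by get_attribute("innerHTML") on the description element. The page itself stays untouched, at the cost of re-parsing the markup on the Python side.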

Exclude a child element from its parent element using Selenium or XPath (Python)
