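How to find an element in BeautifulSoup based on its text, ignoring child tags

Time: 2018-05-22 03:06:56

Tags: python python-3.x beautifulsoup

I am looking for a solution using Python and BeautifulSoup to find an element based on its inner text. For example: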

<div> <b>Ignore this text</b>Find based on this text </div>

How can I find this div? Thanks for your help!

2 answers:

Answer 0: (score: 4)

You can use .find with the text argument to locate the text node, then call findParent on it to get the enclosing element.

Example:

from bs4 import BeautifulSoup

s = """<div> <b>Ignore this text</b>Find based on this text </div>"""
soup = BeautifulSoup(s, 'html.parser')
# Match the text node exactly (note the trailing space), then go up to its parent tag.
t = soup.find(text="Find based on this text ")
print(t.findParent())

Output:

<div> <b>Ignore this text</b>Find based on this text </div>
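
The exact-string match above only succeeds when the search string reproduces the text node's whitespace, including the trailing space. A minimal sketch of a more tolerant variant follows. It assumes Beautiful Soup 4.4.0 or later, where the text argument is also available under the name string, and it swaps the exact string for a compiled regular expression. The variable names node and parent are illustrative only.

```python
import re

from bs4 import BeautifulSoup

html = "<div> <b>Ignore this text</b>Find based on this text </div>"
soup = BeautifulSoup(html, "html.parser")

# A compiled pattern is applied with re.search to each text node,
# so leading or trailing whitespace in the node does not matter.
node = soup.find(string=re.compile(r"Find based on this text"))

# find_parent is the current name for findParent; passing "div"
# restricts the result to the nearest enclosing <div>.
parent = node.find_parent("div")
print(parent)
```

Because the pattern is a substring search, it may also match longer text nodes that merely contain the phrase; anchoring it (for example with ^\s* and \s*$) would tighten the match.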

Answer 1: (score: 0)

Try this. It is only an example, but it works.

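from bs4 import BeautifulSoup

html = """
<div> <b>Ignore this text</b>Find based on this text </div>
"""

# The 'lxml' parser requires the lxml package to be installed.
soup = BeautifulSoup(html, 'lxml')

s = soup.find('div')

# Remove every <b> child from the tree so only the div's own text remains.
for child in s.find_all('b'):
    child.decompose()

print(s.get_text())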
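
Note that decompose() permanently removes the b tags from the parsed tree, and the snippet prints the div's text rather than locating the div by that text. A possible non-destructive sketch of the same idea follows: it collects only a tag's direct text nodes and uses that in a find_all filter. The helper own_text and the variable matches are hypothetical names introduced here, not part of Beautiful Soup.

```python
from bs4 import BeautifulSoup, NavigableString


def own_text(tag):
    """Join only the tag's direct text nodes, skipping text inside child tags."""
    # NavigableString subclasses str, so the pieces can be joined directly.
    # Comments are NavigableString subclasses too and would be included.
    return "".join(
        child for child in tag.children if isinstance(child, NavigableString)
    ).strip()


html = "<div> <b>Ignore this text</b>Find based on this text </div>"
soup = BeautifulSoup(html, "html.parser")

# find_all accepts a function that receives each tag and returns True to keep it.
matches = soup.find_all(
    lambda tag: tag.name == "div" and own_text(tag) == "Find based on this text"
)
print(matches)
```

The tree is left unchanged, so the b child is still present in each matched div.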