Question

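I am trying to scrape content from pages structured like the one below, but the number of paragraphs and headings varies from page to page. In the example below a heading follows the fourth paragraph, but on other pages it might come after the second paragraph, and so on. How can I get all of the content in order every time without specifying the exact div? I tried this: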

//*[@id="tab_info"]/p[16]

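It does work, but from the XPath output in the CSV I can't tell which rows are headings without manual work. I think I need "contains", maybe? This doesn't seem to work for me: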

//*[@id="tab_info"]/p[1][contains(., strong)]

<div id="tab_info" class="tab_content active">
                <h2>Information</h2>
    <p><strong>This Is The Main Title</strong></p>
        <p>This is a content div.</p>
        <p><strong>This is Subtitle 1</strong></p>
        <p>This is the second paragraph</p>
        <p>This is the third paragraph</p>
        <p>This is the fourth paragraph</p>
    <p><strong>This is Subtitle 2</strong></p>
        <p>This is the fifth paragraph.</p>
        <p>This is the sixth paragraph.</p>
        <p><strong>This is Subtitle 3</strong></p>
       <p>This is the seventh paragraph.</p>

Answer 1

If you need to grab only the p elements that have a strong child, you can try:

//div[@id="tab_info"]/p[strong]
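To keep everything in document order and still know which rows are headings, one option is to select every p under the div and check each one for a strong child. Below is a minimal sketch using Python's standard-library xml.etree.ElementTree, which supports this subset of XPath. The names SAMPLE, extract_rows and heading_texts are hypothetical, the sample is a shortened copy of the markup from the question, and a wrapping body element plus a closing </div> are added so it parses as XML. Real pages are often not well-formed XML, so an HTML-aware parser would probably be needed there.

```python
import xml.etree.ElementTree as ET

# Shortened copy of the markup from the question (assumption: wrapped in
# <body> and closed with </div> so that it is well-formed XML).
SAMPLE = """<body>
<div id="tab_info" class="tab_content active">
  <h2>Information</h2>
  <p><strong>This Is The Main Title</strong></p>
  <p>This is a content div.</p>
  <p><strong>This is Subtitle 1</strong></p>
  <p>This is the second paragraph</p>
  <p><strong>This is Subtitle 2</strong></p>
  <p>This is the fifth paragraph.</p>
</div>
</body>"""


def extract_rows(markup):
    """Return (kind, text) tuples for every p in the div, in document order."""
    root = ET.fromstring(markup)
    rows = []
    for p in root.findall(".//div[@id='tab_info']/p"):
        text = "".join(p.itertext()).strip()
        # A p that wraps a strong element is treated as a heading.
        kind = "heading" if p.find("strong") is not None else "paragraph"
        rows.append((kind, text))
    return rows


def heading_texts(markup):
    """Return only the heading texts, using the p[strong] predicate."""
    root = ET.fromstring(markup)
    matches = root.findall(".//div[@id='tab_info']/p[strong]")
    return ["".join(p.itertext()).strip() for p in matches]


rows = extract_rows(SAMPLE)
headings = heading_texts(SAMPLE)
```

Each tuple in rows can be written as one CSV line, so the heading/paragraph column replaces the manual step of working out which rows are headings.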
