Question

I'm new to CSS and am trying to parse HTML with the Jsoup parser for Java.

Example HTML:

<p>However much beautiful the s6 Edge looks, I doubt [...] the <a title="Samsung Unveils the Galaxy Note 4 and curved screen Note Edge" href="http://www.example.com/">Note Edge</a>, the dual gently curved screen [...] or accidental palm taps.</p>

I already get the text inside the <p> element like this:

Elements text = doc.select("p");

for (Element element : text) {
    System.out.println(element.ownText() + "\n");
}

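Output:

However much beautiful the s6 Edge looks, I doubt [...] the , the dual gently curved screen [...] or accidental palm taps.

As you can see, the text of the <a> element, "Note Edge", is not displayed.

So I'd like to ask whether it is possible to display the whole text, including the text inside the <a> element, like this:

However much beautiful the s6 Edge looks, I doubt [...] the Note Edge, the dual gently curved screen [...] or accidental palm taps.

I'd be glad for any suggestion!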

Answer 1

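According to the docs, ownText():

Gets the text owned by this element only; does not get the combined text of all children.

If you want the content of child nodes to be included, call element.text() instead.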
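To make the difference concrete, here is a minimal sketch that calls both methods on the same paragraph. It assumes jsoup is on the classpath; the class name OwnTextVsText, the helper firstParagraph(), and the shortened HTML are made up for illustration and are not part of the question.

```java
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;

public class OwnTextVsText {
    // Shortened, hypothetical version of the paragraph from the question.
    static final String HTML =
        "<p>I doubt the <a href=\"http://www.example.com/\">Note Edge</a>, the dual screen.</p>";

    // Parses the sample HTML and returns its first <p> element.
    static Element firstParagraph() {
        Document doc = Jsoup.parse(HTML);
        return doc.select("p").first();
    }

    public static void main(String[] args) {
        Element p = firstParagraph();
        // Only the text nodes that are direct children of <p>;
        // the link text is left out.
        System.out.println(p.ownText());
        // Text of <p> combined with the text of all its descendants,
        // so the link text is included.
        System.out.println(p.text());
    }
}
```

The second line printed should contain "Note Edge", while the first should not.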

Answer 2

Do it like this:

for (Element element : text) {
  System.out.println(element.text() + "\n");
}

You should use text() instead of ownText(), because ownText() does not return the text of any child elements.

Answer 3

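What you can do, instead of having plain text, then an <a></a> tag, then more plain text, is wrap the plain text in <span> elements and then get the text of each child element of the <p></p> element: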

<p id="myParagraph">
  <span>However much beautiful the s6 Edge looks, I doubt [...] the </span>
  <a title="Samsung Unveils the Galaxy Note 4 and curved screen Note Edge" href="http://www.example.com/">Note Edge</a>
  <span>, the dual
      gently curved screen [...] or accidental palm taps.</span>
</p>

Your function would then iterate over the child elements of the <p> element:

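    // I don't know jsoup, so I'm using JavaScript directly
    var children = document.getElementById("myParagraph").children;
    // children is an HTMLCollection, so convert it to an array before calling forEach
    Array.from(children).forEach(function (child) {
        // textContent is a property, not a method
        console.log(child.textContent + "\n");
    });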
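For readers who want to stay in Java, a rough jsoup counterpart of the JavaScript above might look like the following sketch. It assumes jsoup is on the classpath and that the paragraph has been wrapped in <span> elements as shown earlier in this answer; the class name ChildTexts, the helper childTexts(), and the shortened HTML are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;

public class ChildTexts {
    // Shortened, hypothetical version of the wrapped paragraph.
    static final String HTML =
        "<p id=\"myParagraph\">"
        + "<span>I doubt the </span>"
        + "<a href=\"http://www.example.com/\">Note Edge</a>"
        + "<span>, the dual screen.</span>"
        + "</p>";

    // Collects the text of every direct child element of #myParagraph.
    static List<String> childTexts(String html) {
        Document doc = Jsoup.parse(html);
        Element paragraph = doc.getElementById("myParagraph");
        List<String> texts = new ArrayList<>();
        for (Element child : paragraph.children()) {
            texts.add(child.text());
        }
        return texts;
    }

    public static void main(String[] args) {
        for (String text : childTexts(HTML)) {
            System.out.println(text);
        }
    }
}
```

Note that children() only returns element children, which is why the plain text has to be wrapped in <span> elements first. If the goal is simply the full paragraph text, calling text() on the <p> element avoids changing the markup.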
