Question

I'm trying to parse a structure like the following with JSoup.

<div class="bigClass">
    <a href="foo.com"> Field 1</a>
    <a href="bar.com"> Field 2</a>
    <a href="baz.com"> Field 3</a>
</div>

Right now I'm using the following code to get the entire text content of the div with class "bigClass":

Document doc = Jsoup.connect("http://foobar.com").userAgent(userAgent).timeout(1000).get();
Elements price = doc.getElementsByClass("bigClass");
System.out.println(price.text());

How can I get just the first child ("Field 1"), regardless of the <a> element's class and URL?

Similar question for BeautifulSoup in Python: Beautiful soup getting the first child

Answer 1

You are probably looking for

doc.getElementsByClass("bigClass").first().child(0)

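getElementsByClass("bigClass") returns all elements that have the class bigClass,
but we want one specific element (probably the first),
and on that first element we select its first child (children are indexed from 0).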
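
As a minimal sketch of the approach above, the following parses the HTML from the question as an inline string instead of fetching a URL. The class name FirstChildExample and the helper firstChildText are hypothetical names made up for this illustration. The null and empty checks are an addition: first() returns null when no element matches, and child(0) throws if the element has no children.

```java
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;

public class FirstChildExample {

    // Hypothetical helper: returns the text of the first child element of the
    // first element carrying the given class, or null if there is none.
    public static String firstChildText(String html, String className) {
        Document doc = Jsoup.parse(html);
        // first() returns null when no element has this class
        Element container = doc.getElementsByClass(className).first();
        if (container == null || container.children().isEmpty()) {
            return null;
        }
        // child(0) is the first child element; text nodes are not counted
        return container.child(0).text();
    }

    public static void main(String[] args) {
        String html = "<div class=\"bigClass\">"
                + "<a href=\"foo.com\"> Field 1</a>"
                + "<a href=\"bar.com\"> Field 2</a>"
                + "<a href=\"baz.com\"> Field 3</a>"
                + "</div>";
        System.out.println(firstChildText(html, "bigClass")); // prints "Field 1"
    }
}
```

Note that text() normalizes and trims whitespace, so the leading space in " Field 1" is dropped.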

Answer 2

Alternatively, you can use one of the following two options:

Option 1

doc.select("div.bigClass > a:first-of-type");

DEMO: http://try.jsoup.org/~btbp8Fb1xrPf38dTYbplLz5lA3Y

Option 2

doc.select("div.bigClass > a:first-child");

DEMO: http://try.jsoup.org/~mj8CAaWTtQEicyd75bSHDV3_KeA
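
The two selectors give the same result for the markup in the question, but they are not interchangeable. The sketch below shows one way to see the difference, assuming a variant of the markup in which a <span> precedes the links. That variant, the class name FirstChildSelectors, and the helper firstMatchText are assumptions made for this illustration and are not part of the original question.

```java
import org.jsoup.Jsoup;
import org.jsoup.nodes.Element;

public class FirstChildSelectors {

    // Hypothetical helper: text of the first element matching the CSS query,
    // or null when nothing matches.
    public static String firstMatchText(String html, String cssQuery) {
        Element match = Jsoup.parse(html).select(cssQuery).first();
        return match == null ? null : match.text();
    }

    public static void main(String[] args) {
        // Assumed variant of the question's markup: a <span> comes before the links
        String html = "<div class=\"bigClass\">"
                + "<span>Label</span>"
                + "<a href=\"foo.com\"> Field 1</a>"
                + "<a href=\"bar.com\"> Field 2</a>"
                + "</div>";

        // First <a> among its siblings of the same tag: prints "Field 1"
        System.out.println(firstMatchText(html, "div.bigClass > a:first-of-type"));

        // An <a> that is also the very first child: no match here, prints "null"
        System.out.println(firstMatchText(html, "div.bigClass > a:first-child"));

        // First child whatever its tag: prints "Label"
        System.out.println(firstMatchText(html, "div.bigClass > *:first-child"));
    }
}
```

If the goal is the first child whatever its tag, the last selector is the closest equivalent of child(0). If the goal is the first link, a:first-of-type is the safer choice.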
