Question

I'm using jsoup to extract information from an XML document that looks like this:

<results>
  <status>OK</status>
  <totalTransactions>1</totalTransactions>
  <language>english</language>
  <taxonomy>
    <element>
      <label>/business and industrial/advertising and marketing/telemarketing</label>
      <score>0.805156</score>
    </element>
    <element>
      <confident>no</confident>
      <label>/automotive and vehicles/certified pre-owned</label>
      <score>0.23886</score>
    </element>
    <element>
      <confident>no</confident>
      <label>/shopping/retail</label>
      <score>0.156721</score>
    </element>
  </taxonomy>
</results>

What I want from the XML is the text inside the label and score tags. So:

Document doc = Jsoup.parse(job[1], "", Parser.xmlParser());

String status = doc.select("status").text();

if (status.equals("OK")) {
    Elements elements = doc.getElementsByTag("element");

    for (Element e : elements) {
        System.out.println(e.select("label").text() + ","
                + e.select("score").text());
    }
}

The program only reads the status tag. After that, no text is returned.

Thanks for your help.

Answer 1

String status = doc.select("status").text();

This will fail if your document contains more than one status element. Be more explicit and use the first one:

String status = doc.select("status").first().text();
//                                   ^^^^^^^

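The same applies to the other selections. The select() method always returns an Elements object (a list of Element), so text() returns the combined text of all matched elements.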
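To make the difference concrete, here is a minimal sketch. The class name SelectTextDemo, its helper methods, and the two-status XML string are made up for illustration and are not from the question. It assumes jsoup is on the classpath.

```java
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;
import org.jsoup.parser.Parser;

public class SelectTextDemo {

    // Combined text of every <status> element found in the document.
    static String allStatusText(String xml) {
        Document doc = Jsoup.parse(xml, "", Parser.xmlParser());
        return doc.select("status").text();
    }

    // Text of the first <status> element only, or null if there is none.
    static String firstStatusText(String xml) {
        Document doc = Jsoup.parse(xml, "", Parser.xmlParser());
        Element first = doc.select("status").first();
        return first == null ? null : first.text();
    }

    public static void main(String[] args) {
        // Hypothetical input with two <status> elements.
        String xml = "<results><status>OK</status><status>OK</status></results>";

        // The combined text covers both elements, so it is not simply "OK".
        System.out.println("OK".equals(allStatusText(xml)));
        // The first element on its own is "OK".
        System.out.println("OK".equals(firstStatusText(xml)));
    }
}
```

Note that first() returns null when nothing matches, so a null check like the one in firstStatusText is worth keeping if the status element may be missing.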

Your code is fine. Running it with the XML shown in place of job[1] returns:

/business and industrial/advertising and marketing/telemarketing,0.805156
/automotive and vehicles/certified pre-owned,0.23886
/shopping/retail,0.156721

Have you checked whether the XML you actually receive is what you expect?
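One way to check is to run the same extraction against a known-good string and compare it with what job[1] really contains. The following is only a sketch: the class name TaxonomyCheck and the helper extractRows are hypothetical, the XML is a shortened copy of the sample from the question, and jsoup is assumed to be on the classpath.

```java
import java.util.ArrayList;
import java.util.List;

import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;
import org.jsoup.parser.Parser;

public class TaxonomyCheck {

    // Returns one "label,score" row per <element>, or an empty list when
    // the input is blank or the first <status> is not "OK".
    static List<String> extractRows(String xml) {
        List<String> rows = new ArrayList<>();
        if (xml == null || xml.trim().isEmpty()) {
            return rows;
        }
        Document doc = Jsoup.parse(xml, "", Parser.xmlParser());
        Element status = doc.select("status").first();
        if (status == null || !"OK".equals(status.text())) {
            return rows;
        }
        for (Element e : doc.getElementsByTag("element")) {
            rows.add(e.select("label").text() + "," + e.select("score").text());
        }
        return rows;
    }

    public static void main(String[] args) {
        // Shortened copy of the XML from the question, standing in for job[1].
        String xml = "<results><status>OK</status><taxonomy>"
                + "<element><label>/shopping/retail</label><score>0.156721</score></element>"
                + "</taxonomy></results>";

        // Printing the raw input first shows whether it is the expected XML at all.
        System.out.println("Raw input: " + xml);
        for (String row : extractRows(xml)) {
            System.out.println(row);
        }
    }
}
```

If the raw input printed for job[1] is empty, truncated, or not the XML shown in the question, the problem lies in how the response is obtained rather than in the jsoup code.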
