Question

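I am trying to fetch a URL, get the HTML from it, and use XPath to extract certain values. The HTML I get back is fine, and JTidy appears to be cleaning it up properly. However, when I try to get the values I need with XPath, I get an empty NodeList. I know my XPath expression is correct; I have tested it in other ways. What is wrong with this code? Thanks for your help.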

String url_string = base_url + countries[c];
URL url = new URL(url_string);

Tidy tidy = new Tidy();
tidy.setShowWarnings(false);
tidy.setXHTML(true);
tidy.setMakeClean(true);
Document doc = tidy.parseDOM(url.openStream(), null);
//tidy.pprint(doc, System.out);

String xpath_string = "id('catlisting')//a";
XPath xpath = XPathFactory.newInstance().newXPath();
XPathExpression expr = xpath.compile(xpath_string);

NodeList nodes = (NodeList)expr.evaluate(doc, XPathConstants.NODESET);
System.out.println("size="+nodes.getLength());
for (int r=0; r<nodes.getLength(); r++) {
    System.out.println(nodes.item(r).getNodeValue()); 
}

Answer 1

Try "//div[@id='catlisting']//a"
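
A likely reason the attribute predicate helps: the XPath id() function only matches attributes that a DTD declares as type ID, and a DOM built without that type information has no such attributes, so id('catlisting') selects nothing. The sketch below illustrates the difference using only the JDK's own parser rather than JTidy; the class name XPathIdDemo, the select helper, and the SAMPLE markup are hypothetical stand-ins for the real page, not part of the original code.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;

public class XPathIdDemo {
    // Hypothetical stand-in for the cleaned-up page. There is no DTD,
    // so no attribute is known to be of type ID.
    static final String SAMPLE =
        "<html><body><div id=\"catlisting\">"
        + "<a href=\"/a\">First</a><a href=\"/b\">Second</a>"
        + "</div></body></html>";

    // Parses the markup and evaluates the XPath expression as a node set.
    static NodeList select(String xml, String expression) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance()
            .newDocumentBuilder()
            .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
        XPath xpath = XPathFactory.newInstance().newXPath();
        return (NodeList) xpath.evaluate(expression, doc, XPathConstants.NODESET);
    }

    public static void main(String[] args) throws Exception {
        // Relies on ID typing from a DTD, which this document does not have.
        NodeList byIdFunction = select(SAMPLE, "id('catlisting')//a");
        // Compares the attribute value directly, so no DTD is needed.
        NodeList byPredicate = select(SAMPLE, "//div[@id='catlisting']//a");

        System.out.println("id() size=" + byIdFunction.getLength());
        System.out.println("predicate size=" + byPredicate.getLength());

        for (int r = 0; r < byPredicate.getLength(); r++) {
            // getNodeValue() is null for element nodes; read the text instead.
            System.out.println(byPredicate.item(r).getTextContent());
        }
    }
}
```

Note that the loop reads getTextContent() because getNodeValue() returns null for element nodes. Whether JTidy's DOM implementation supports getTextContent() is not confirmed here; if it does not, reading the value of each link's first child text node is a possible alternative.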
