I am reading an HTML file from the internet, and when I read it, my console output looks like this:
<string>
<String1>
text
</String1>
<level2>
text2
</level2>
<level3>
text3
</level3>
<level4>
text4
</level4>
<level5>
TEXT
</level5>
</string>
<string>
<String2>
text
</String2>
<level2>
text2
</level2>
<level3>
text3
</level3>
<level4>
text4
</level4>
<level5>
THIS TEXT
</level5>
</string>
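How can I access the level5 text in the second string? I have been struggling with this all day without any luck, and I would really appreciate input from anyone who knows more about it.
Here is my code: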
String line = null;
try {
    // FileReader reads text files in the default encoding.
    FileReader fileReader = new FileReader(String.valueOf(doc));
    // Always wrap FileReader in BufferedReader.
    BufferedReader bufferedReader = new BufferedReader(fileReader);
    while ((line = bufferedReader.readLine()) != null) {
        Elements tdElements = doc.getElementsByTag("level1");
        for (Element element : tdElements) {
            // Print the value of the element
            System.out.println(element.text());
        }
    }
    // Always close files.
    bufferedReader.close();
} catch (FileNotFoundException ex) {
    System.out.println(
        "Unable to open file '" +
        doc + "'");
} catch (IOException ex) {
    System.out.println(
        "Error reading file '"
        + doc + "'");
    // Or we could just do this:
    // ex.printStackTrace();
}
}
//
catch (IOException e) {
    e.printStackTrace();
}
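Answer 0 (score: 1)
The code below uses JSoup to parse the text you quoted. The variable 'textToParse' is the HTML you provided above. You can use JSoup's pseudo selectors to find an element at a specific position in the DOM tree; note that :eq(n) is zero-based, so :eq(1) matches the second element. I hope this is what you are after.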
Document document = Jsoup.parse(textToParse);
Elements stringTags = document.select("string:eq(1)");
for (Element e : stringTags) {
    System.out.println(e.select("level5").text());
}
// Output: THIS TEXT
Answer 1 (score: 1)
You can use a CSS selector here:
string:nth-of-type(2) > level5
DEMO: http://try.jsoup.org/~8w_pfCxDhJwIseTKiKsQjQJOBRs
string:nth-of-type(2) /* Select the 2nd string node in document... */
> level5 /* ... then select all "level5" child nodes */
Document doc = ...
Element level5Node = doc.select("string:nth-of-type(2) > level5").first();
if (level5Node == null) {
    throw new RuntimeException("Unable to locate level5 text...");
}
System.out.println(level5Node.text()); // THIS TEXT
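Answer 2 (score: 0)
Solution 1: once it is wrapped in a single root element, your HTML is valid XML, so you can use XML tools:
You can get the second level5 with XPath: "//string[2]/level5"
Solution 2: parse it with Jsoup to obtain a document, then use XPath as in Solution 1.
For using XPath / XSoup with Jsoup, see: Does jsoup support xpath?
Solution 1: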
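import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

String xml = "<root>" + yourXml + "</root>"; // yourXml: a String holding the markup from the question
DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
Document document = builder.parse(new InputSource(new StringReader(xml)));
XPath xPath = XPathFactory.newInstance().newXPath();
String expression = "//string[2]/level5";
String value = xPath.evaluate(expression, document);
System.out.println("EVALUATE:" + value);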
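For anyone who wants to try the XPath route without extra dependencies, here is a minimal self-contained sketch that uses only the JDK's built-in XML classes. The class name Level5XPath, the helper secondLevel5 and the shortened sample markup are my own assumptions for illustration, not part of the question. It also assumes the fragment is well-formed XML once wrapped in a synthetic root element.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

class Level5XPath {

    // Hypothetical helper: wraps the fragment in a synthetic <root> element
    // (XML allows only one root) and returns the text of the level5 child
    // of the second <string> element, or "" if nothing matches.
    static String secondLevel5(String fragment) throws Exception {
        String xml = "<root>" + fragment + "</root>";
        DocumentBuilder builder =
                DocumentBuilderFactory.newInstance().newDocumentBuilder();
        Document document = builder.parse(new InputSource(new StringReader(xml)));
        XPath xPath = XPathFactory.newInstance().newXPath();
        // XPath positions are 1-based, so string[2] is the second <string>.
        String value = xPath.evaluate("//string[2]/level5", document);
        // Trim the line breaks that surround the text in the original markup.
        return value.trim();
    }

    public static void main(String[] args) throws Exception {
        // Shortened version of the markup from the question (assumption).
        String fragment =
                "<string><String1>text</String1><level5>\nTEXT\n</level5></string>"
              + "<string><String2>text</String2><level5>\nTHIS TEXT\n</level5></string>";
        System.out.println(secondLevel5(fragment)); // prints: THIS TEXT
    }
}
```

Note the indexing difference between the approaches: XPath's string[2] and CSS's :nth-of-type(2) count from 1, while JSoup's :eq(1) counts from 0, so all three refer to the second string element.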