String body = "<br>";
Document document = Jsoup.parseBodyFragment(body);
document.outputSettings().escapeMode(EscapeMode.xhtml);
String str = document.body().html();
System.out.println(str);
Expected: <br />
Actual: <br>
Can Jsoup convert HTML to XHTML?
Answer 0 (score: 27)
See Document.OutputSettings.Syntax.xml:
private String toXHTML( String html ) {
    final Document document = Jsoup.parse(html);
    document.outputSettings().syntax(Document.OutputSettings.Syntax.xml);
    return document.html();
}
Answer 1 (score: 7)
You have to tell Jsoup whether you want the output serialized with HTML or XML syntax.
public String parserXHtml(String html) {
    org.jsoup.nodes.Document document = Jsoup.parseBodyFragment(html);
    // Serialize with XML syntax so that void elements are self-closed, e.g. <br />
    document.outputSettings().syntax(org.jsoup.nodes.Document.OutputSettings.Syntax.xml);
    document.outputSettings().charset("UTF-8");
    return document.toString();
}
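If you want to confirm that the serialized markup really is well-formed XML, one option is to run it through the JDK's own XML parser. The following is only a rough sketch: the class WellFormedCheck and the method isWellFormed are hypothetical names and are not part of Jsoup. It assumes the input is a body fragment, so it wraps the fragment in a dummy <root> element before parsing.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical helper for illustration; not part of Jsoup or JTidy.
final class WellFormedCheck {

    // Returns true if the fragment parses as XML once wrapped in a dummy root element.
    static boolean isWellFormed(String fragment) {
        try {
            DocumentBuilder builder =
                    DocumentBuilderFactory.newInstance().newDocumentBuilder();
            // Replace the default error handler, which would print to stderr.
            builder.setErrorHandler(new DefaultHandler());
            builder.parse(new InputSource(
                    new StringReader("<root>" + fragment + "</root>")));
            return true;
        } catch (SAXException e) {
            // The parser rejected the input: not well-formed XML.
            return false;
        } catch (ParserConfigurationException | IOException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(isWellFormed("<br />")); // true
        System.out.println(isWellFormed("<br>"));   // false
    }
}
```

Note that this only checks well-formedness. HTML named entities such as &nbsp; are not defined in plain XML, so a fragment containing them would be rejected as well.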
Answer 2 (score: 2)
You can do this with the JTidy API, using jtidy-r938.jar.
The following method converts an HTML file to XHTML:
// Requires java.io.* and org.w3c.tidy.Tidy (JTidy).
public static String getXHTMLFromHTML(String inputFile,
        String outputFile) throws IOException {
    // try-with-resources closes both streams, even if parsing fails.
    // A missing input file now propagates as an exception instead of being
    // printed and ignored.
    try (InputStream is = new FileInputStream(inputFile);
         OutputStream fos = new FileOutputStream(outputFile)) {
        Tidy tidy = new Tidy();
        tidy.setXHTML(true); // emit XHTML instead of HTML
        tidy.parse(is, fos);
    }
    return outputFile;
}