XMLUnit diff with HTML files

Date: 2014-07-07 23:50:51

Tags: java html xml xhtml xmlunit

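I am trying to use XMLUnit's Diff with two HTML documents. To do so, I convert both documents to strings and then construct a Diff object from the two strings.

However, this throws the following SAXException: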

[Fatal Error] :1:177: The element type "br" must be terminated by the matching end-tag "</br>".
org.xml.sax.SAXParseException: The element type "br" must be terminated by the matching end-tag "</br>".
    at org.apache.xerces.parsers.DOMParser.parse(Unknown Source)
    at org.apache.xerces.jaxp.DocumentBuilderImpl.parse(Unknown Source)
    at org.custommonkey.xmlunit.XMLUnit.buildDocument(XMLUnit.java:383)
    at org.custommonkey.xmlunit.XMLUnit.buildDocument(XMLUnit.java:370)
    at org.custommonkey.xmlunit.Diff.<init>(Diff.java:101)
    at org.custommonkey.xmlunit.Diff.<init>(Diff.java:93)
    at controllers.Api.diffUrls(Api.java:292)
    at Routes$$anonfun$routes$1$$anonfun$applyOrElse$8$$anonfun$apply$8.apply(routes_routing.scala:165)
    at Routes$$anonfun$routes$1$$anonfun$applyOrElse$8$$anonfun$apply$8.apply(routes_routing.scala:165)
    at play.core.Router$HandlerInvoker$$anon$7$$anon$2.invocation(Router.scala:183)
    at play.core.Router$Routes$$anon$1.invocation(Router.scala:377)
    at play.core.j.JavaAction$$anon$1.call(JavaAction.scala:56)
    at play.core.j.JavaAction$$anon$3.apply(JavaAction.scala:91)
    at play.core.j.JavaAction$$anon$3.apply(JavaAction.scala:90)
    at play.core.j.FPromiseHelper$$anonfun$flatMap$1.apply(FPromiseHelper.scala:82)
    at play.core.j.FPromiseHelper$$anonfun$flatMap$1.apply(FPromiseHelper.scala:82)
    at scala.concurrent.Future$$anonfun$flatMap$1.apply(Future.scala:278)
    at scala.concurrent.Future$$anonfun$flatMap$1.apply(Future.scala:274)
    at scala.concurrent.impl.CallbackRunnable.run(Promise.scala:29)
    at play.core.j.HttpExecutionContext$$anon$2.run(HttpExecutionContext.scala:37)
    at akka.dispatch.TaskInvocation.run(AbstractDispatcher.scala:42)
    at akka.dispatch.ForkJoinExecutorConfigurator$AkkaForkJoinTask.exec(AbstractDispatcher.scala:386)
    at scala.concurrent.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260)
    at scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1339)
    at scala.concurrent.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979)
    at scala.concurrent.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107)
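
The same failure can be reproduced without XMLUnit, because Diff hands the strings to an ordinary XML parser. The following is a minimal sketch using only the JDK's JAXP parser; the class name BrRepro and the helper isWellFormed are made up for illustration and are not part of XMLUnit or of my application.

```java
import java.io.StringReader;

import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;

import org.xml.sax.InputSource;
import org.xml.sax.SAXParseException;
import org.xml.sax.helpers.DefaultHandler;

public class BrRepro {

    // Hypothetical helper: returns true if the markup parses as well-formed XML.
    static boolean isWellFormed(String markup) throws Exception {
        DocumentBuilder builder =
                DocumentBuilderFactory.newInstance().newDocumentBuilder();
        // DefaultHandler rethrows fatal errors without printing "[Fatal Error]".
        builder.setErrorHandler(new DefaultHandler());
        try {
            builder.parse(new InputSource(new StringReader(markup)));
            return true;
        } catch (SAXParseException e) {
            // For an unclosed <br>, this is the exception shown in the trace above.
            return false;
        }
    }

    public static void main(String[] args) throws Exception {
        // HTML-style void element: rejected by an XML parser.
        System.out.println(isWellFormed("<p>line one<br>line two</p>"));
        // XHTML-style self-closing element: accepted.
        System.out.println(isWellFormed("<p>line one<br/>line two</p>"));
    }
}
```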

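So I need to convert this HTML into well-formed XML. I see that XMLUnit provides HTMLDocumentBuilder, which appears to offer a way to do this, but only for XPath evaluation. What is a simple way to convert the HTML into well-formed XML so that I can run an XMLUnit diff?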

1 answer:

Answer 0: (score: 1)

You can use HTMLDocumentBuilder's parse method to create a Document from your HTML input; Diff and the other parts of XMLUnit's difference engine will happily work with a Document.
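
A minimal sketch of that approach, assuming XMLUnit 1.x (the org.custommonkey.xmlunit package seen in the stack trace) is on the classpath. The class name HtmlDiffSketch, the helper names and the sample markup are made up for illustration; the methods declare throws Exception so that the sketch does not depend on the exact checked exceptions each call declares.

```java
import org.custommonkey.xmlunit.Diff;
import org.custommonkey.xmlunit.HTMLDocumentBuilder;
import org.custommonkey.xmlunit.TolerantSaxDocumentBuilder;
import org.custommonkey.xmlunit.XMLUnit;
import org.w3c.dom.Document;

public class HtmlDiffSketch {

    // Hypothetical helper: builds a DOM Document from lenient HTML.
    static Document toDocument(String html) throws Exception {
        TolerantSaxDocumentBuilder tolerant =
                new TolerantSaxDocumentBuilder(XMLUnit.newTestParser());
        HTMLDocumentBuilder builder = new HTMLDocumentBuilder(tolerant);
        return builder.parse(html);
    }

    // Hypothetical helper: diffs two HTML strings via their Documents
    // instead of passing the raw strings to Diff.
    static Diff diffHtml(String controlHtml, String testHtml) throws Exception {
        return new Diff(toDocument(controlHtml), toDocument(testHtml));
    }

    public static void main(String[] args) throws Exception {
        String control = "<html><body><p>Hello<br>world</p></body></html>";
        String test = "<html><body><p>Hello<br>there</p></body></html>";
        Diff diff = diffHtml(control, test);
        System.out.println("similar: " + diff.similar());
        System.out.println(diff);
    }
}
```

In the code from the question, this would replace the Diff constructed from two strings at Api.diffUrls.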

Note that HTMLDocumentBuilder is a bit of a hack, and you may be better off using a library that specializes in HTML => XML conversion, such as jTidy.
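
A rough sketch of the jTidy route, assuming a jTidy release that offers Tidy.parse(Reader, Writer) (r938, for example) plus XMLUnit 1.x on the classpath. The class and helper names are made up for illustration, and the option choices are only one plausible configuration: XML output, no DOCTYPE so the parser has no DTD to fetch, and numeric entities so that named HTML entities do not break XML parsing.

```java
import java.io.StringReader;
import java.io.StringWriter;

import org.custommonkey.xmlunit.Diff;
import org.w3c.tidy.Tidy;

public class TidyDiffSketch {

    // Hypothetical helper: asks jTidy to rewrite HTML as well-formed XML text.
    static String toXml(String html) {
        Tidy tidy = new Tidy();
        tidy.setXmlOut(true);         // emit well-formed XML
        tidy.setDocType("omit");      // no DOCTYPE, so no DTD lookup later
        tidy.setNumEntities(true);    // numeric entities instead of named ones
        tidy.setTidyMark(false);      // no generator meta tag in the output
        tidy.setQuiet(true);
        tidy.setShowWarnings(false);
        StringWriter out = new StringWriter();
        tidy.parse(new StringReader(html), out);
        return out.toString();
    }

    // Hypothetical helper: diffs the cleaned-up XML strings with XMLUnit.
    static Diff diffHtml(String controlHtml, String testHtml) throws Exception {
        return new Diff(toXml(controlHtml), toXml(testHtml));
    }

    public static void main(String[] args) throws Exception {
        String control = "<html><head><title>t</title></head>"
                + "<body><p>Hello<br>world</p></body></html>";
        String test = "<html><head><title>t</title></head>"
                + "<body><p>Hello<br>there</p></body></html>";
        System.out.println(diffHtml(control, test));
    }
}
```

jTidy may produce no output at all for input it considers unrecoverable, so real code would want to check for an empty result before handing it to Diff.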