How do I ignore the DTD in Hive XPath?

Time: 2015-07-02 15:05:31

Tags: hadoop xpath hive

How can I make sure Hive ignores the DTD?

The following query works in Hive:

select xpath_string('<a><b>bb</b><c>cc</c></a>', 'a/b') from src limit 1;

But this query, which adds a DTD declaration, does not:

select xpath_string('<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd"><a><b>bb</b><c>cc</c></a>', 'a/b') from src limit 1;

It fails with this error:

Error occurred executing hive query: Error while compiling statement: FAILED: SemanticException [Error 10014]: Line 1:7 Wrong arguments ''a/b'': org.apache.hadoop.hive.ql.metadata.HiveException: Unable to execute method public org.apache.hadoop.io.Text org.apache.hadoop.hive.ql.udf.xml.UDFXPathString.evaluate(java.lang.String,java.lang.String) on object org.apache.hadoop.hive.ql.udf.xml.UDFXPathString@70b25313 of class org.apache.hadoop.hive.ql.udf.xml.UDFXPathString with arguments {<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd"><a><b>bb</b><c>cc</c></a>:java.lang.String, a/b:java.lang.String} of size 2
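
A possible workaround, not confirmed anywhere in this thread: the failure is probably caused by the XML parser behind xpath_string trying to resolve the external DTD named in the DOCTYPE declaration, so stripping the declaration before evaluating the XPath may avoid it. The sketch below uses Hive's built-in regexp_replace; the pattern <!DOCTYPE[^>]*> is an assumption that only fits declarations without an internal subset (no [ ... ] section containing a > character).

```sql
-- Sketch: remove the DOCTYPE declaration, then evaluate the XPath as before.
-- Assumes the declaration has no internal subset, so it ends at the first '>'.
select xpath_string(
         regexp_replace(
           '<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd"><a><b>bb</b><c>cc</c></a>',
           '<!DOCTYPE[^>]*>',
           ''),
         'a/b')
from src limit 1;
```

After the replacement, the first argument is the same string as in the working query above. If the documents rely on entities defined in the DTD, dropping the declaration would break them, so this only suits cases like the one shown.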

0 answers:

No answers