Hive XML SerDe: Failed with exception java.io.IOException:java.lang.NullPointerException

Time: 2017-07-18 06:20:37

Tags: xml hive hive-serde

I am using the XML SerDe to create an external table in Hive (Hive 2.1.1-mapr-1703) from an XML file. The file is the XML example from the W3C consortium.

This is the code I use to create the table:

add jar /mapr/localpath/hivexmlserde-1.0.5.3.jar;
USE my_db;
CREATE EXTERNAL TABLE frank_books (
category STRING,
title STRING,
language STRING,
year BIGINT
)
ROW FORMAT SERDE 'com.ibm.spss.hive.serde2.xml.XmlSerDe'
WITH SERDEPROPERTIES (
"column.xpath.category" = "/book/@category",
"column.xpath.title"    = "/book/title/text()",
"column.xpath.language" = "/book/title/@lang",
"column.xpath.year"     = "/book/year/text()"
)
STORED AS
INPUTFORMAT 'com.ibm.spss.hive.serde2.xml.XmlInputFormat' 
OUTPUTFORMAT 'org.apache.hadoop.hive.ql.io.IgnoreKeyTextOutputFormat'
LOCATION '/mapr/localpath/database_files/xml_example'
TBLPROPERTIES (
"xmlinput.start" = "<book category",
"xmlinput.stop" = "</book>"
);

The table itself exists, because the describe statement does not raise an error:

describe frank_books;

A simple select statement like the following results in a NullPointerException:

select * from my_db.frank_books;

This is the output:

OK
Failed with exception java.io.IOException:java.lang.NullPointerException
Time taken: 1.117 seconds

Can anyone help and explain this error to me?

Thanks, Frank

1 Answer:

Answer 0: (score: 0)

Could this be MapR-specific?

hive> DROP TABLE IF EXISTS xml_45158949;
OK
Time taken: 0.977 seconds
hive> 
    > CREATE  TABLE xml_45158949(
    > category STRING,
    > title STRING,
    > language STRING,
    > year BIGINT
    > )
    > ROW FORMAT SERDE 'com.ibm.spss.hive.serde2.xml.XmlSerDe'
    > WITH SERDEPROPERTIES(
    > "column.xpath.category" = "/book/@category",
    > "column.xpath.title"    = "/book/title/text()",
    > "column.xpath.language" = "/book/title/@lang",
    > "column.xpath.year"     = "/book/year/text()"
    >   )
    > STORED AS 
    > INPUTFORMAT 'com.ibm.spss.hive.serde2.xml.XmlInputFormat' 
    > OUTPUTFORMAT    'org.apache.hadoop.hive.ql.io.IgnoreKeyTextOutputFormat' 
    > TBLPROPERTIES (
    > "xmlinput.start"="<book category",
    > "xmlinput.end"="</book>"
    > );
OK
Time taken: 0.243 seconds
hive> 
    > load data local inpath '/Users/dvasilen/Misc/XML/45158949.xml' OVERWRITE into table xml_45158949;
Loading data to table default.xml_45158949
OK
Time taken: 0.153 seconds
hive> 
    > select * from xml_45158949;
OK
cooking     Everyday Italian   en  2005
children   Harry Potter    en  2005
web     XQuery Kick Start  en  2003
web     Learning XML   en  2003
Time taken: 0.08 seconds, Fetched: 4 row(s)
hive> 

It works for me.
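
One difference between the two statements may be worth ruling out before blaming MapR: the table in the question sets "xmlinput.stop", while the working table above sets "xmlinput.end". If the input format only reads the end tag from "xmlinput.end", a missing value there is a plausible cause of the NullPointerException. This is an assumption and has not been verified on the MapR build. A minimal sketch for testing it, reusing the database and table names from the question:

```sql
-- Sketch only: assumes my_db.frank_books from the question already exists
-- and that the SerDe jar has been added to the session as shown there.
USE my_db;

-- Inspect which table properties are currently stored.
SHOW TBLPROPERTIES frank_books;

-- Set the end-tag property under the name used by the working example.
ALTER TABLE frank_books SET TBLPROPERTIES ("xmlinput.end" = "</book>");

-- Re-run the query that failed.
SELECT * FROM frank_books;
```

If the select still fails after this change, the property name is not the cause and the MapR-specific path (for example how the LOCATION of the external table is resolved) becomes the more likely suspect.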