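I am using the XML SerDe to create an external table in Hive (Hive 2.1.1-mapr-1703) from an XML file. The file is the XML example from the W3C consortium.
This is the code I use to create the table: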
add jar /mapr/localpath/hivexmlserde-1.0.5.3.jar;
USE my_db;
CREATE EXTERNAL TABLE frank_books (
category STRING,
title STRING,
language STRING,
year BIGINT
)
ROW FORMAT SERDE 'com.ibm.spss.hive.serde2.xml.XmlSerDe'
WITH SERDEPROPERTIES (
"column.xpath.category" = "/book/@category",
"column.xpath.title" = "/book/title/text()",
"column.xpath.language" = "/book/title/@lang",
"column.xpath.year" = "/book/year/text()"
)
STORED AS
INPUTFORMAT 'com.ibm.spss.hive.serde2.xml.XmlInputFormat'
OUTPUTFORMAT 'org.apache.hadoop.hive.ql.io.IgnoreKeyTextOutputFormat'
LOCATION '/mapr/localpath/database_files/xml_example'
TBLPROPERTIES (
"xmlinput.start" = "<book category",
"xmlinput.stop" = "</book>"
);
The table itself exists, because the describe statement does not raise an error:
describe frank_books;
A simple select statement like the one below, however, causes a NullPointerException:
select * from my_db.frank_books;
This is the output:
OK
Failed with exception java.io.IOException:java.lang.NullPointerException
Time taken: 1.117 seconds
Can anyone help and explain the error to me?
Thanks, Frank
Answer 0 (score: 0)
Could this be specific to MapR?
hive> DROP TABLE IF EXISTS xml_45158949;
OK
Time taken: 0.977 seconds
hive>
> CREATE TABLE xml_45158949(
> category STRING,
> title STRING,
> language STRING,
> year BIGINT
> )
> ROW FORMAT SERDE 'com.ibm.spss.hive.serde2.xml.XmlSerDe'
> WITH SERDEPROPERTIES(
> "column.xpath.category" = "/book/@category",
> "column.xpath.title" = "/book/title/text()",
> "column.xpath.language" = "/book/title/@lang",
> "column.xpath.year" = "/book/year/text()"
> )
> STORED AS
> INPUTFORMAT 'com.ibm.spss.hive.serde2.xml.XmlInputFormat'
> OUTPUTFORMAT 'org.apache.hadoop.hive.ql.io.IgnoreKeyTextOutputFormat'
> TBLPROPERTIES (
> "xmlinput.start"="<book category",
> "xmlinput.end"="</book>"
> );
OK
Time taken: 0.243 seconds
hive>
> load data local inpath '/Users/dvasilen/Misc/XML/45158949.xml' OVERWRITE into table xml_45158949;
Loading data to table default.xml_45158949
OK
Time taken: 0.153 seconds
hive>
> select * from xml_45158949;
OK
cooking Everyday Italian en 2005
children Harry Potter en 2005
web XQuery Kick Start en 2003
web Learning XML en 2003
Time taken: 0.08 seconds, Fetched: 4 row(s)
hive>
It seems to work for me.
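One difference between the two DDL statements is worth checking before blaming MapR: the table in the question sets "xmlinput.stop", while the table that works here sets "xmlinput.end". Assuming the input format reads its closing tag from "xmlinput.end" (as the working example suggests), a missing end tag could plausibly explain the NullPointerException. A minimal sketch to test that assumption on the table from the question:

```sql
-- Assumption: the NullPointerException is caused by the unset "xmlinput.end" property.
-- Set the end-tag property under the same name the working table uses.
ALTER TABLE my_db.frank_books SET TBLPROPERTIES ("xmlinput.end" = "</book>");

-- Re-run the query that failed before.
SELECT * FROM my_db.frank_books;
```

If the query still fails after this change, the property name is not the cause and the MapR-specific setup remains the more likely suspect.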