Importing XML data into Hive, but the table shows nothing

Asked: 2015-06-10 17:47:07

Tags: hive

I am trying to import data from an XML file into Hive.

The data in my file has the following format:

<review>
    <unique_id>0206cs23</unnique_id> 
    <product_name>Abcd</product_name>
    <product_type>abcd122</product_type>
    <rating>1</rating><title>ertn</title>
    <date>23/03/2012</date>
    <reviewer>mr. Abcd</reviewer>
    <reviewer_location>North Carolina,  USA</reviewer_location
    <review_text>I've always held the</review_text>
</review>

This is the table definition I am using:

CREATE EXTERNAL TABLE RREVIEW (unique_id BIGINT, product_name string, product_type string, rating float, title string, review_date string, reviewer string, reviewer_location string, review_text string)
ROW FORMAT SERDE 'com.ibm.spss.hive.serde2.xml.XmlSerDe'
WITH SERDEPROPERTIES (
"column.xpath.unique_id"="/review/unique_id/text()",
"column.xpath.product_name"="/review/product_name/text()",
"column.xpath.product_type"="/review/product_type/text()",
"column.xpath.rating"="/review/rating/text()",
"column.xpath.title"="/review/title/text()",
"column.xpath.review_date"="/review/date/text()",   
"column.xpath.reviewer"="/review/reviewer/text()",
"column.xpath.reviewer_location"="/review/reviewer_location/text()",
"column.xpath.review_text"="/review/review_text/text()"
)
STORED AS
INPUTFORMAT 'com.ibm.spss.hive.serde2.xml.XmlInputFormat'
OUTPUTFORMAT 'org.apache.hadoop.hive.ql.io.IgnoreKeyTextOutputFormat'
TBLPROPERTIES (
"xmlinput.start"="<unique_id> ",
"xmlinput.end"="</review>"
);

After loading the data (using: LOAD DATA INPATH 'hdfs_file_or_directory_path' [OVERWRITE] INTO TABLE tablename), it shows this message:

Loading data to table default.rreview
Table default.rreview stats: [numFiles=1, numRows=0, totalSize=226671853, rawDataSize=0]
OK
Time taken: 0.373 seconds

But when I query the table, it returns no rows. Any help would be appreciated.
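
One thing that can be checked offline is whether the "xmlinput.start" and "xmlinput.end" markers from the table definition actually bracket a record in the file. The sketch below is a hedged illustration in Python, not the SerDe's real code: `split_records` is a hypothetical helper, and it assumes the input format searches for the start marker as a literal string and then reads up to the end marker. `SAMPLE` is an abbreviated copy of the record shown above, with the closing tags written out cleanly.

```python
# Hypothetical helper: mimics a reader that scans for a literal start marker,
# then for a literal end marker, and emits the text in between as one record.
# This is an assumption about how records are split, not the library's code.
def split_records(text, start, end):
    records = []
    pos = 0
    while True:
        begin = text.find(start, pos)
        if begin == -1:
            break  # no further start marker: stop scanning
        finish = text.find(end, begin + len(start))
        if finish == -1:
            break  # start marker without a matching end marker
        finish += len(end)
        records.append(text[begin:finish])
        pos = finish
    return records


# Abbreviated sample record (assumed well-formed for this check).
SAMPLE = (
    "<review>\n"
    "    <unique_id>0206cs23</unique_id>\n"
    "    <product_name>Abcd</product_name>\n"
    "    <rating>1</rating><title>ertn</title>\n"
    "</review>\n"
)

# Markers as written in the table definition (note the trailing space).
as_posted = split_records(SAMPLE, "<unique_id> ", "</review>")

# Markers that bracket the whole record element.
whole_record = split_records(SAMPLE, "<review>", "</review>")

print(len(as_posted), len(whole_record))
```

If that assumption holds, the start marker `<unique_id> ` (with its trailing space) never occurs in the data, so no records would be produced, whereas `<review>` matches the enclosing element of each record. This may not be the only cause of the empty result, so treat it as a first thing to rule out.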

0 answers:

No answers yet.