I have an XML file, and I am using XmlSerDe to load its values into Hive. In the XML below, there is an attribute on both the AccountSetup and Accounts tags:
<AccountSetup xmlns:xsi="test">
<Accounts xmlns="http://acct.com/institutional">
<Account>
<Id>12346</Id>
<AcctNbr>AAAAAAAAAA</AcctNbr>
<RegTypeCd>XXXX</RegTypeCd>
<ClassCd>35</ClassCd>
</Account>
</Accounts>
</AccountSetup>
And this is my create table query:
CREATE TABLE suitability(Id STRING, AcctNbr STRING, RegTypeCd STRING)
row format serde 'com.ibm.spss.hive.serde2.xml.XmlSerDe'
WITH SERDEPROPERTIES
("column.xpath.Id"="AccountSetup/Accounts/Account/Id/text()",
"column.xpath.AcctNbr"="AccountSetup/Accounts/Account/AcctNbr/text()",
"column.xpath.RegTypeCd"="AccountSetup/Accounts/Account/RegTypeCd/text()"
)stored as
inputformat 'com.ibm.spss.hive.serde2.xml.XmlInputFormat'
outputformat 'org.apache.hadoop.hive.ql.io.IgnoreKeyTextOutputFormat'
tblproperties
("xmlinput.start"="<AccountSetup ", "xmlinput.end"="</AccountSetup>");
It shows NULL values in Hive:
hive> select * from suitability;
OK
NULL NULL NULL
In another version, I removed the attribute xmlns="http://acct.com/institutional" from the Accounts tag:
<AccountSetup xmlns:xsi="test">
<Accounts>
<Account>
<Id>12346</Id>
<AcctNbr>AAAAAAAAAA</AcctNbr>
<RegTypeCd>XXXX</RegTypeCd>
<ClassCd>35</ClassCd>
</Account>
</Accounts>
</AccountSetup>
I used the same create table query, and it works.
Output:
hive> select * from suitability;
OK
12346 AAAAAAAAAA XXXX
Time taken: 0.423 seconds, Fetched: 1 row(s)
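
The difference between the two versions can be reproduced outside Hive. The following is only a minimal sketch using Python's standard xml.etree.ElementTree, not XmlSerDe's own XPath processor, so it illustrates the general namespace behaviour rather than what the SerDe does internally. The helper find_id and the prefix "a" are hypothetical names introduced for the illustration. It suggests that the default xmlns on Accounts places Accounts and all of its children in that namespace, so a path made of unqualified names no longer matches them.

```python
import xml.etree.ElementTree as ET

# The first XML from the question (default namespace declared on Accounts).
NAMESPACED = """<AccountSetup xmlns:xsi="test">
<Accounts xmlns="http://acct.com/institutional">
<Account>
<Id>12346</Id>
<AcctNbr>AAAAAAAAAA</AcctNbr>
<RegTypeCd>XXXX</RegTypeCd>
<ClassCd>35</ClassCd>
</Account>
</Accounts>
</AccountSetup>"""

# The second XML from the question: same document without the default namespace.
PLAIN = NAMESPACED.replace(' xmlns="http://acct.com/institutional"', "")

# Hypothetical prefix used only to qualify names in the lookup path.
NS = {"a": "http://acct.com/institutional"}


def find_id(xml_text, path, namespaces=None):
    """Return the text of the first element matching path, or None."""
    root = ET.fromstring(xml_text)  # root is the AccountSetup element
    node = root.find(path, namespaces)
    return None if node is None else node.text


# Unqualified names do not match elements that live in a namespace.
print(find_id(NAMESPACED, "Accounts/Account/Id"))            # None
# Without the namespace declaration the same path matches.
print(find_id(PLAIN, "Accounts/Account/Id"))                 # 12346
# Qualifying every step with the namespace matches the first XML again.
print(find_id(NAMESPACED, "a:Accounts/a:Account/a:Id", NS))  # 12346
```

If XmlSerDe evaluates the column.xpath expressions in a namespace-aware way, the same mismatch would explain the NULL values for the first XML. That is an assumption and has not been checked against the SerDe itself.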