I am trying to create a Hive table on EMR to read complex nested JSON. I am using the SerDe jar provided by AWS, which I have copied locally to /usr/lib/hive/lib/.
SerDe:
s3://elasticmapreduce/samples/hive-ads/libs/jsonserde.jar
Hive version:
Hive 2.3.2-amzn-2
Hadoop version:
Hadoop 2.8.3-amzn-0
DDL:
add jar /usr/lib/hive/lib/jsonserde.jar;
CREATE external TABLE complex_json (
docid string,
`user` struct<
id:INT,
username:string,
name:string,
shippingaddress:struct<
address1:string,
address2:string,
city:string,
state:string
>,
orders:array<
struct<
itemid:INT,
orderdate:string
>
>
>
)
ROW FORMAT SERDE 'com.amazon.elasticmapreduce.JsonSerde'
LOCATION 's3://bucket/json/';
JSON:
{
"DocId": "AWS",
"User": {
"Id": 1234,
"Username": "bob1234",
"Name": "Bob",
"ShippingAddress": {
"Address1": "123 Main St.",
"Address2": null,
"City": "Seattle",
"State": "WA"
},
"Orders": [
{
"ItemId": 6789,
"OrderDate": "11/11/2017"
},
{
"ItemId": 4352,
"OrderDate": "12/12/2017"
}
]
}
}
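For context, a table with this schema would typically be queried along the lines of the sketch below. It is only an illustration of how the nested struct and array columns are addressed in HiveQL. It assumes the table can be created successfully and that the SerDe maps the JSON keys onto the lower-case column names in the DDL; the selected columns and the aliases `o` and `ord` are arbitrary choices, not part of the original setup.

```sql
-- Struct fields are reached with dot notation, array elements with an index.
-- `user` is backtick-quoted because it is a reserved word in Hive.
SELECT docid,
       `user`.username,
       `user`.shippingaddress.city,
       `user`.orders[0].itemid
FROM complex_json;

-- One output row per element of the orders array.
SELECT docid,
       ord.itemid,
       ord.orderdate
FROM complex_json
LATERAL VIEW explode(`user`.orders) o AS ord;
```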
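Running the DDL above fails with the following error:
Logging initialized using configuration in file:/etc/hive/conf.dist/hive-.log4j2.properties Async: false
Added [/usr/lib/hive/lib/jsonserde.jar] to class path
Added resources: [/usr/lib/hive/lib/jsonserde.jar]
FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.DDLTask. org/apache/hadoop/hive/serde2/SerDe
Please note: I have also tried the Openx-JsonSerDe and get the same error. Any help is appreciated. TIA!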