A colleague of mine ran into an error while creating a Hive table with the Avro SerDe. Below is the statement he tried, which failed.
DROP TABLE IF EXISTS twitter;
CREATE TABLE twitter
ROW FORMAT SERDE 'org.apache.hadoop.hive.serde2.avro.AvroSerDe'
STORED AS INPUTFORMAT 'org.apache.hadoop.hive.ql.io.avro.AvroContainerInputFormat'
OUTPUTFORMAT 'org.apache.hadoop.hive.ql.io.avro.AvroContainerOutputFormat'
TBLPROPERTIES ('avro.schema.literal'='{
"type" : "record",
"name" : "twitter_schema",
"namespace" : "com.miguno.avro",
"fields" : [ {
"name" : "username",
"type" : "string",
"doc" : "Name of the user account on Twitter.com"},
{"name" : "tweet",
"type" : "string",
"doc" : "The content of the user's Twitter message"},
{"name" : "timestamp",
"type" : "long",
"doc" : "Unix epoch time in seconds"}
],"doc" : "A basic schema for storing Twitter messages"}');
I later found that the apostrophe (') on line 17 was causing the problem:
"doc" : "The content of the user's Twitter message"},
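In the line above, the doc comment contains a word with an apostrophe (user's). Because the whole schema is wrapped in single quotes as the value of avro.schema.literal, that apostrophe ends the string literal early and the rest of the statement no longer parses. I removed the apostrophe and it worked like a charm.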
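If you would rather keep the apostrophe than delete it, escaping it should also work. The sketch below is the same statement with user's written as user\'s. It assumes that Hive's C-style backslash escaping for string literals applies to TBLPROPERTIES values. I have not checked this on every Hive version, so treat it as a starting point rather than a guaranteed fix.

```sql
-- Sketch: same table definition, with the apostrophe escaped instead of removed.
-- Assumption: Hive unescapes \' inside a single-quoted string literal.
DROP TABLE IF EXISTS twitter;
CREATE TABLE twitter
ROW FORMAT SERDE 'org.apache.hadoop.hive.serde2.avro.AvroSerDe'
STORED AS INPUTFORMAT 'org.apache.hadoop.hive.ql.io.avro.AvroContainerInputFormat'
OUTPUTFORMAT 'org.apache.hadoop.hive.ql.io.avro.AvroContainerOutputFormat'
TBLPROPERTIES ('avro.schema.literal'='{
"type" : "record",
"name" : "twitter_schema",
"namespace" : "com.miguno.avro",
"fields" : [ {
"name" : "username",
"type" : "string",
"doc" : "Name of the user account on Twitter.com"},
{"name" : "tweet",
"type" : "string",
"doc" : "The content of the user\'s Twitter message"},
{"name" : "timestamp",
"type" : "long",
"doc" : "Unix epoch time in seconds"}
],"doc" : "A basic schema for storing Twitter messages"}');
```

Another way to sidestep the quoting issue is to keep the schema in a separate .avsc file and point the table at it with the avro.schema.url property instead of avro.schema.literal, so the JSON never passes through a HiveQL string literal.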