Error converting a JSON file to Avro

Time: 2015-02-27 21:50:28

Tags: hadoop serialization mapreduce avro

Following the instructions on the Avro website, I created a JSON file and a schema file, shown below (both are plain text files):

The JSON file:

{"name": "user", "favorite_number": null, "favorite_color": "red"}
{"name": "user", "favorite_number": null, "favorite_color": "green"}
{"name": "user", "favorite_number": null, "favorite_color": "purple"}
{"name": "user", "favorite_number": null, "favorite_color": null}

And the schema file:

{"namespace": "example.avro",
"type": "record",
"name": "User",
"fields": [
{"name": "name", "type": "string"},
{"name": "favorite_number",  "type": ["int", "null"]},
{"name":"favorite_color", "type": ["string", "null"]}
]
}

When I try to create the Avro file with the avro-tools jar, I get the following error:

Exception in thread "main" org.apache.avro.AvroTypeException: Expected start-union. Got VALUE_STRING
    at org.apache.avro.io.JsonDecoder.error(JsonDecoder.java:697)
    at org.apache.avro.io.JsonDecoder.readIndex(JsonDecoder.java:441)
    at org.apache.avro.io.ResolvingDecoder.doAction(ResolvingDecoder.java:290)
    at org.apache.avro.io.parsing.Parser.advance(Parser.java:88)
    at org.apache.avro.io.ResolvingDecoder.readIndex(ResolvingDecoder.java:267)
    at org.apache.avro.generic.GenericDatumReader.read(GenericDatumReader.java:155)
    at org.apache.avro.generic.GenericDatumReader.readField(GenericDatumReader.java:193)
    at org.apache.avro.generic.GenericDatumReader.readRecord(GenericDatumReader.java:183)
    at org.apache.avro.generic.GenericDatumReader.read(GenericDatumReader.java:151)
    at org.apache.avro.generic.GenericDatumReader.read(GenericDatumReader.java:142)
    at org.apache.avro.tool.DataFileWriteTool.run(DataFileWriteTool.java:99)
    at org.apache.avro.tool.Main.run(Main.java:84)
    at org.apache.avro.tool.Main.main(Main.java:73)

Can anyone help me solve this? What am I doing wrong?

1 answer:

Answer 0 (score: 0)

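Correct the first three lines of the JSON input as shown below, then try again. In Avro's JSON encoding, a non-null value of a union field must be wrapped in an object keyed by its type name, which is why the decoder reports "Expected start-union. Got VALUE_STRING" for the bare strings. The fourth line needs no change, because a null union value is written as a plain null.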

{"name": "user", "favorite_number": null, "favorite_color":{"string": "red"}}
{"name": "user", "favorite_number": null, "favorite_color":{"string": "green"}}
{"name": "user", "favorite_number": null, "favorite_color":{"string":"purple"}}
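
If the input has many records, the same correction could be applied by a small script instead of by hand. The following is only a minimal sketch, not part of avro-tools: the helper name wrap_unions is made up for illustration, the schema is copied from the question, and it assumes every union is "null" plus exactly one primitive type.

```python
import json

# Schema copied from the question.
SCHEMA = {
    "namespace": "example.avro",
    "type": "record",
    "name": "User",
    "fields": [
        {"name": "name", "type": "string"},
        {"name": "favorite_number", "type": ["int", "null"]},
        {"name": "favorite_color", "type": ["string", "null"]},
    ],
}


def wrap_unions(record, schema):
    """Hypothetical helper: wrap non-null union values as {"<type>": value}.

    Assumption: each union is "null" plus exactly one primitive type name.
    Anything more complex raises ValueError instead of guessing a branch.
    """
    wrapped = dict(record)
    for field in schema["fields"]:
        field_type = field["type"]
        if not isinstance(field_type, list):
            continue  # not a union, leave the value untouched
        value = record.get(field["name"])
        if value is None:
            continue  # null stays a plain null in Avro's JSON encoding
        branches = [t for t in field_type if t != "null"]
        if len(branches) != 1 or not isinstance(branches[0], str):
            raise ValueError("unsupported union for field " + field["name"])
        wrapped[field["name"]] = {branches[0]: value}
    return wrapped


if __name__ == "__main__":
    line = '{"name": "user", "favorite_number": null, "favorite_color": "red"}'
    print(json.dumps(wrap_unions(json.loads(line), SCHEMA)))
    # prints {"name": "user", "favorite_number": null, "favorite_color": {"string": "red"}}
```

Running each line of the original JSON file through this function and writing the results back out, one record per line, should give input in the form shown above.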