I have the following code:
(defn parse-schema
  "Returns an Avro schema"
  ^Schema$RecordSchema [^String schema-file]
  (let [schema (File. schema-file)]
    ;; Pass the File, not the path string: Parser.parse(String) expects schema JSON text.
    (.parse (Schema$Parser.) schema)))

(defn get-reader
  "Returns a DatumReader"
  ^SpecificDatumReader [^Schema$RecordSchema schema]
  (SpecificDatumReader. schema))

(defn byte-to-object
  "Returns an object from a byte[]"
  [reader message]
  ;; binaryDecoder reads bare Avro-encoded bytes; nil means no decoder is reused.
  (let [decoder (.binaryDecoder (DecoderFactory/get) message nil)]
    (.read reader nil decoder)))
Using the code in the REPL:
plugflow.main=> (avro/parse-schema "schema/test.avsc")
#object[org.apache.avro.Schema$RecordSchema 0x6e896dd7 "{\"type\":\"record\",\"name\":\"test\",\"namespace\":\"com.streambright.avro\",\"fields\":[{\"name\":\"user_name\",\"type\":\"string\",\"doc\":\"User name of any user\"}],\"doc:\":\"Nothing to see here...\"}"]
plugflow.main=> (def record-schema (avro/parse-schema "schema/test.avsc"))
#'plugflow.main/record-schema
plugflow.main=> (avro/get-reader record-schema)
#object[org.apache.avro.specific.SpecificDatumReader 0x56b1cac6 "org.apache.avro.specific.SpecificDatumReader@56b1cac6"]
plugflow.main=> (def avro-reader (avro/get-reader record-schema))
#'plugflow.main/avro-reader
plugflow.main=> (import '[java.nio.file Files Paths Path])
java.nio.file.Path
plugflow.main=> (import '[java.net URI])
java.net.URI
plugflow.main=> (def byte-arr (Files/readAllBytes (Paths/get (URI. "file:///data/test.avro"))))
#'plugflow.main/byte-arr
plugflow.main=> (avro/byte-to-object avro-reader byte-arr)
AvroRuntimeException Malformed data. Length is negative: -40 org.apache.avro.io.BinaryDecoder.doReadBytes (BinaryDecoder.java:336)
Using the Avro CLI:
java -jar avro-tools-1.8.1.jar tojson data/test.avro
log4j:WARN No appenders could be found for logger (org.apache.hadoop.metrics2.lib.MutableMetricsFactory).
log4j:WARN Please initialize the log4j system properly.
log4j:WARN See http://logging.apache.org/log4j/1.2/faq.html#noconfig for more info.
{"user_name":"tibi"}
What am I missing?
Answer 0 (score: 1)
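It turns out that Avro has two sets of classes: one for reading and writing Avro files, and another for reading and writing individual Avro-encoded messages. The Avro CLI (avro-tools) writes a proper Avro file with the schema embedded in it. When I tried to read that file with a function designed for bare Avro-encoded messages, the decoder treated the file header as record data, so it failed.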
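To make the difference concrete, here is a minimal sketch in plain Java (no Avro dependency) of what the binary decoder sees when it is handed a container file. It relies on two facts from the Avro specification: container files begin with the four bytes 'O', 'b', 'j', 1, and a string field is encoded as a zigzag variable-length long giving its length, followed by the bytes. The class and method names (`AvroHeaderCheck`, `looksLikeContainerFile`, `readZigZagLong`) are made up for this illustration and are not part of the Avro API.

```java
import java.util.Arrays;

public class AvroHeaderCheck {
    // Avro object container files start with the four bytes 'O', 'b', 'j', 1.
    static final byte[] MAGIC = {'O', 'b', 'j', 1};

    /** Returns true if the data begins with the container-file magic bytes. */
    public static boolean looksLikeContainerFile(byte[] data) {
        return data.length >= MAGIC.length
                && Arrays.equals(Arrays.copyOf(data, MAGIC.length), MAGIC);
    }

    /** Decodes one zigzag variable-length long starting at offset (hypothetical helper). */
    public static long readZigZagLong(byte[] data, int offset) {
        long raw = 0;
        int shift = 0;
        int i = offset;
        while (true) {
            int b = data[i++] & 0xFF;
            raw |= (long) (b & 0x7F) << shift; // low 7 bits carry the payload
            if ((b & 0x80) == 0) {
                break;                         // high bit clear: last byte of the varint
            }
            shift += 7;
        }
        return (raw >>> 1) ^ -(raw & 1);       // undo the zigzag mapping
    }

    public static void main(String[] args) {
        // First bytes of a container file; the trailing byte is arbitrary filler.
        byte[] header = {'O', 'b', 'j', 1, 0x04};
        System.out.println(looksLikeContainerFile(header)); // prints "true"
        System.out.println(readZigZagLong(header, 0));      // prints "-40"
    }
}
```

The first byte of the magic, 'O' (0x4F), decodes to -40 when read as a string length, which matches the "Length is negative: -40" error in the question. A check like `looksLikeContainerFile` could be used in a test to fail fast with a clearer message before handing bytes to a binary decoder.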
The correct way to write a single Avro message without the schema (for example, if you want to use it in unit or integration tests):
Schema schema = new Schema.Parser().parse("{\n \"type\": \"record\",\n \"name\": \"User\",\n \"namespace\": \"com.streambright\",\n \"fields\": [{\n \"name\": \"user_name\",\n \"type\": \"string\",\n \"doc\": \"User name of the user\"\n }, {\n \"name\": \"age\",\n \"type\": \"int\",\n \"doc\": \"Age of the user\"\n }, {\n \"name\": \"weight\",\n \"type\": \"float\",\n \"doc\": \"Weight of the user\"\n }, {\n \"name\": \"address\",\n \"type\": {\n \"type\": \"record\",\n \"name\": \"Address\",\n \"fields\": [{\n \"name\": \"street_address\",\n \"type\": \"string\"\n }, {\n \"name\": \"city\",\n \"type\": \"string\"\n }]\n }\n }],\n \"doc:\": \"Nothing to see here...\"\n}");
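// User and Address are presumably the classes generated from the schema above by the Avro compiler.
User user = new User();
user.setUserName("Tibi Kovacs");
user.setAge(25);
user.setWeight(32.12f);
user.setAddress(new Address("FoxiMaxi St", "Budapest"));
// Encode the record as a bare binary message: no file header and no embedded schema.
SpecificDatumWriter<User> avroEventWriter = new SpecificDatumWriter<User>(schema);
EncoderFactory avroEncoderFactory = EncoderFactory.get();
ByteArrayOutputStream stream = new ByteArrayOutputStream();
BinaryEncoder binaryEncoder = avroEncoderFactory.binaryEncoder(stream, null);
avroEventWriter.write(user, binaryEncoder);
// The encoder buffers its output, so flush before reading the bytes back.
binaryEncoder.flush();
IOUtils.closeQuietly(stream);
byte[] m = stream.toByteArray();
// Write the raw message bytes to disk; try-with-resources closes the file even if write fails.
try (FileOutputStream fos = new FileOutputStream("/full/path/data/test3.java.avro")) {
    fos.write(m);
}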