java.io.IOException: Not a data file when reading Avro files on HDFS

Asked: 2019-05-03 17:40:51

Tags: scala hdfs avro

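I am trying to read Avro files from HDFS. I have verified that they exist on the data nodes and that I can read them with the hdfs dfs -cat command.

However, when I try to read the data in Scala, I get this exception: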

Exception in thread "main" java.io.IOException: Not a data file.
    at org.apache.avro.file.DataFileStream.initialize(DataFileStream.java:102)
    at org.apache.avro.file.DataFileStream.<init>(DataFileStream.java:84)
    at spark_test.TestSparkJob$.main(TestSparkJob.scala:55)
    at spark_test.TestSparkJob.main(TestSparkJob.scala)
Caused by: java.io.EOFException
    at org.apache.avro.io.BinaryDecoder$InputStreamByteSource.readRaw(BinaryDecoder.java:827)
    at org.apache.avro.io.BinaryDecoder.doReadBytes(BinaryDecoder.java:349)
    at org.apache.avro.io.BinaryDecoder.readFixed(BinaryDecoder.java:302)
    at org.apache.avro.io.Decoder.readFixed(Decoder.java:150)
    at org.apache.avro.file.DataFileStream.initialize(DataFileStream.java:100)
    ... 3 more

What could be the cause?

Here is the code I use to read the Avro file:

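import java.io.BufferedInputStream
import org.apache.avro.file.DataFileStream
import org.apache.avro.generic.{GenericDatumReader, GenericRecord}
import org.apache.hadoop.fs.Path

// `fs` is an org.apache.hadoop.fs.FileSystem created elsewhere (not shown in the question).
val fsInputStream = fs.open(new Path("/data/avro_static.avro"))
val datumReader = new GenericDatumReader[GenericRecord]()

val inStream = new BufferedInputStream(fsInputStream)
val fileReader = new DataFileStream(inStream, datumReader)

println("Schema " + fileReader.getSchema.toString())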
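The nested `EOFException` in the trace is raised while `DataFileStream.initialize` calls `readFixed`, which appears to be the read of the container's magic header. That would suggest the stream handed to Avro ended before the first four bytes were available, for example because the path resolved to an empty file or to a different file system than the one `hdfs dfs -cat` read from. This is a guess, not a confirmed diagnosis. A minimal sketch for checking it, relying only on the fact that an Avro object container file starts with the bytes `O`, `b`, `j`, `0x01`; the names `AvroMagicCheck`, `readPrefix`, and `looksLikeAvro` are hypothetical helpers, not part of Avro or Hadoop:

```scala
import java.io.{ByteArrayInputStream, InputStream}

// Hypothetical helper, not part of Avro or Hadoop.
object AvroMagicCheck {
  // First four bytes of an Avro object container file: 'O', 'b', 'j', 1.
  val Magic: Array[Byte] = Array('O'.toByte, 'b'.toByte, 'j'.toByte, 1.toByte)

  // Reads up to n bytes from the stream; returns fewer if the stream ends early.
  def readPrefix(in: InputStream, n: Int): Array[Byte] = {
    val buf = new Array[Byte](n)
    var read = 0
    var done = false
    while (read < n && !done) {
      val r = in.read(buf, read, n - read)
      if (r < 0) done = true else read += r
    }
    buf.take(read)
  }

  // True only if the stream starts with the complete Avro magic header.
  def looksLikeAvro(in: InputStream): Boolean =
    readPrefix(in, Magic.length).sameElements(Magic)
}
```

In the code above this could be called as `AvroMagicCheck.looksLikeAvro(fs.open(new Path("/data/avro_static.avro")))` before constructing the `DataFileStream`. If `readPrefix` returns an empty array, printing `fs.getUri` and `fs.getFileStatus(path).getLen` may show whether the job is opening the intended HDFS file or an empty file elsewhere.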

Output of the hdfs dfs -cat command:

Objavro.schema�{"type":"record","name":"TestData","namespace":"sample","fields":[{"name":"random_pk","type":["null",{"type":"bytes","logicalType":"decimal","precision":38,"scale":0}]},{"name":"random_string","type":["string","null"]},{"name":"code","type":["string","null"]},{"name":"random_bool","type":["boolean","null"]},{"name":"random_int","type":["int","null"]},{"name":"random_float","type":["double","null"]},{"name":"random_double","type":["double","null"]},{"name":"random_enum","type":["null",{"type":"enum","name":"enumType","symbols":["VAL_1","VAL_2","VAL_3"]}]},{"name":"random_date","type":["null",{"type":"int","logicalType":"date"}]},{"name":"random_decimal","type":["null",{"type":"bytes","logicalType":"decimal","precision":4,"scale":2}]},{"name":"update_database_time","type":["null",{"type":"long","logicalType":"timestamp-millis"}]},{"name":"update_database_time_tz","type":["null",{"type":"long","logicalType":"timestamp-millis"}]},{"name":"random_money","type":["null",{"type":"bytes","logicalType":"decimal","precision":19,"scale":4}]}]}avro.codec
g�9���E>����this word7,5,1,4,6@`f�D@=                                   snappy���
g�9���E># ����޲Z���ײZ���that word2,5,4,8���؆@��Q���@���Л�޲Z��翲ZV��������

0 answers:

No answers