Question

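I have stored a few Avro files in an S3 bucket and crawled them with AWS Glue using the AvroSerDe. Glue indexes the files perfectly.

When I query the table with AWS Athena like this: select * from mytable, I get all the expected results; for example, the values column shows up as [23.4, 345.6].

However, when I try to select the values column on its own (or use any array function on it), I get no results. The query tab in Athena shows a "failed" icon, but no error message is displayed.

Example of a failing query: select values from mytable

Output of describe mytable; in Athena: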

timestamp               bigint                  from deserializer   
values                  array<float>            from deserializer   
account                 string                                      
year                    string                                      
month                   string                                      
day                     string                                      
hour                    string                                      
device                  string                                      
type                    string                                      

# Partition Information      
# col_name              data_type               comment             

account                 string                                      
year                    string                                      
month                   string                                      
day                     string                                      
hour                    string                                      
device                  string                                      
type                    string

Table definition in AWS Glue:

Name         mytable
Database     new_formats_test
Classification  avro
Location     s3://mytable/
Input format    org.apache.hadoop.hive.ql.io.avro.AvroContainerInputFormat
Output format   org.apache.hadoop.hive.ql.io.avro.AvroContainerOutputFormat
Serde serialization lib org.apache.hadoop.hive.serde2.avro.AvroSerDe
Serde parameters    
avro.schema.literal {"type":"record","name":"ArchiveDataEntry","namespace":"com.epihunter.data.archive","fields":[{"name":"timestamp","type":"long"},{"name":"values","type":{"type":"array","items":"float"}}]}
serialization.format 1
Table properties    
sizeKey 18525162
objectCount 1
UPDATED_BY_CRAWLER Scan S3 new formats
avro.schema.literal {"type":"record","name":"ArchiveDataEntry","namespace":"com.epihunter.data.archive","fields":[{"name":"timestamp","type":"long"},{"name":"values","type":{"type":"array","items":"float"}}]}

Answer 1

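To answer my own question: values is a reserved keyword, and that is what causes the failure. Wrapping the column name in double quotes, as in "values", or simply renaming the column solves the problem.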
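
For anyone building these queries in code, here is a minimal sketch of the quoting fix in Python. It only constructs the query strings and does not talk to Athena. The helper names quote_identifier and build_select are hypothetical, made up for this illustration, and quoting every identifier (rather than only reserved ones) is my own simplification.

```python
# Sketch: always double-quote identifiers so that reserved words such as
# "values" can be used as column names in an Athena SELECT statement.
# quote_identifier and build_select are hypothetical helper names.


def quote_identifier(name: str) -> str:
    """Wrap an identifier in double quotes, doubling any embedded quote."""
    return '"' + name.replace('"', '""') + '"'


def build_select(columns, table):
    """Build a SELECT statement with every column and the table name quoted."""
    column_list = ", ".join(quote_identifier(c) for c in columns)
    return f"SELECT {column_list} FROM {quote_identifier(table)}"


# The query that failed in the question, now with the column quoted.
query = build_select(["timestamp", "values"], "mytable")
print(query)  # SELECT "timestamp", "values" FROM "mytable"

# The same quoting applies inside array functions such as cardinality().
array_query = f'SELECT cardinality({quote_identifier("values")}) FROM "mytable"'
print(array_query)  # SELECT cardinality("values") FROM "mytable"
```

Quoting everything avoids having to maintain a list of reserved words, at the cost of slightly noisier SQL.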

Reading an array from Avro files with AWS Athena gives no results and an unknown error
