I am trying to run the following SELECT statement in AWS Athena:
SELECT
col_1,
col_2
FROM "my_database"."my_table"
WHERE
partition_1='20171130'
AND
partition_2='Y'
LIMIT 10
I get this error:
Your query has the following error(s):
HIVE_CURSOR_ERROR: Can not read value at 0 in block 0 in file s3://my-s3-path/my-table/partition_1=20171130/partition_2=Y/part-1111-11111111-1111-1111-1111-111111111111.snappy.parquet
This query ran against the "my_database" database, unless qualified by the query. Please post the error message on our forum or contact customer support with Query Id: 1111111-1111-1111-1111-111111111111.
But when I remove just one of the columns, it works! For example, selecting a single column works:
SELECT col_1 FROM "my_database"."my_table" WHERE partition_1='20171130' AND partition_2='Y' LIMIT 10
SELECT col_2 FROM "my_database"."my_table" WHERE partition_1='20171130' AND partition_2='Y' LIMIT 10
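I have also found that I can put more than one column in the SELECT statement, and the query only fails for certain combinations of columns. But why? The table definition is: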
{
"Table": {
"StorageDescriptor": {
"OutputFormat": "org.apache.hadoop.hive.ql.io.parquet.MapredParquetOutputFormat",
"SortColumns": [],
"InputFormat": "org.apache.hadoop.hive.ql.io.parquet.MapredParquetInputFormat",
"SerdeInfo": {
"SerializationLibrary": "org.apache.hadoop.hive.ql.io.parquet.serde.ParquetHiveSerDe",
"Parameters": {
"serialization.format": "1"
}
},
"BucketColumns": [],
"Parameters": {
"CrawlerSchemaDeserializerVersion": "1.0",
"compressionType": "none",
"UPDATED_BY_CRAWLER": "myCrawler",
"classification": "parquet",
"recordCount": "40190451",
"typeOfData": "file",
"CrawlerSchemaSerializerVersion": "1.0",
"objectCount": "18",
"averageRecordSize": "35",
"exclusions": "[\"s3://my-s3-path/my_table/_**\"]",
"sizeKey": "1078884110"
},
"Location": "s3://my-s3-path/my-table/",
"NumberOfBuckets": -1,
"StoredAsSubDirectories": false,
"Columns": [
{
"Type": "smallint",
"Name": "col_1"
},
{
"Type": "decimal(18,6)",
"Name": "col_2"
}
],
"Compressed": false
},
"UpdateTime": 1515503623.0,
"PartitionKeys": [
{
"Type": "string",
"Name": "partition_1"
},
{
"Type": "string",
"Name": "partition_2"
}
],
"Name": "my_table",
"Parameters": {
"CrawlerSchemaDeserializerVersion": "1.0",
"compressionType": "none",
"UPDATED_BY_CRAWLER": "myCrawler",
"classification": "parquet",
"recordCount": "40190451",
"typeOfData": "file",
"CrawlerSchemaSerializerVersion": "1.0",
"objectCount": "18",
"averageRecordSize": "35",
"exclusions": "[\"s3://my-s3-path/my_table/_**\"]",
"sizeKey": "1078884110"
},
"LastAccessTime": 1515503623.0,
"CreatedBy": "arn:aws:sts::111111111111:assumed-role/MyRole/myCrawler",
"TableType": "EXTERNAL_TABLE",
"Owner": "owner",
"CreateTime": 1515503623.0,
"Retention": 0
}
}
Answer 0 (score: 0)
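Generally, this error is caused by a schema or data mismatch in the files. For Parquet, make sure the schema of the files exactly matches the Hive schema. Sometimes even the order of the columns matters, for example inside structs.
Also, Parquet writers sometimes drop sparse fields when the rows written to an output file contain no data for that column. This is common with partitioned data.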
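The check described above can be sketched as a small comparison between the table's declared columns and the columns found in one Parquet file. This is only an illustration: the function name `find_schema_mismatches` and the file schema below are made up, and the type strings are assumed to have already been normalised to the same notation on both sides. In practice the file's schema would have to be read from the failing file named in the error message, for example with a Parquet library.

```python
from typing import List, Tuple

Column = Tuple[str, str]  # (name, type) pairs, in declared order


def find_schema_mismatches(table_cols: List[Column],
                           file_cols: List[Column]) -> List[str]:
    """Return readable differences between a table schema and one file's schema."""
    problems = []
    table_types = dict(table_cols)
    file_types = dict(file_cols)

    # Columns the table declares: missing from the file, or typed differently.
    for name, table_type in table_cols:
        if name not in file_types:
            problems.append(f"{name}: declared in table but missing from file")
        elif file_types[name] != table_type:
            problems.append(
                f"{name}: table says {table_type}, file says {file_types[name]}"
            )

    # Columns the file carries that the table does not know about.
    for name, _ in file_cols:
        if name not in table_types:
            problems.append(f"{name}: present in file but not declared in table")

    # Relative order of the columns that both sides share.
    table_order = [name for name, _ in table_cols if name in file_types]
    file_order = [name for name, _ in file_cols if name in table_types]
    if table_order != file_order:
        problems.append("column order differs between table and file")

    return problems


# Table columns copied from the table definition in the question.
table_columns = [("col_1", "smallint"), ("col_2", "decimal(18,6)")]

# HYPOTHETICAL file schema, invented for illustration only.
file_columns = [("col_2", "decimal(18,6)"), ("col_1", "int")]

for problem in find_schema_mismatches(table_columns, file_columns):
    print(problem)
```

An empty result for every file in a partition would suggest the schemas agree, in which case the mismatch is more likely in the data itself than in the declared types.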