Question

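I am trying to insert data into a table I created in Hive. I have been struggling with it, so I simplified the case as much as possible to get to the root of the problem.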

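Here is the simplified code that creates the basic table. Essentially, it has one column that is an array of structs, and each struct has a single field.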

DROP TABLE IF EXISTS foo.S_FILE_PA_JOB_DATA_T;

CREATE TABLE foo.S_FILE_PA_JOB_DATA_T
  PARTITIONED BY (customer_id string)
  STORED AS AVRO
  TBLPROPERTIES (
 'avro.schema.literal'=
 '{
   "namespace": "com.foo.oozie.foo",
   "name": "S_FILE_PA_JOB_DATA_T",
   "type": "record",
   "fields":
   [
      {"name":"pa_hwm"             ,"type":{
         "type":"array",
         "items":{
           "type":"record",
           "name":"pa_hwm_record",
           "fields":
           [
             {"name":"pa_axis"           ,"type":["int","null"]}
           ]
         }
      }}
   ]
   }');

My problem is that I cannot figure out the syntax for inserting into this table.

insert into table foo.s_FILE_PA_JOB_DATA_T partition (customer_id) values (0,'a390c1cf-4ee5-4ab9-b7a3-73f5f268b669')

The 0 somehow needs to become an array<struct<int>>, but I cannot get the syntax right. Can anyone help? Thanks!

Answer 1

Unfortunately, you cannot do this directly. See also "Hive inserting values to an array complex type column".

In theory, you should be able to do it with something like this:

insert into table s_file_pa_job_data_t partition(customer_id)  
  values (array(named_struct('pa_axis',0)) );

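That is, use the array() and named_struct() UDFs, which respectively build an array from some scalar values and build a struct according to the field names and values you give. (See the UDF documentation here: https://cwiki.apache.org/confluence/display/Hive/LanguageManual+UDF#LanguageManualUDF-ComplexTypeConstructors)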

But unfortunately, if you do that, you get

FAILED: SemanticException [Error 10293]: Unable to create temp file 
for insert values Expression of type TOK_FUNCTION not supported in insert/values

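because, sadly, Hive does not yet support UDF calls in the VALUES clause. As the other post suggests, you can work around this with a dummy table, which is a bit ugly but works.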
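A minimal sketch of that dummy-table workaround, under some assumptions: foo.dummy is a hypothetical one-row helper table that is not part of the original question, and the partition value is given statically so that no dynamic-partition settings are needed. The idea is to move the array() and named_struct() calls out of VALUES and into a SELECT, where UDFs are allowed.

```sql
-- Hypothetical one-row helper table (the name foo.dummy is an assumption).
CREATE TABLE IF NOT EXISTS foo.dummy (x int);
INSERT INTO TABLE foo.dummy VALUES (1);

-- Build the array<struct<pa_axis:int>> value in a SELECT instead of VALUES.
-- The partition value is supplied statically here; with PARTITION (customer_id)
-- the value would instead be the last column of the SELECT, and dynamic
-- partitioning would have to be enabled for the session.
INSERT INTO TABLE foo.s_file_pa_job_data_t
  PARTITION (customer_id = 'a390c1cf-4ee5-4ab9-b7a3-73f5f268b669')
SELECT array(named_struct('pa_axis', 0))
FROM foo.dummy
LIMIT 1;
```

The LIMIT 1 only guards against the helper table holding more than one row, which would otherwise insert one copy of the value per row.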

How to insert into a Hive table with a column of type array<struct<int>>
