Is collect_list() guaranteed to preserve the relative ordering of rows?

Asked: 2017-12-24 01:19:25

Tags: hive hiveql

Imagine I have the following table:

+---+-----------+------------+
| id|featureName|featureValue|
+---+-----------+------------+
|id1|          a|           3|
|id1|          b|           4|
|id2|          a|           2|
|id2|          c|           5|
|id3|          d|           9|
+---+-----------+------------+

Now I run a query like the following:

SELECT 
  id,
  collect_list(idx),
  collect_list(val)
FROM
  ...
GROUP BY id

Am I guaranteed that "idx" and "val" will be aggregated with their relative order preserved? That is:

GOOD                   GOOD                   BAD                    
+---+------+------+    +---+------+------+    +---+------+------+    
| id|   idx|   val|    | id|   idx|   val|    | id|   idx|   val|    
+---+------+------+    +---+------+------+    +---+------+------+    
|id3|   [d]|   [9]|    |id3|   [d]|   [9]|    |id3|   [d]|   [9]|    
|id1|[a, b]|[3, 4]|    |id1|[b, a]|[4, 3]|    |id1|[a, b]|[4, 3]|    
|id2|[a, c]|[2, 5]|    |id2|[c, a]|[5, 2]|    |id2|[a, c]|[5, 2]|    
+---+------+------+    +---+------+------+    +---+------+------+    

Note: the third result is BAD because, for id1, [a, b] should be associated with [3, 4] (not [4, 3]). The same applies to id2.
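
For reference, here is a sketch of a variant that would not depend on the answer: collecting one list of (idx, val) structs keeps each pair together regardless of the order in which rows are aggregated. The table name `features` is hypothetical, the columns are taken from the sample table above, and this assumes a Hive version whose collect_list() accepts struct arguments (I believe older releases accepted only primitive types).

```sql
-- Hypothetical table name `features`; columns follow the sample table above.
-- Each array element is a struct, so an idx can never be separated from its val,
-- whatever order the rows arrive in.
SELECT
  id,
  collect_list(named_struct('idx', featureName, 'val', featureValue)) AS pairs
FROM features
GROUP BY id;
```

This only guarantees the pairing within each element; it says nothing about the order of elements inside the list, which is still the open question.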

0 answers:

No answers yet