Splitting column data into multiple columns in Hive

Date: 2018-05-28 08:08:28

Tags: hive hiveql hue hive-query

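I have sample data for devices that each have up to two controllers, along with the version of each controller. The sample data looks like this: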

device_id   controller_id  versions
123          1             0.1
123          2             0.15
456          2             0.25
143          1             0.35
143          2             0.36

The data above should be turned into the following format:

device_id   1st_ctrl_id_ver   2nd_ctrl_id_ver
123          0.1              0.15
456          NULL             0.25
143          0.35             0.36

I used the following code, which does not work:

select
    device_id,
    case when controller_id="1" then versions end as 1st_ctrl_id_ver,
    case when controller_id="2" then versions end as 2nd_ctrl_id_ver
from device_versions

The output I get is:

device_id   1st_ctrl_id_ver   2nd_ctrl_id_ver
123          0.1              NULL
123          NULL             0.15
456          NULL             0.25
143          0.35             NULL
143          NULL             0.36

I don't want NULL values in every row. Can someone help me write the correct code?

1 Answer:

Answer 0 (score: 1)

To "fold" all rows with a given key into a single row, you have to run an aggregation, even if you don't really aggregate any values in practice.

Something like:
select
    device_id,
    MAX(case when controller_id="1" then versions end) as 1st_ctrl_id_ver,
    MAX(case when controller_id="2" then versions end) as 2nd_ctrl_id_ver
from device_versions
GROUP BY device_id
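
To see why the aggregation removes the extra NULL rows, here is a minimal sketch that replays the question's sample data locally. It assumes SQLite (through Python's built-in sqlite3 module) as a stand-in for Hive, stores controller_id as an integer, and uses the hypothetical aliases ctrl_1_ver and ctrl_2_ver instead of the digit-leading names. MAX skips NULLs, so each group keeps the one non-NULL version per controller, or NULL when the device has no such controller.

```python
import sqlite3

# Assumption: SQLite stands in for Hive so the query can be tried locally.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE device_versions "
    "(device_id INTEGER, controller_id INTEGER, versions REAL)"
)

# Sample rows copied from the question.
conn.executemany(
    "INSERT INTO device_versions VALUES (?, ?, ?)",
    [(123, 1, 0.1), (123, 2, 0.15), (456, 2, 0.25), (143, 1, 0.35), (143, 2, 0.36)],
)

# Conditional aggregation: one output row per device_id.
# ctrl_1_ver / ctrl_2_ver are illustrative aliases, not the original names.
PIVOT_QUERY = """
SELECT device_id,
       MAX(CASE WHEN controller_id = 1 THEN versions END) AS ctrl_1_ver,
       MAX(CASE WHEN controller_id = 2 THEN versions END) AS ctrl_2_ver
FROM device_versions
GROUP BY device_id
ORDER BY device_id
"""

pivot_rows = conn.execute(PIVOT_QUERY).fetchall()
for row in pivot_rows:
    print(row)
```

Device 456 has no controller 1, so its first version column comes back as None (NULL), matching the desired output in the question.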

But be aware that this code works as intended only if you have at most one entry per controller per device, and any controller with an ID higher than 2 will be ignored. In other words, it is rather brittle (but you can't do better in SQL anyway).
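
The brittleness can be made visible with a small sketch, again assuming SQLite via Python's sqlite3 module as a stand-in for Hive. The duplicate entry and the third controller below are hypothetical rows added for illustration, and the two check queries are only one possible way to detect such rows before pivoting.

```python
import sqlite3

# Assumption: SQLite stands in for Hive; the extra rows are made up.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE device_versions "
    "(device_id INTEGER, controller_id INTEGER, versions REAL)"
)
conn.executemany(
    "INSERT INTO device_versions VALUES (?, ?, ?)",
    [
        (123, 1, 0.1),
        (123, 1, 0.2),   # hypothetical duplicate entry for controller 1
        (123, 2, 0.15),
        (123, 3, 0.5),   # hypothetical third controller
    ],
)

# The pivot silently keeps the larger duplicate and drops controller 3.
pivoted = conn.execute("""
    SELECT device_id,
           MAX(CASE WHEN controller_id = 1 THEN versions END) AS ctrl_1_ver,
           MAX(CASE WHEN controller_id = 2 THEN versions END) AS ctrl_2_ver
    FROM device_versions
    GROUP BY device_id
""").fetchall()

# Check 1: (device, controller) pairs that appear more than once.
duplicates = conn.execute("""
    SELECT device_id, controller_id, COUNT(*) AS n
    FROM device_versions
    GROUP BY device_id, controller_id
    HAVING COUNT(*) > 1
""").fetchall()

# Check 2: controller IDs the pivot has no column for.
unhandled = conn.execute("""
    SELECT DISTINCT controller_id
    FROM device_versions
    WHERE controller_id NOT IN (1, 2)
""").fetchall()

print(pivoted, duplicates, unhandled)
```

Running checks like these first tells you whether the one-entry-per-controller assumption actually holds for your data.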