Question

Below is the dataset I loaded into a Hive table named temp_stat:

COUNTRY    CITY                 TEMP 
---------- -------------------- -----
US         Arizona              51.7 
US         California           56.7 
US         Bullhead City        51.1 
India      Jaisalmer            42.4 
Libya      Aziziya              57.8 
Iran       Lut Desert           70.7 
India      Banda                42.4

When I try to view the data with a select statement, I get the following:

US,Arizona,51.7         NULL    NULL
US,California,56.7      NULL    NULL
US,Bullhead City,51.1   NULL    NULL
India,Jaisalmer,42.4    NULL    NULL
Libya,Aziziya,57.8      NULL    NULL
Iran,Lut Desert,70.7    NULL    NULL
India,Banda,42.4        NULL    NULL

Next, I want to group these records by country and get the highest temperature for each country along with the city name, so I ran the following query:

select country,city,temp
from (
select country,city,temp, 
     row_number() over (partition by country order by temp desc) as part
from temp_stat
) a 
where part = 1
order by country, city;

After running the above query in the Hive shell, I get the following result:

US,Arizona,51.7         NULL    NULL
US,California,56.7      NULL    NULL
US,Bullhead City,51.1   NULL    NULL
India,Jaisalmer,42.4    NULL    NULL
Libya,Aziziya,57.8      NULL    NULL
Iran,Lut Desert,70.7    NULL    NULL
India,Banda,42.4        NULL    NULL

Even when I run the inner query on its own to generate row_number, I get the same row number for every record, like this:

India,Banda,42.4        NULL    NULL    1
India,Jaisalmer,42.4    NULL    NULL    1
Iran,Lut Desert,70.7    NULL    NULL    1
Libya,Aziziya,57.8      NULL    NULL    1
US,Arizona,51.7         NULL    NULL    1
US,Bullhead City,51.1   NULL    NULL    1
US,California,56.7      NULL    NULL    1

I also tried dense_rank() and rank(), with no change in the result. Is something wrong with the table definition, or is it something else?

Any help would be appreciated!

Answer 1

The fields are terminated by ','.

Your table definition should look something like this:

create external table temp_stat
(
    country     string   
   ,city        string          
   ,temp        decimal(11,1)
)
    row format delimited
    fields terminated by ','
;
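
If temp_stat already exists, it may be enough to inspect and adjust its delimiter rather than recreate the table. The following is a minimal sketch, assuming the table was created without a row format clause (so Hive fell back to its default field delimiter, Ctrl-A) and that it is a plain text table using the default SerDe:

```sql
-- Show the DDL Hive has stored for the table, including any ROW FORMAT clause.
show create table temp_stat;

-- Assumption: temp_stat is a text table using the default LazySimpleSerDe.
-- Change the field delimiter in place so each line is split on commas.
alter table temp_stat set serdeproperties ('field.delim' = ',');
```

Text files are parsed at read time, so under these assumptions the data files themselves should not need to be reloaded.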

select * from temp_stat;

+---------+---------------+------+
| country |     city      | temp |
+---------+---------------+------+
| US      | Arizona       | 51.7 |
| US      | California    | 56.7 |
| US      | Bullhead City | 51.1 |
| India   | Jaisalmer     | 42.4 |
| Libya   | Aziziya       | 57.8 |
| Iran    | Lut Desert    | 70.7 |
| India   | Banda         | 42.4 |
+---------+---------------+------+
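
This also explains the row numbers in the question: each whole line (for example "US,Arizona,51.7") was landing in the country column, so every row formed its own partition and was numbered 1. Once the columns are parsed correctly, the original ranking query should behave as intended. The sketch below reuses that query with one change that is an assumption about what is wanted: India has two cities tied at 42.4, so rank() is used to keep both, whereas row_number() would return one of them arbitrarily.

```sql
-- Highest temperature per country, together with the city name.
-- rank() keeps every city tied for the maximum (e.g. both India rows at 42.4);
-- row_number() would return exactly one row per country instead.
select country, city, temp
from (
    select country, city, temp,
           rank() over (partition by country order by temp desc) as rnk
    from temp_stat
) a
where rnk = 1
order by country, city;
```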

Table definition issue in Apache Hive
