I am running a JOIN query in Hive:
SELECT c.id_double, o.id_double
FROM hive_testQ1 c JOIN hive_testQ2 o
ON (c.id = o.id) limit 3;
The output I get is:
1.2 1.2
2.2 2.2
3.2 3.2
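But the values in the first table end in .1, and the values in the second table end in .2.
Only 1 map job was launched.
Why are only the values from the second table being read?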
The first table is:
+------+-----------+---------+------+
| id   | id_double | names   | test |
+------+-----------+---------+------+
|    1 |       1.2 | Frank   |    4 |
|    2 |       2.2 | Jamie   |    5 |
|    3 |       3.2 | Dale    |    6 |
|    4 |       4.2 | Eric    |    7 |
|    5 |       5.2 | Felipa  |    8 |
|    6 |       6.2 | Eric    |    9 |
|    7 |       7.2 | Betty   |   10 |
|    8 |       8.2 | Lena    |   11 |
|    9 |       9.2 | Scott   |   12 |
|   10 |      10.2 | Juanita |   13 |
+------+-----------+---------+------+
The second table is:
+------+-----------+---------+
| id   | id_double | names   |
+------+-----------+---------+
|    1 |       1.1 | Rueben  |
|    2 |       2.1 | Anthony |
|    3 |       3.1 | Nancy   |
|    4 |       4.1 | Sandra  |
|    5 |       5.1 | Ann     |
|    6 |       6.1 | Myra    |
|    7 |       7.1 | Donna   |
|    8 |       8.1 | Ron     |
|    9 |       9.1 | Allen   |
|   10 |      10.1 | Jose    |
+------+-----------+---------+