Hive error when grouping by columns while using case statements and aggregations

Date: 2018-10-12 06:37:39

Tags: hadoop hive hiveql hadoop2

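I am running a query in Hive. In it I use aggregations such as sum together with a case statement and a group by clause. I have changed the column and table names, but the logic is the same as what I use in my project.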

select 
empname,
empsal, 
emphike,
sum(empsal) as tot_sal,
sum(emphike) as tot_hike,
case when tot_sal > 1000 then exp(tot_hike)
else 0
end as manager
from employee
group by 
empname,
empsal,
emphike

For the above query I get the error "Expression not in GROUP BY key '1000'". So I modified the query slightly and tried again. My other query is:

select 
empname,
empsal, 
emphike,
sum(empsal) as tot_sal,
sum(emphike) as tot_hike,
case when sum(empsal) > 1000 then exp(sum(emphike))
else 0
end as manager
from employee
group by 
empname,
empsal,
emphike

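For the above query I get the error "Expression not in GROUP BY key 'Manager'". When I add manager to the group by, it reports an invalid alias. Please help me here.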

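1 answer:

Answer 0: (score: 0)

I see three issues in your query:

1.) Hive cannot group by a name that you define in the select block of the same query; the alias is not available yet at that point. You will probably need a subquery.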

2.) Hive tends to show errors when the sum operation is not at the end of the query.

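3.) Although I do not know what your goal is, I do not think your query will give the result you want. If you group by empsal, every row in a group has the same empsal, so sum(empsal) adds nothing beyond empsal itself (it is just that value times the number of rows in the group). The same goes for emphike and sum(emphike).

I think the following query could solve these issues: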

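select
a.empname,
a.tot_sal,
a.tot_hike,
if(a.tot_sal > 1000, exp(a.tot_hike), 0) as manager
from (
select
empname,
sum(empsal) as tot_sal,
sum(emphike) as tot_hike
from employee
group by empname
) a

The if statement is equivalent to your case statement, but I find it a bit easier to read.

In this example you do not need a group by after the subquery, because the grouping is already done inside subquery a.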
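If you would rather keep the case expression from the question, the same subquery pattern can be written as in the sketch below. It assumes the employee table and columns from the question and that grouping by empname alone is what is wanted. The subquery alias t and the 0.0 literal (chosen so both branches are doubles) are my own choices. I have not run this against any particular Hive version.

```sql
-- Inner query: aggregate once per employee and give the results names.
-- Outer query: tot_sal and tot_hike are now ordinary columns of t,
-- so the case expression can refer to them without any group by.
select
  t.empname,
  t.tot_sal,
  t.tot_hike,
  case
    when t.tot_sal > 1000 then exp(t.tot_hike)
    else 0.0
  end as manager
from (
  select
    empname,
    sum(empsal)  as tot_sal,
    sum(emphike) as tot_hike
  from employee
  group by empname
) t;
```

The aliases are defined one query level below the place where they are used, which is what the first point in the answer asks for.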