Hive query syntax

Time: 2018-02-05 14:15:31

Tags: hive hiveql

I am new to Hive; please help with the syntax. Below are the 2 columns (filepath, filesize (bytes)) in the table logstash ....

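I am able to get the file totals up to the first level ....

Similarly, how do I extract the second level, for example (/data/abc, /bns/ghi, /tmp/cbd)? For example, if /data is 100 GB, I need to know what is inside /data: /data/def = 20 GB, /data/efg = 20 GB, and so on. Likewise for the third level.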

/bns/ghi/cod/cob_def/abc                        | 10600
/sandbox/abc/def/xyz/ade                        | 1062659
/data/def/cag/tyz/gj/ibs                        | 457869
/tmp/cdb/def/ghik/new_data/2018-08-17           | 14565
/data/abc/def/ghi/new_data                      | 56453

2 answers:

Answer 0: (score: 0)

@user9314128, please try the query below. Hope it helps. Thanks.

select filepath
,sum(filesize) as sumfilesize
from logstash 
where length(regexp_replace(filepath,'[^/]','')) = 1
group by filepath;

For the second level, change the where clause from = 1 to = 2.
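Note that the filter above keeps only rows whose path contains exactly one slash (or two, after the change). If the goal is instead to roll every file up into its second-level directory, one option is to extract the path prefix and group on it. The following is only a sketch: it assumes every filepath starts with '/', that filesize is numeric, and the aliases level_2_path and total_bytes are made up for illustration.

```sql
-- Sum file sizes per second-level directory, e.g. /data/abc.
-- regexp_extract returns capture group 1: the first two path components.
-- Paths with fewer than two components produce an empty string here.
SELECT regexp_extract(filepath, '^(/[^/]+/[^/]+)', 1) AS level_2_path,
       SUM(filesize) AS total_bytes
FROM logstash
GROUP BY regexp_extract(filepath, '^(/[^/]+/[^/]+)', 1);
```

For the third level, the same pattern could be extended with one more '/[^/]+' component.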

Answer 1: (score: 0)

Use split and CONCAT_WS

Level 1

-- filepath starts with '/', so split(...)[0] is '' and the first directory is at index 1
select split(filepath, '/')[1] as level_1_path, filesize from logstash;

Level 1 with the file size summed.

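-- index 1 because filepath starts with '/', which makes split(...)[0] an empty string
select split(filepath, '/')[1] as level_1_path, sum(filesize) as total_filesize
from logstash
group by split(filepath, '/')[1];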

Level 2

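-- split(...)[0] is '' for a path that starts with '/', which supplies the leading slash
select CONCAT_WS('/', split(filepath, '/')[0], split(filepath, '/')[1], split(filepath, '/')[2]) as level_2_path, filesize from logstash;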

Level 3

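-- split(...)[0] is '' for a path that starts with '/', which supplies the leading slash
select CONCAT_WS('/', split(filepath, '/')[0], split(filepath, '/')[1], split(filepath, '/')[2], split(filepath, '/')[3]) as level_3_path, filesize from logstash;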
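
The Level 2 and Level 3 queries return one row per file. To get a total per directory, as in the Level 1 sum, they can be wrapped in SUM and GROUP BY. This is a sketch under assumptions, not a tested solution: it assumes every filepath starts with '/' (so element 0 of the split array is an empty string and the first directory is element 1) and that filesize is a numeric byte count. The aliases parts, level_2_path, total_bytes and total_gb are made up for illustration.

```sql
-- Total size per second-level directory, e.g. /data/abc.
-- The subquery splits each path once so the array can be reused.
SELECT concat_ws('/', parts[0], parts[1], parts[2]) AS level_2_path,
       SUM(filesize) AS total_bytes,
       ROUND(SUM(filesize) / (1024 * 1024 * 1024), 2) AS total_gb
FROM (SELECT split(filepath, '/') AS parts, filesize FROM logstash) t
GROUP BY concat_ws('/', parts[0], parts[1], parts[2]);
```

For the third level, add parts[3] to both concat_ws calls.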