INSERT OVERWRITE DIRECTORY problem

Time: 2019-02-25 14:17:00

Tags: hive hiveql hive-query

1)

insert overwrite directory '/user/sample/newfolder'
row format delimited
fields terminated by ', '
select * from emp;

gives me data without a header, even when I use set hive.cli.print.header=true;

I tried running hive -e 'set hive.cli.print.header=true;select * from emp;' > /user/sample/newfolder/sample.xls but it fails with: No such file or directory

2) The data of each record spills onto another line. How can I keep each record on a single line?

ex: 1, ppp, ddd,44,

45,www

but I want it to be 1,ppp,ddd,44,45,www

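1 Answer:

Answer 0: (score: 0)

Adding a header when running INSERT OVERWRITE DIRECTORY is not supported yet, see this Jira

You can concatenate the output file with a header file: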

hadoop fs -cat /user/dir/header.csv /user/dir/output_file.csv | hadoop fs -put - /user/dir/output_w_header.csv
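
If no header file exists in HDFS yet, the header line can be generated on the fly and streamed in front of the data. The following is a minimal sketch, assuming a POSIX shell with the hadoop CLI on the PATH; the function names make_header and put_with_header, the column names, and the paths in the usage comment are hypothetical placeholders, not something from the question.

```shell
# Join the given column names into one comma-separated header line.
# The body runs in a subshell so the IFS change does not leak to the caller.
make_header() (
  IFS=','
  printf '%s\n' "$*"
)

# Stream a header line followed by every file in an HDFS output directory
# into a single new HDFS file.
#   $1 = header line
#   $2 = HDFS directory written by INSERT OVERWRITE DIRECTORY
#   $3 = target HDFS file (hypothetical name chosen by the caller)
put_with_header() {
  {
    printf '%s\n' "$1"
    hadoop fs -cat "$2/*"   # glob is quoted so HDFS, not the local shell, expands it
  } | hadoop fs -put - "$3"
}

# Usage (hypothetical column names and paths):
#   put_with_header "$(make_header id name dept)" /user/sample/newfolder /user/sample/emp_with_header.csv
```

The header should use the same delimiter as the one given in the fields terminated by clause, otherwise the first line will not line up with the data rows.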

Or rewrite your select query like this (the ORDER BY forces a single final reducer, so it may run slowly):

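select * from  -- note: select * also returns order_col; list the columns explicitly to omit it
(
select --header
      0           as order_col,
      'col1_name' as col1,
      'col2_name' as col2,
       ...
      'colN_name' as colN
UNION ALL
select --data
       1                    as order_col,
       cast(col1 as string) as col1, --cast to string so both branches have matching types
       col2, ... coln
  from emp
)s
order by order_col;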
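
As for the hive -e attempt in the question: the shell's > redirect writes to the local filesystem, so it fails when /user/sample/newfolder exists only in HDFS. Below is a minimal sketch of one possible workaround, assuming the hive and hadoop CLIs are available; export_with_header and /tmp/emp.csv are hypothetical names. The Hive CLI prints tab-separated columns, which the sketch converts to commas with tr; this naive conversion would break if field values themselves contain tabs or commas.

```shell
# Run a Hive query with column headers enabled and save the result,
# converted from tab-separated to comma-separated, in a LOCAL file.
#   $1 = query text, $2 = local output path (both supplied by the caller)
export_with_header() {
  hive -e "set hive.cli.print.header=true; $1" | tr '\t' ',' > "$2"
}

# Usage (hypothetical local path; the HDFS directory is the one from the question):
#   export_with_header 'select * from emp' /tmp/emp.csv
#   hadoop fs -put /tmp/emp.csv /user/sample/newfolder/
```

Writing to a local path first and uploading afterwards keeps the redirect on the filesystem the shell can actually see.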