Add some rows at the top of a Hive table

Time: 2019-06-12 14:00:55

Tags: hadoop hive mapreduce bigdata hiveql

I have a table in Hive in this form (before):

AB_dimp|SF_0060H00000nhSrmQAE|EBA Order 1127735|Execute|New From
AB_dimp|SF_0060H00000nhSwkQAE|EBA Order 1127725|Execute|New From
AB_Dimp|SF_0060H00000nhSyDQAU|EBA Order 1127728|Execute|New From

I want these 3 lines to appear at the top of that Hive table, in the following form (after):

[Yellow]
Cat ID|AN_Net|
[network]
AB_dimp|SF_0060H00000nhSkPQAU|EBA Order 1127708|Execute|New From
AB_DIMP|SF_0060H00000nhSl8QAE|EBA Order 1127709|Execute|New From
AB_DIMP|SF_0060H00000nhSrmQAE|EBA Order 1127735|Execute|New From

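How can I achieve this in Hive?

2 answers:

Answer 0: (score: 1)

a.) First, create another table (e.g., NewTable) and insert these 3 records

b.) Now insert the existing data into that other table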

insert into table NewTable select * from ExisitngTable;

c.) Empty ExisitngTable (the table itself has to remain, since step d inserts into it)

d.) Now insert the data from NewTable back into ExisitngTable

insert overwrite table ExisitngTable select * from NewTable;
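
The steps above can be sketched end to end in HiveQL. This is only a sketch under assumptions the question does not confirm: the table is assumed to have a single STRING column (so each header line fits in one value), and the Hive version is assumed to support INSERT ... VALUES (Hive 0.14 or later). NewTable is the hypothetical helper table from step a.

```sql
-- a) Helper table with the same schema as the existing table,
--    holding the 3 header records.
--    Assumption: the table has exactly one STRING column.
CREATE TABLE NewTable LIKE ExisitngTable;

INSERT INTO TABLE NewTable
VALUES ('[Yellow]'), ('Cat ID|AN_Net|'), ('[network]');

-- b) Append the existing data after the header records.
--    INSERT INTO appends; INSERT OVERWRITE here would erase the headers.
INSERT INTO TABLE NewTable
SELECT * FROM ExisitngTable;

-- c) + d) INSERT OVERWRITE replaces the old contents of ExisitngTable
--    and loads the combined data in one statement.
INSERT OVERWRITE TABLE ExisitngTable
SELECT * FROM NewTable;

-- Clean up the helper table.
DROP TABLE NewTable;
```

Note that this only controls which rows are in the table. It does not by itself guarantee the order in which a later SELECT returns them.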

Answer 1: (score: 0)

Use UNION ALL:

select '[Yellow]' as col_name union all
select 'Cat ID|AN_Net|'       union all
select '[network]'            union all
select col_name from your_table;

If you want to add these rows to the table itself, not just select them, you can do it without an intermediate table:

insert overwrite table your_table
select * from 
(
    select '[Yellow]' as col_name union all
    select 'Cat ID|AN_Net|'       union all
    select '[network]'            union all
    select col_name from your_table
)s;

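Keep in mind, though, that rows in a table are not ordered. When you select from a table without ORDER BY, the select runs in parallel on many mappers. The underlying files are split, and each mapper reads its own split. The mappers run in complete isolation from each other and return their results independently, so whichever finishes first returns its rows first. This means that the next time you select from this table, the added rows may well not come back first. Only ORDER BY guarantees the order of the returned rows, and for that you need a column that can be used to sort them, such as an id, or some existing column of yours that is suitable for ORDER BY. If the table is small, it will probably be read by a single mapper, and the rows will come back in their original order, as they are in the underlying file.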

To preserve the order of the rows as in the file, you can add a row_order column and use it in ORDER BY in the outer query:

select DRM_Pln_Parent, opportunityid, opportunity_name
from
(
    SELECT 1 as row_order, '[hier]' as DRM_Pln_Parent, '' as opportunityid, '' as opportunity_name
    UNION ALL
    SELECT 2 as row_order, 'Opportunity ID|SF_AllOpportunities|' as DRM_Pln_Parent, '' as opportunityid, '' as opportunity_name
    UNION ALL
    SELECT 3 as row_order, '[relation]' as DRM_Pln_Parent, '' as opportunityid, '' as opportunity_name
    UNION ALL
    SELECT DISTINCT 4 as row_order, 'SF_AllOpportunities' AS DRM_Pln_Parent,
           CONCAT('SF_', opportunityid) as opportunityid,
           opportunity_name
    from ...
)s
order by row_order
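
The same row_order idea, applied to the single-column example from the start of this answer, might look like the sketch below. The names your_table and col_name are the placeholders used earlier in this answer, not the asker's real schema, and the header text is taken from the question.

```sql
-- Sketch: return the 3 header lines first, then the table's rows.
-- row_order exists only inside the subquery; it drives the sort
-- and is left out of the final output.
SELECT col_name
FROM (
    SELECT 1 AS row_order, '[Yellow]' AS col_name
    UNION ALL
    SELECT 2 AS row_order, 'Cat ID|AN_Net|' AS col_name
    UNION ALL
    SELECT 3 AS row_order, '[network]' AS col_name
    UNION ALL
    -- All data rows share row_order = 4, so they sort after the headers.
    SELECT 4 AS row_order, col_name FROM your_table
) s
ORDER BY row_order;
```

Because every data row gets the same row_order value, this guarantees only that the headers come first. The relative order of the data rows among themselves is still not defined unless a further sort column is added to the ORDER BY.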

For further details, see this answer: https://stackoverflow.com/a/43368113/2700344