The input data and the query are as follows:
SQL> with test as
2 (select 'L1' col from dual union
3 select 'L2' col from dual union
4 select 'L3A' col from dual union
5 select 'L3B' col from dual union
6 select 'L4' col from dual union
7 select 'L6C' col from dual union
8 select 'L8' col from dual union
9 select 'L9' col from dual union
10 select 'L10' col from dual union
11 select 'L11' col from dual union
12 select 'R1D' col from dual union
13 select 'R2A' col from dual union
14 select 'R2B' col from dual union
15 select 'R2Z' col from dual union
16 select 'R11' col from dual)
17 select col from test
18 order by
19 substr(col, 1, 1),
20 to_number(regexp_substr(col, '\d+', 1, 1)),
21 regexp_substr(col, '\w', 1, 3) desc;
COL
---
L1
L2
L3B
L3A
L4
L6C
L8
L9
L10
L11
R1D
R2Z
R2B
R2A
R11
15 rows selected.
SQL>
The input data for the Pig script is as follows:
(1,a,1,2)
(2,a,2,4)
(5,a,7,5)
(6,a,3,1)
(8,a,4,3)
(3,a,8,6)
(7,a,5,8)
(4,a,6,7)
The script is as follows:
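-- Load comma-separated rows. Every field is declared chararray, so ORDER and MAX compare strings.
a = load '/tmp/data/data' using PigStorage(',') as (timestamp:chararray, constant:chararray, data1:chararray, data2:chararray);
-- Per constant: keep the row with the greatest timestamp, plus the group-wide maxima.
b = FOREACH (GROUP a BY constant) {
    ord4 = ORDER a BY timestamp DESC;
    top4 = LIMIT ord4 1;
    -- The first maximum is aliased data1 (the source had "data") so that g4 can reference it by that name.
    GENERATE FLATTEN(top4), MAX(a.data1) AS data1, MAX(a.data2) AS data2;
};
g4 = FOREACH b GENERATE top4::timestamp AS timestamp,
                        top4::constant AS constant,
                        top4::data1 AS curr_data1,
                        top4::data2 AS curr_data2,
                        data1 AS data1,
                        data2 AS data2;
dump g4;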
I also need the timestamp at which each maximum occurs: 3 for data1 and 7 for data2.
The output is shown below:
(8,a,4,3,8,8)
Could you tell me how to achieve this?
Thanks a lot in advance.
Answer 0 (score: 0)
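You only generate 6 fields in g4, hence the output ((8,a,4,3),8,8).
Please be more specific when you say "I also need the timestamp at which each maximum occurs: 3 for data1 and 7 for data2."
What is the expected result for the last two fields?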
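One possible reading of the request is that each maximum should be reported together with the timestamp of the row it came from. The sketch below models that per-group logic in plain Python rather than Pig, using the sample rows from the question. The function name `summarize`, the integer field types, and the order of the four appended fields (max data1, its timestamp, max data2, its timestamp) are assumptions made for illustration, not something the question specifies.

```python
from collections import defaultdict

# Sample rows from the question: (timestamp, constant, data1, data2).
# Values are treated as integers here (an assumption; the script loads chararray).
ROWS = [
    (1, "a", 1, 2),
    (2, "a", 2, 4),
    (5, "a", 7, 5),
    (6, "a", 3, 1),
    (8, "a", 4, 3),
    (3, "a", 8, 6),
    (7, "a", 5, 8),
    (4, "a", 6, 7),
]


def summarize(rows):
    """For each constant, return the latest row followed by
    (max data1, its timestamp, max data2, its timestamp)."""
    groups = defaultdict(list)
    for row in rows:
        groups[row[1]].append(row)

    result = []
    for members in groups.values():
        latest = max(members, key=lambda r: r[0])    # row with the greatest timestamp
        top_d1 = max(members, key=lambda r: r[2])    # row holding the maximum data1
        top_d2 = max(members, key=lambda r: r[3])    # row holding the maximum data2
        result.append(latest + (top_d1[2], top_d1[0], top_d2[3], top_d2[0]))
    return result


if __name__ == "__main__":
    for record in summarize(ROWS):
        print(record)
```

The point of the sketch is that picking the whole row that holds each maximum keeps its timestamp available, whereas an aggregate such as MAX returns only the value. In Pig, the same idea would likely mean adding one more ORDER ... DESC / LIMIT 1 pair per data column inside the nested FOREACH, in the same way the script already does for timestamp.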