猪拉丁语排序

时间:2014-12-26 13:52:00

标签: sorting apache-pig hadoop2

我正在测试Apache Pig文档中的以下示例:

http://pig.apache.org/docs/r0.14.0/basic.html#order-by

但排序功能似乎无效。有什么想法吗?

$ pig -version
Apache Pig version 0.14.0 (r1640057) 
compiled Nov 16 2014, 18:02:05


grunt> a= load 'data' as (c1:int, c2:int, c3:int);
grunt> describe a;
a: {c1: int,c2: int,c3: int}
grunt> dump a;
(1,2,3)
(4,2,1)
(8,3,4)
(4,3,3)
(7,2,5)
(8,4,3)
grunt> result = order a by c1 desc;
grunt> dump result; 
(8,4,3)
(8,3,4)
(4,3,3)
(4,2,1)
(1,2,3)
(7,2,5)
grunt> result = order a by c2 desc;
grunt> dump result;
(8,4,3)
(7,2,5)
(4,2,1)
(1,2,3)
(4,3,3)
(8,3,4)
grunt> result = order a by c3 desc;
grunt> dump result;                
(7,2,5)
(4,3,3)
(8,4,3)
(1,2,3)
(4,2,1)
(8,3,4)

1 个答案:

答案 0 :(得分:0)

您正在使用默认分隔符(tab)加载数据,但实际输入数据未通过选项卡正确分隔。您能否确保输入数据字段由文件'data'中的选项卡分隔。

在下面的示例中,每个输入字段都由制表符分隔,并且工作正常。

data:
1<TAB>2<TAB>3
4<TAB>2<TAB>1
8<TAB>3<TAB>4
4<TAB>3<TAB>3
7<TAB>2<TAB>5
8<TAB>4<TAB>3

grunt> a= load 'data' as (c1:int, c2:int, c3:int);
(1,2,3)
(4,2,1)
(8,3,4)
(4,3,3)
(7,2,5)
(8,4,3)
grunt> result = order a by c1 desc;
grunt> dump result;
(8,3,4)
(8,4,3)
(7,2,5)
(4,2,1)
(4,3,3)
(1,2,3)
grunt> result = order a by c2 desc;
grunt> dump result;
(8,4,3)
(8,3,4)
(4,3,3)
(1,2,3)
(4,2,1)
(7,2,5)
grunt> result = order a by c3 desc;
grunt> dump result; 
(7,2,5)
(8,3,4)
(1,2,3)
(4,3,3)
(8,4,3)
(4,2,1)