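Regular expressions in Pig Latin

Asked: 2017-01-08 05:09:37

Tags: regex apache-pig

I want to search each tuple for the string '15200' (without the quotes) as a complete comma-separated value, not as part of a longer number. So, for the following input: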

15200
15200,4000
4000,15200
4000,15200,4025
152000
152000,4000
4000,152000
4000,152000,4025
115200
115200,4000
4000,115200
4000,115200,4025

The output should be:

15200,15200
15200,4000,15200
4000,15200,15200
4000,15200,4025,15200
152000,-1
152000,4000,-1
4000,152000,-1
4000,152000,4025,-1
115200,-1
115200,4000,-1
4000,115200,-1
4000,115200,4025,-1

My Pig code is as follows:

A = LOAD '/user/test' USING PigStorage() AS (logic:chararray);
B = FOREACH A GENERATE
    logic,
    ((logic matches '(^|,)15200($|,)') ? '15200' : '-1') AS expt;

But when I dump B, I get:

(15200,15200)
(15200,4000,-1)
(4000,15200,-1)
(4000,15200,4025,-1)
(152000,-1)
(152000,4000,-1)
(4000,152000,-1)
(4000,152000,4025,-1)
(115200,-1)
(115200,4000,-1)
(4000,115200,-1)
(4000,115200,4025,-1)

1 Answer:

Answer 0 (score: 0)

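Try this pattern. The matches operator has to match the entire string, so the value is delimited with word boundaries and wrapped in .* on both sides: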

.*?\b15200\b.*

Regex demo: https://regex101.com/r/n6EP1s/2
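
To see why the original pattern labels only the first row, here is a minimal Java sketch. It assumes that Pig's matches operator behaves like Java's Matcher.matches(), i.e. the pattern must consume the whole string, which is consistent with the output shown in the question. The class name PigMatchSketch and the helper label are hypothetical, introduced only for illustration. When the pattern is written inside a Pig string literal, the backslashes will probably need to be doubled ('\\b'), as in the Java source below; that is an assumption worth checking against your Pig version.

```java
import java.util.regex.Pattern;

public class PigMatchSketch {
    // Pattern from the question: fine for a substring search, but it cannot
    // cover a whole line such as "15200,4000".
    static final Pattern ORIGINAL = Pattern.compile("(^|,)15200($|,)");

    // Pattern from the answer: .* on both sides lets it cover the whole line,
    // and \b rejects 152000 and 115200.
    static final Pattern WRAPPED = Pattern.compile(".*?\\b15200\\b.*");

    // Mirrors the ternary in the Pig script, assuming full-string matching.
    static String label(Pattern pattern, String logic) {
        return pattern.matcher(logic).matches() ? "15200" : "-1";
    }

    public static void main(String[] args) {
        String[] inputs = {
            "15200", "15200,4000", "4000,15200", "4000,15200,4025",
            "152000", "152000,4000", "4000,152000", "4000,152000,4025",
            "115200", "115200,4000", "4000,115200", "4000,115200,4025"
        };
        for (String logic : inputs) {
            System.out.println(logic
                + " original=" + label(ORIGINAL, logic)
                + " wrapped=" + label(WRAPPED, logic));
        }
    }
}
```

Under that assumption, the original pattern yields 15200 only for the line that is exactly 15200, while the wrapped pattern yields 15200 for the first four lines and -1 for the rest, which is the output the question asks for.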