Pig: Filter out empty strings

Time: 2015-10-06 23:48:47

Tags: hadoop apache-pig

I am trying to filter both NULLs and empty strings out of my data:

data_filtered = FILTER raw_data by COLUMN_NAME is not null and COLUMN_NAME != '' ;

When I run it, I get the following error:

ERROR org.apache.pig.tools.grunt.Grunt - ERROR 1200: <file jhoughton/temp/temp_script.pig, line 43, column 46>  Unexpected character ' '

How can I resolve this error and filter out both NULLs and empty strings?

2 answers:

Answer 0: (score: 1)

String (in)equality is not tested with != or == in Pig.

The correct syntax is:

data_filtered = FILTER raw_data BY (COLUMN_NAME IS NOT NULL) AND NOT (COLUMN_NAME MATCHES '');
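
As a rough sketch of the same MATCHES idea: the operator compares the whole value against a Java regular expression, so a slightly wider pattern can also drop values that contain only spaces. The pattern ' *' is an assumption added here for illustration and is not part of the original answer.

```pig
-- Sketch only: ' *' is an illustrative pattern, not from the original answer.
-- MATCHES must match the entire value, so ' *' matches the empty string
-- and any value made up solely of space characters.
data_filtered = FILTER raw_data BY (COLUMN_NAME IS NOT NULL)
                                AND NOT (COLUMN_NAME MATCHES ' *');
```

Note that string literals in Pig Latin are written with single quotes.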

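Answer 1: (score: 0)

You can use the TRIM function to strip surrounding whitespace before the comparison, so that whitespace-only values are filtered out too: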

data_filtered = FILTER raw_data by ( COLUMN_NAME is not null and TRIM(COLUMN_NAME) != '' );
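
For context, the filter might sit in a small script like the one below. This is a minimal sketch under stated assumptions: the input path 'input.txt', the tab delimiter, and the single-column schema are hypothetical and should be replaced with your own.

```pig
-- Hypothetical input: a tab-delimited file with one chararray column.
-- The path, delimiter, and schema are assumptions made for this sketch.
raw_data = LOAD 'input.txt' USING PigStorage('\t') AS (COLUMN_NAME:chararray);

-- Keep rows whose value is not NULL and is non-empty after trimming
-- leading and trailing whitespace.
data_filtered = FILTER raw_data BY (COLUMN_NAME IS NOT NULL)
                                AND (TRIM(COLUMN_NAME) != '');

-- Print the surviving rows to the console for a quick check.
DUMP data_filtered;
```

The NULL check is listed first for readability; a NULL value would be dropped by FILTER in any case, since a comparison involving NULL does not evaluate to true.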