I run into a problem when I try to split the following data on two delimiters, | and :.
Data:
1 | c1:11:33 | c2:12
234 | c1:21 | c2:22
33 | c1:31 | c2:32
345 | c1:41 | c2:42
Pig script:
inpt = load '/home/hduser/test1' as (line:chararray);
splt = foreach inpt generate FLATTEN(STRSPLIT($0, '\\|')) ;
id_vals = foreach splt generate $0 as id, FLATTEN(TOBAG(*)) as value;
id_vals3 = foreach id_vals generate id, INDEXOF(value,':') as p, (tuple(chararray,int))STRSPLIT(value,':',2) as vals;
Error:
2015-03-26 18:54:45,724 [main] ERROR org.apache.pig.tools.grunt.Grunt - ERROR 1045:
<line 4, column 87> Could not infer the matching function for org.apache.pig.builtin.STRSPLIT as multiple or none of them fit. Please use an explicit cast.
The problem is in the STRSPLIT(value... call: it expects a chararray but receives a value of NULL type.
Please suggest a fix or an alternative way to solve this.
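For reference, the explicit cast that the error message asks for might look like the sketch below. It assumes the cause is that the fields produced by FLATTEN(STRSPLIT(...)) and FLATTEN(TOBAG(*)) carry no declared type, so value has to be cast to chararray before it is passed to INDEXOF and STRSPLIT. This has not been verified against this exact Pig version, and the outer (tuple(chararray,int)) cast from the original script is left out to keep the sketch focused on the failing call.

```pig
-- Sketch only: same pipeline as the script above, with explicit chararray casts.
inpt = LOAD '/home/hduser/test1' AS (line:chararray);
splt = FOREACH inpt GENERATE FLATTEN(STRSPLIT(line, '\\|'));
id_vals = FOREACH splt GENERATE $0 AS id, FLATTEN(TOBAG(*)) AS value;

-- Assumption: value has no declared type at this point, so cast it
-- explicitly wherever a chararray argument is required.
id_vals3 = FOREACH id_vals GENERATE
    id,
    INDEXOF((chararray)value, ':') AS p,
    STRSPLIT((chararray)value, ':', 2) AS vals;
```

Running DESCRIBE id_vals; before the last statement shows which type Pig has actually assigned to value. Note also that TOBAG(*) includes the id field itself among the values, and that the pieces split on | keep their surrounding spaces, so the result may need filtering and trimming.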