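I have a sample Pig script that reads a CSV file and dumps it to the screen. However, my real data consists of name-value pairs. How can I read a line of name-value pairs and split each pair into its field name and its value?
Data: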
1,Smith,Bob,Business Development
2,Doe,John,Developer
3,Jane,Sally,Tester
Script:
data = LOAD 'example-data.txt' USING PigStorage(',')
AS (id:chararray, last_name:chararray,
first_name:chararray, role:chararray);
DESCRIBE data;
DUMP data;
Output:
data: {id: chararray,last_name: chararray,first_name: chararray,role: chararray}
(1,Smith,Bob,Business Development)
(2,Doe,John,Developer)
(3,Jane,Sally,Tester)
However, given the following input (as name-value pairs), how do I process the data to get the same "data object"?
id=1,last_name=Smith,first_name=Bob,role=Business Development
id=2,last_name=Doe,first_name=John,role=Developer
id=3,last_name=Jane,first_name=Sally,role=Tester
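Answer 0 (score: 0)
See STRSPLIT. Load each comma-separated field as a single chararray, then split it on '=' with a limit of 2 so that only the first '=' separates the name from the value.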
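-- Load each 'name=value' pair as one chararray field.
A = LOAD 'example-data.txt' USING PigStorage(',') AS (f1:chararray, f2:chararray, f3:chararray, f4:chararray);
-- Split each pair on the first '=' into a (name, value) tuple and flatten it.
B = FOREACH A GENERATE
    FLATTEN(STRSPLIT(f1, '=', 2)) AS (n1:chararray, v1:chararray),
    FLATTEN(STRSPLIT(f2, '=', 2)) AS (n2:chararray, v2:chararray),
    FLATTEN(STRSPLIT(f3, '=', 2)) AS (n3:chararray, v3:chararray),
    FLATTEN(STRSPLIT(f4, '=', 2)) AS (n4:chararray, v4:chararray);
-- Keep only the values, renamed to match the schema of the original script.
C = FOREACH B GENERATE v1 AS id, v2 AS last_name, v3 AS first_name, v4 AS role;
DESCRIBE C;
DUMP C;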
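
If the field names themselves are not needed, the split and the projection can probably be collapsed into a single step. The following is only a sketch of that idea, not part of the original answer. It assumes the pairs always appear in the same order (id, last_name, first_name, role) and that no value contains a comma. The alias names raw and data are chosen here for illustration, and REGEX_EXTRACT is used in place of STRSPLIT to keep everything after the first '='.

```pig
-- Load each comma-separated field as a raw 'name=value' string.
raw = LOAD 'example-data.txt' USING PigStorage(',')
      AS (f1:chararray, f2:chararray, f3:chararray, f4:chararray);

-- For each field, capture the text after the first '=' (group 1)
-- and name it after the column it is assumed to hold.
data = FOREACH raw GENERATE
       REGEX_EXTRACT(f1, '[^=]*=(.*)', 1) AS id,
       REGEX_EXTRACT(f2, '[^=]*=(.*)', 1) AS last_name,
       REGEX_EXTRACT(f3, '[^=]*=(.*)', 1) AS first_name,
       REGEX_EXTRACT(f4, '[^=]*=(.*)', 1) AS role;

DESCRIBE data;
DUMP data;
```

Because the pattern covers the whole field, a field without an '=' yields null rather than a partial value. If the order of the pairs can vary from line to line, this positional approach is not enough, and the names produced by the split would have to be inspected instead.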