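I have a file that contains the following lines:
3124,"hello...",ku4
3125,"hello,hi",ab2
I want to load the file so that it has three columns. I am using PigStorage(','), but it also splits "hello,hi" into two fields. I want it to stay in a single field.
How can I do this?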
Answer 0 (score: 0)
You can write your own custom UDF, or use the CSVLoader from piggybank.jar.
-- Get a piggybank.jar that is compatible with your Pig version and register
-- it in your Pig script by pointing to the location of the jar file
REGISTER piggybank.jar;
-- CSVLoader splits on commas but keeps double-quoted fields intact
A = LOAD 'test.txt' USING org.apache.pig.piggybank.storage.CSVLoader() AS (f1:int, f2:chararray, f3:chararray);
B = FOREACH A GENERATE f1, f2, f3;
DUMP B;
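
To see why the second row breaks under a plain comma split, here is a small sketch outside Pig. It is written in Python only as an analogy: `str.split` stands in for the naive splitting the question describes with PigStorage(','), and Python's `csv` module stands in for a quote-aware loader. It is not how CSVLoader is implemented.

```python
import csv

# The two sample rows from the question.
rows = ['3124,"hello...",ku4', '3125,"hello,hi",ab2']

# Naive split: every comma is treated as a delimiter, quoted or not.
naive = [line.split(",") for line in rows]

# Quote-aware split: commas inside double quotes stay inside the field.
quoted = list(csv.reader(rows))

for n, q in zip(naive, quoted):
    print(len(n), n)
    print(len(q), q)
```

The naive split yields four fields for the second row, while the quote-aware parse yields three, which is the behavior the question asks for.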