I have the following data in a single file:
"HD",003498,"20160913:17:04:10","D3ZYE",1
"EH","XXX-1985977-1",1,"01","20151215","20151215","20151229","20151215","2304",,,"36-126481000",1340.74,61808.00,1126.62,0.00,214.12,0.00,0.00,0.00,"30","20151229","00653845",,,"PARTS","001","ABI","20151215","Y","Y","N","36-126481000",
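I want to use Pig to read this single file and then split it into separate files based on the first column. I have also been looking for a way to first treat each record as having the following structure: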
recTypCd,recordData
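and then later treat recordData as a CSV record.
The goal is that, once records of the same type are stored in their own files, I can load each set into its own external Hive table using the CSV SerDe.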
Answer 0 (score: 0)
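You can use the SPLIT operator in Pig, with one condition per record type.
For example, assuming the records are in a relation named lines that has a recTypCd field:
SPLIT lines INTO hd1 IF recTypCd == 'HD', hd2 IF ......;
STORE hd1 INTO 'op1'; STORE hd2 INTO 'op2';
Note that the string comparison is case-sensitive, so the literal must match the value in the data ("HD", not "hd").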
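A fuller sketch of the same idea, covering the recTypCd,recordData structure asked about in the question, might look like the following. The input and output paths and the relation names (raw, typed, hd, eh) are made up for illustration, and the regular expression assumes the first field is always double-quoted and contains no embedded quotes, as in the two sample lines.

```pig
-- Read each line as one chararray so that Pig does not split on commas yet.
-- 'input/data.txt' is a hypothetical path.
raw = LOAD 'input/data.txt' USING TextLoader() AS (line:chararray);

-- Group 1 is the record type without its quotes; group 2 is the rest of the line.
typed = FOREACH raw GENERATE
    REGEX_EXTRACT(line, '^"([^"]*)",(.*)$', 1) AS recTypCd,
    REGEX_EXTRACT(line, '^"([^"]*)",(.*)$', 2) AS recordData;

-- One output relation per record type seen in the sample data.
SPLIT typed INTO hd IF recTypCd == 'HD', eh IF recTypCd == 'EH';

-- Keep only the CSV payload so that each output holds a single record layout.
hd_out = FOREACH hd GENERATE recordData;
eh_out = FOREACH eh GENERATE recordData;

-- Hypothetical output directories.
STORE hd_out INTO 'output/hd';
STORE eh_out INTO 'output/eh';
```

Lines whose type is neither HD nor EH, or that do not match the pattern, are dropped by this sketch; add another condition to the SPLIT for each extra record type. Each STORE writes a directory of part files, and an external Hive table can point at that directory.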