How do I convert a bag to an array of numeric values?

Asked: 2015-01-09 16:05:10

Tags: hadoop apache-pig

I am trying to convert the following schema:

{
  id: chararray,
  v: chararray,
  paid: chararray,
  ts: {(ts: int)}
}

into the following JSON output:

{
  "id": "abcdef123456",
  "v": "some identifier",
  "paid": "another identifier",
  "ts": [ 1, 2, 3, 4, 5, 6 ]
}

I know how to generate JSON output, but I cannot figure out how to convert the ts attribute in my Pig schema into an array of numeric values.

The number of items in the ts bag is not known in advance, but they all share the same schema (ts: int).

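1 Answer:

Answer 0: (score: 0)

Pig has no array data type. One option is to build the bracketed list as a string with BagToString and CONCAT, as in the script below. Note that this emits ts as a JSON string that looks like an array, not as a true JSON array.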

Input:

1       1       100     {(1),(2),(3)}
2       2       200     {(4),(5)}
3       3       300     {(1),(2),(3),(4),(5),(6)}

Pig script:

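-- Load tab-separated records; ts is a bag of single-field int tuples.
A = LOAD 'input' USING PigStorage() AS (id: chararray, v: chararray, paid: chararray, ts: {(ts: int)});
-- Join the bag's values with commas and wrap the result in square brackets.
B = FOREACH A GENERATE id, v, paid, CONCAT('[', BagToString(ts, ','), ']') AS ts;
STORE B INTO 'output' USING JsonStorage();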

Output:

{"id":"1","v":"1","paid":"100","ts":"[1,2,3]"}
{"id":"2","v":"2","paid":"200","ts":"[4,5]"}
{"id":"3","v":"3","paid":"300","ts":"[1,2,3,4,5,6]"}
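
In the output above, ts is still a quoted string rather than a real JSON array. If a true numeric array is needed, one possibility is to post-process the stored lines outside Pig. The following is only a rough sketch in Python, not part of the original answer; the function name fix_ts and the sample lines are made up for illustration, and it assumes each output line is a single JSON object whose ts value is a string such as "[1,2,3]".

```python
import json


def fix_ts(line):
    """Hypothetical helper: parse one output line and turn "ts" into a list of ints."""
    record = json.loads(line)
    ts = record.get("ts")
    # The text inside "ts" (for example "[1,2,3]") is itself valid JSON,
    # so a second json.loads call converts it into a Python list.
    if isinstance(ts, str):
        record["ts"] = json.loads(ts)
    return record


if __name__ == "__main__":
    # Sample lines copied from the output shown in the answer.
    sample_lines = [
        '{"id":"1","v":"1","paid":"100","ts":"[1,2,3]"}',
        '{"id":"2","v":"2","paid":"200","ts":"[4,5]"}',
    ]
    for sample in sample_lines:
        print(json.dumps(fix_ts(sample)))
```

Records whose ts is missing or null are passed through unchanged, since the isinstance check only rewrites string values.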