My data looks like this:
{(2000),(1800),(2700)}
{(2014),(1500),(1900)} etc.
I wrote a Java UDF:
DataBag bag = (DataBag) top3.get(0);
Tuple categoryCode = null;
if (bag.size() == 0)
    return null;
for (Iterator<Tuple> code = bag.iterator(); code.hasNext();)
    categoryCode = code.next();
return categoryCode.get(0).toString();
I want my output to look like:
2000,1800,2700
2014,1500,1900 etc
But my UDF outputs:
2000
2014 etc
Is there another way to do this? Any input would be appreciated.
Answer 0 (score: 1)
It's actually quite simple, take a look:
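import java.io.IOException;
import java.util.Iterator;
import org.apache.pig.EvalFunc;
import org.apache.pig.data.DataBag;
import org.apache.pig.data.Tuple;
import org.apache.pig.data.TupleFactory;

public class YourClass extends EvalFunc<String> {
    @Override
    public String exec(Tuple input) throws IOException {
        DataBag bag = (DataBag) input.get(0);
        Tuple categoryCode = null;
        // Auxiliary tuple that collects the first field of every tuple in the bag.
        // It is sized for three entries, which matches the sample data.
        Tuple auxiliary = TupleFactory.getInstance().newTuple(3);
        // Index of the next cell to fill in the auxiliary tuple
        int i = 0;
        for (Iterator<Tuple> code = bag.iterator(); code.hasNext();) {
            categoryCode = code.next();
            // You can use append() instead if you don't know the size
            // of the tuple from the very beginning
            auxiliary.set(i, categoryCode.get(0).toString());
            i += 1;
        }
        // Join all collected fields into one comma-separated string
        return auxiliary.toDelimitedString(",");
    }
}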
It's easiest to collect the values in an auxiliary tuple and then just call the instance method toDelimitedString(). Pretty simple.
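The tuple in the answer is sized for exactly three entries, which fits the sample data. If the bag could hold any number of tuples, the same join can be written without a fixed size. Below is a minimal sketch of that logic in plain Java so it runs without Pig on the classpath. The names BagJoinSketch and joinFirstFields are made up for illustration, and a List<List<Object>> stands in for a DataBag of Tuples; this is not Pig's API. Rendering a null field as an empty string is also an assumption.

```java
import java.util.Arrays;
import java.util.List;
import java.util.StringJoiner;

// Hypothetical stand-in for the UDF body: a "bag" is modeled as a list of
// "tuples", and each tuple as a list of fields, so no Pig classes are needed.
class BagJoinSketch {

    // Joins the first field of every tuple with commas, for any bag size.
    // Returns null for a null or empty bag, like the check in the question.
    static String joinFirstFields(List<List<Object>> bag) {
        if (bag == null || bag.isEmpty()) {
            return null;
        }
        StringJoiner joined = new StringJoiner(",");
        for (List<Object> tuple : bag) {
            Object first = tuple.get(0);
            // Assumption: a null field is rendered as an empty string.
            joined.add(first == null ? "" : first.toString());
        }
        return joined.toString();
    }

    public static void main(String[] args) {
        List<List<Object>> bag = Arrays.asList(
                Arrays.<Object>asList(2000),
                Arrays.<Object>asList(1800),
                Arrays.<Object>asList(2700));
        System.out.println(joinFirstFields(bag)); // prints 2000,1800,2700
    }
}
```

Inside a real UDF the loop would iterate over the DataBag instead of a List, and appending to an initially empty tuple (as the comment in the answer suggests) would play the role of the StringJoiner here.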