I am trying to implement a Gamma distribution in Apache Beam. First, I read the CSV file using Apache Beam's TextIO class:
Pipeline p = Pipeline.create();
p.apply(TextIO.read().from("gs://path/to/file.csv"));
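Next, I apply a transform that parses each line of the CSV file and returns an object. This is the only place where I attempt the Gamma distribution operation: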
.apply(ParDo.of(new DoFn<String, Entity>() {
    @ProcessElement
    public void processElement(ProcessContext c) {
        String[] strArr = c.element().split(",");
        ClassxNorms xn = new ClassxNorms();
        xn.setDuration(Double.parseDouble(strArr[0]));
        xn.setAlpha(Double.parseDouble(strArr[1]));
        xn.setBeta(Double.parseDouble(strArr[2]));
        GammaDistribution gdValue = new GammaDistribution(Double.parseDouble(strArr[0]), Double.parseDouble(strArr[1]), Double.parseDouble(strArr[2]));
        System.out.println("gdValue : " + gdValue);
        c.output(xn);
    }
}));
I created a BeamRecord, and in the next step I convert the BeamRecord to a string and write the final output to Google Cloud Storage:
PCollection<String> gs_output_final = xnorm_trig.apply(ParDo.of(new DoFn<BeamRecord, String>() {
    private static final long serialVersionUID = 1L;

    @ProcessElement
    public void processElement(ProcessContext c) {
        c.output(c.element().toString());
        System.out.println(c.element().toString());
    }
}));
gs_output_final.apply(TextIO.write().to("gs://output/op_1/Q40test111"));
I get output, but the Gamma distribution operation is not being applied. Any help would be greatly appreciated.
Answer 0 (score: 1)
I was able to implement the gamma distribution in Apache Beam. Here is the code snippet for reference:
.apply(ParDo.of(new DoFn<String, ClassxNorms>() {
    @ProcessElement
    public void processElement(ProcessContext c) throws ParseException {
        // Split the CSV line into its columns.
        String[] strArr = c.element().split(",");
        ClassxNorms xn = new ClassxNorms();
        // Build the distribution from columns 11 and 12 (alpha, beta) and evaluate its CDF at column 6 (duration).
        double sample = new GammaDistribution(Double.parseDouble(strArr[11]), Double.parseDouble(strArr[12])).cumulativeProbability(Double.parseDouble(strArr[6]));
        xn.setDuration(Double.parseDouble(strArr[6]));
        xn.setAlpha(Double.parseDouble(strArr[11]));
        xn.setBeta(Double.parseDouble(strArr[12]));
        xn.setVolume(Double.parseDouble(strArr[13]));
        xn.setSpend(Double.parseDouble(strArr[14]));
        xn.setEfficiency(Double.parseDouble(strArr[15]));
        xn.setXnorm(Double.parseDouble(strArr[16]));
        xn.setKey(strArr[17]);
        // Store the computed value on the output object so it reaches the next step.
        xn.setGamma(sample);
        c.output(xn);
    }
}));
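For anyone wondering what the cumulativeProbability call in the snippet above actually returns: for a gamma distribution with a given shape and scale, the CDF at x is the regularized lower incomplete gamma function P(shape, x / scale). Below is a minimal pure-Java sketch of that calculation, so it can be checked without Beam or any extra library. The class GammaCdfSketch and its method names are made up for illustration. It uses a textbook series / continued-fraction approach with a Lanczos approximation of log-gamma, which is not necessarily how the library behind GammaDistribution computes it, and it is only accurate to roughly 9 or 10 significant digits.

```java
/** Illustrative sketch only: hypothetical helper, not the library's implementation. */
final class GammaCdfSketch {
    private static final double EPS = 1e-14;    // relative convergence tolerance
    private static final double TINY = 1e-300;  // guard against division by zero
    private static final int MAX_ITER = 10_000;

    // Lanczos coefficients for the log-gamma approximation (g = 5, 6 terms).
    private static final double[] LANCZOS = {
        76.18009172947146, -86.50532032941677, 24.01409824083091,
        -1.231739572450155, 0.1208650973866179e-2, -0.5395239384953e-5
    };

    /** Approximates ln(Gamma(x)) for x > 0. */
    static double logGamma(double x) {
        double y = x;
        double tmp = x + 5.5;
        tmp -= (x + 0.5) * Math.log(tmp);
        double ser = 1.000000000190015;
        for (double coef : LANCZOS) {
            y += 1.0;
            ser += coef / y;
        }
        return -tmp + Math.log(2.5066282746310005 * ser / x);
    }

    /** Regularized lower incomplete gamma function P(a, x) for a > 0. */
    static double regularizedGammaP(double a, double x) {
        if (x <= 0.0) {
            return 0.0;
        }
        double logPrefactor = -x + a * Math.log(x) - logGamma(a);
        if (x < a + 1.0) {
            // Power series; converges quickly when x is small relative to a.
            double term = 1.0 / a;
            double sum = term;
            for (int n = 1; n <= MAX_ITER; n++) {
                term *= x / (a + n);
                sum += term;
                if (Math.abs(term) < Math.abs(sum) * EPS) {
                    break;
                }
            }
            return sum * Math.exp(logPrefactor);
        }
        // Continued fraction for Q = 1 - P, evaluated with the modified Lentz method.
        double b = x + 1.0 - a;
        double c = 1.0 / TINY;
        double d = 1.0 / b;
        double h = d;
        for (int i = 1; i <= MAX_ITER; i++) {
            double an = -i * (i - a);
            b += 2.0;
            d = an * d + b;
            if (Math.abs(d) < TINY) d = TINY;
            c = b + an / c;
            if (Math.abs(c) < TINY) c = TINY;
            d = 1.0 / d;
            double delta = d * c;
            h *= delta;
            if (Math.abs(delta - 1.0) < EPS) {
                break;
            }
        }
        return 1.0 - Math.exp(logPrefactor) * h;
    }

    /** Gamma CDF at x, parameterized by shape and scale (both must be positive). */
    static double cdf(double shape, double scale, double x) {
        if (!(shape > 0.0) || !(scale > 0.0)) {
            throw new IllegalArgumentException("shape and scale must be positive");
        }
        return regularizedGammaP(shape, x / scale);
    }

    public static void main(String[] args) {
        // With shape = 1 the gamma distribution is exponential, so this should be close to 1 - exp(-1).
        System.out.println(cdf(1.0, 2.0, 2.0));
    }
}
```

In terms of the snippet above, cdf(alpha, beta, duration) would correspond to the value passed to setGamma, assuming alpha is the shape and beta is the scale (the argument order of the two-argument GammaDistribution constructor in Apache Commons Math, which is presumably the library used here since the post does not show its imports). If beta in your data is a rate rather than a scale, pass 1 / beta instead.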