I am trying to build a custom sink for unzipping files.
I have this simple code:
public static class ZipIO {
    public static class Sink extends com.google.cloud.dataflow.sdk.io.Sink<String> {
        private static final long serialVersionUID = -7414200726778377175L;
        private final String unzipTarget;

        public Sink withDestinationPath(String s) {
            // Compare string contents; `s != ""` only compares object references.
            if (s != null && !s.isEmpty()) {
                return new Sink(s);
            } else {
                throw new IllegalArgumentException("must assign destination path");
            }
        }

        protected Sink(String path) {
            this.unzipTarget = path;
        }

        @Override
        public void validate(PipelineOptions po) {
            if (unzipTarget == null) {
                throw new RuntimeException();
            }
        }

        @Override
        public ZipFileWriteOperation createWriteOperation(PipelineOptions po) {
            return new ZipFileWriteOperation(this);
        }
    }

    private static class ZipFileWriteOperation extends WriteOperation<String, UnzipResult> {
        private static final long serialVersionUID = 7976541367499831605L;
        private final ZipIO.Sink sink;

        public ZipFileWriteOperation(ZipIO.Sink sink) {
            this.sink = sink;
        }

        @Override
        public void initialize(PipelineOptions po) throws Exception {
        }

        @Override
        public void finalize(Iterable<UnzipResult> writerResults, PipelineOptions po) throws Exception {
            // Sum the per-writer counts reported by each ZipWriter.
            long totalFiles = 0;
            for (UnzipResult r : writerResults) {
                totalFiles += r.filesUnziped;
            }
            LOG.info("Unzipped {} Files", totalFiles);
        }

        @Override
        public ZipIO.Sink getSink() {
            return sink;
        }

        @Override
        public ZipWriter createWriter(PipelineOptions po) throws Exception {
            return new ZipWriter(this);
        }
    }

    private static class ZipWriter extends Writer<String, UnzipResult> {
        private final ZipFileWriteOperation writeOp;
        public long totalUnzipped = 0;

        ZipWriter(ZipFileWriteOperation writeOp) {
            this.writeOp = writeOp;
        }

        @Override
        public void open(String uID) throws Exception {
        }

        @Override
        public void write(String p) {
            System.out.println(p);
        }

        @Override
        public UnzipResult close() throws Exception {
            return new UnzipResult(this.totalUnzipped);
        }

        @Override
        public ZipFileWriteOperation getWriteOperation() {
            return writeOp;
        }
    }

    private static class UnzipResult implements Serializable {
        private static final long serialVersionUID = -8504626439217544799L;
        public long filesUnziped = 0;

        public UnzipResult(long filesUnziped) {
            this.filesUnziped = filesUnziped;
        }
    }
}
}
Processing fails with this error:
Exception in thread "main" java.lang.IllegalArgumentException: Cannot setCoder(null)
    at com.google.cloud.dataflow.sdk.values.TypedPValue.setCoder(TypedPValue.java:67)
    at com.google.cloud.dataflow.sdk.values.PCollection.setCoder(PCollection.java:150)
    at com.google.cloud.dataflow.sdk.io.Write$Bound.createWrite(Write.java:380)
    at com.google.cloud.dataflow.sdk.io.Write$Bound.apply(Write.java:112)
    at com.google.cloud.dataflow.sdk.runners.DataflowPipelineRunner$BatchWrite.apply(DataflowPipelineRunner.java:2118)
    at com.google.cloud.dataflow.sdk.runners.DataflowPipelineRunner$BatchWrite.apply(DataflowPipelineRunner.java:2099)
    at com.google.cloud.dataflow.sdk.runners.PipelineRunner.apply(PipelineRunner.java:75)
    at com.google.cloud.dataflow.sdk.runners.DataflowPipelineRunner.apply(DataflowPipelineRunner.java:465)
    at com.google.cloud.dataflow.sdk.runners.BlockingDataflowPipelineRunner.apply(BlockingDataflowPipelineRunner.java:169)
    at com.google.cloud.dataflow.sdk.Pipeline.applyInternal(Pipeline.java:368)
    at com.google.cloud.dataflow.sdk.Pipeline.applyTransform(Pipeline.java:275)
    at com.google.cloud.dataflow.sdk.runners.DataflowPipelineRunner.apply(DataflowPipelineRunner.java:463)
    at com.google.cloud.dataflow.sdk.runners.BlockingDataflowPipelineRunner.apply(BlockingDataflowPipelineRunner.java:169)
    at com.google.cloud.dataflow.sdk.Pipeline.applyInternal(Pipeline.java:368)
    at com.google.cloud.dataflow.sdk.Pipeline.applyTransform(Pipeline.java:291)
    at com.google.cloud.dataflow.sdk.values.PCollection.apply(PCollection.java:174)
    at com.mcd.de.tlogdataflow.StarterPipeline.main(StarterPipeline.java:93)
Any help is appreciated.
Thanks & BR, Philipp
Answer 0 (score: 0)
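This crash is caused by a bug in the Dataflow Java SDK (specifically, this line), which is also present in the Apache Beam (incubating) Java SDK.
The method Sink.WriteOperation#getWriterResultCoder() must always be overridden, but we failed to mark it as abstract. This is fixed in Beam but unchanged in the Dataflow SDK. You should override this method and return an appropriate coder.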
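You have a few options for coming up with a coder:
VarLongCoder or BigEndianLongCoder: if you use a long in place of the UnzipResult structure, these can be used as-is.
SerializableCoder.of(UnzipResult.class): this works with UnzipResult unchanged, since it already implements Serializable.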
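As a rough illustration of the SerializableCoder option: that coder relies on standard Java serialization, so the main requirement is that UnzipResult survives a serialize/deserialize round trip. The sketch below checks that with plain JDK classes only. The class name UnzipResultRoundTrip and the encode/decode helpers are hypothetical names used for illustration, and the override shown in the comment assumes the Dataflow SDK 1.x signature of getWriterResultCoder(); verify it against the SDK version you actually use.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

// Hypothetical stand-alone check; no Dataflow classes are used here.
//
// In ZipFileWriteOperation, the override suggested in the answer would look
// roughly like this (assumed Dataflow SDK 1.x signature):
//
//   @Override
//   public Coder<UnzipResult> getWriterResultCoder() {
//       return SerializableCoder.of(UnzipResult.class);
//   }
public class UnzipResultRoundTrip {

    // Stand-in for the question's UnzipResult: same field, same Serializable contract.
    static class UnzipResult implements Serializable {
        private static final long serialVersionUID = -8504626439217544799L;
        public long filesUnziped = 0;

        UnzipResult(long filesUnziped) {
            this.filesUnziped = filesUnziped;
        }
    }

    // Serialize the result to bytes with standard Java serialization.
    static byte[] encode(UnzipResult result) throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(result);
        }
        return bytes.toByteArray();
    }

    // Rebuild the result from the serialized bytes.
    static UnzipResult decode(byte[] data) throws IOException, ClassNotFoundException {
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(data))) {
            return (UnzipResult) in.readObject();
        }
    }

    public static void main(String[] args) throws Exception {
        UnzipResult copy = decode(encode(new UnzipResult(42)));
        System.out.println(copy.filesUnziped); // prints 42
    }
}
```

If the round trip works, returning SerializableCoder.of(UnzipResult.class) from the override should be enough to stop setCoder(null) from being hit. Switching the writer result type to Long with VarLongCoder would be more compact, at the cost of changing the type parameters of the WriteOperation and Writer.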