我已经使用DirectRunner和DataflowRunner使用以下代码测试了流输入中的sideinput:
public class Testsideinput {
private static final Logger LOG = LoggerFactory.getLogger(Testsideinput.class);
static class RefreshCache extends DoFn<Long, String> {
private static final long serialVersionUID = 1;
private static final Random RANDOM = new Random();
@ProcessElement
public void processElement(ProcessContext c) {
c.output("A"+c.element());
c.output("B"+c.element());
c.output("C"+c.element());
c.output("D"+c.element());
c.output("E"+c.element());
c.output("F"+c.element());
}
}
public static void main(String[] args) {
PipelineOptions options = PipelineOptionsFactory.fromArgs(args).create();
Pipeline pipeline = Pipeline.create(options);
final PCollectionView<List<String>> sideInput2 =
pipeline.apply("TextIO", TextIO.read().from("<Put your gs://>))
.apply("viewTags", View.asList());
final PCollectionView<List<String>> sideInput =
pipeline.apply("GenerateSequence",
GenerateSequence
.from(0)
.withRate(1, Duration.standardSeconds(1)))
.apply("Window GenerateSequence",
Window.into(FixedWindows.of(Duration.standardSeconds(5))))
.apply("Counts", Combine.globally(Sum.ofLongs()).withoutDefaults())
.apply("RefreshCache", ParDo.of(new RefreshCache()))
.apply("viewTags", View.asList());
final PubsubIO.Read<PubsubMessage> pubsubRead =
PubsubIO.readMessages()
.withIdAttribute("id")
.withTimestampAttribute("ts")
.fromTopic("<put your topic>");
// PCollection<KV<String,Long>> taxi =;
PCollection<String> taxi =
pipeline.apply("Read from", pubsubRead)
.apply("Window Fixed",
Window.into(FixedWindows.of(Duration.standardSeconds(15))))
.apply(MapElements.via(new PubSubToTableRow()))
.apply("key rides by rideid",
MapElements
.into(TypeDescriptors
.kvs(TypeDescriptors.strings(),
TypeDescriptor.of(TableRow.class)))
.via(ride -> KV.of(ride.get("ride_id").toString(), ride)))
.apply("Count Per Element", Count.perKey())
.apply(
ParDo.of(new DoFn<KV<String,Long>, String>() {
@ProcessElement
public void processElement(
@Element KV<String,Long> value,
OutputReceiver<String> out, ProcessContext c) {
// In our DoFn, access the side input.
List<String> sideinput = c.sideInput(sideInput);
List<String> sideinput2 = c.sideInput(sideInput2);
LOG.info("sideinput" + sideinput.toString());
LOG.info("sideinput2 " + sideinput2.toString());
LOG.info("value " + value);
out.output("test");
}
}).withSideInputs(sideInput,sideInput2));
pipeline.run();
}
我在DirectRunner上具有边输入(列表和映射)的所有值,但在DataflowRunner中没有任何价值(View.CreatePCollectionView/ParDo(StreamingPCollectionViewWriter)
步骤没有输出)
您有解决此问题的想法吗?