Unable to read an XML file stored in a GCS bucket using XmlSource

Time: 2017-08-30 10:49:52

Tags: google-cloud-dataflow apache-beam

Has anyone tried this code before?

XmlSource<String> source = XmlSource.<String>from("gs://balajee_test/sample_3.xml")
              .withRootElement("book")
              .withRecordElement("author")
              .withRecordElement("title")
              .withRecordElement("genre")
              .withRecordElement("price")
              .withRecordElement("description")
              .withRecordClass(XMLFormatter.class);

PCollection<String> output = p.apply(Read.from(source));

https://beam.apache.org/documentation/sdks/javadoc/0.4.0/org/apache/beam/sdk/io/XmlSource.html

  

org.apache.beam.sdk.io.xml.XmlSource

I hope I am using the correct 'XmlSource' class, but the method 'from("gs://balajee_test/sample_3.xml")' still cannot be resolved, and I get the same compilation error. The error message is:

  

The method from(String) is undefined for the type XmlSource

This question may be too silly, but I really need to solve it in order to read XML files stored in a GCS bucket.

1 answer:

Answer 0 (score: 0)

From the comments, it seems that the SDK in use is 2.0, which has a new way to define a read from XML. Check the new documentation for how to read.

SDK IO documentation (for 2.0.0) can be found here: beam.apache.org/documentation/sdks/javadoc/2.0.0
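
As far as I can tell from the 2.0.0 javadoc, the read is now expressed through XmlIO.<T>read().from(...) in org.apache.beam.sdk.io.xml rather than through XmlSource.from(...), and it still takes one root element and one record element. That suggests (an assumption on my part, since the contents of sample_3.xml are not shown) that the repeated withRecordElement calls in the question are not doing what was intended: the record element is the single repeated tag, and author, title and so on would be fields inside it. The sketch below is plain JDK (StAX), not Beam, and only illustrates that root/record layout. The element names "catalog" and "book" and the sample data are hypothetical.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

final class RecordElementSketch {

    // Hypothetical sample; the real layout of sample_3.xml is not shown in the question.
    static final String SAMPLE =
        "<catalog>"
      + "<book><author>A. Writer</author><title>First</title></book>"
      + "<book><author>B. Writer</author><title>Second</title></book>"
      + "</catalog>";

    /** Collects the <title> text of every recordElement nested under rootElement. */
    static List<String> titles(String xml, String rootElement, String recordElement) {
        List<String> out = new ArrayList<>();
        try {
            XMLStreamReader r = XMLInputFactory.newInstance()
                    .createXMLStreamReader(new StringReader(xml));
            boolean inRoot = false;
            boolean inRecord = false;
            while (r.hasNext()) {
                int event = r.next();
                if (event == XMLStreamConstants.START_ELEMENT) {
                    String name = r.getLocalName();
                    if (name.equals(rootElement)) {
                        inRoot = true;               // entered the single root element
                    } else if (inRoot && name.equals(recordElement)) {
                        inRecord = true;             // one record starts here
                    } else if (inRecord && name.equals("title")) {
                        out.add(r.getElementText()); // a field inside the record
                    }
                } else if (event == XMLStreamConstants.END_ELEMENT) {
                    String name = r.getLocalName();
                    if (name.equals(recordElement)) {
                        inRecord = false;
                    } else if (name.equals(rootElement)) {
                        inRoot = false;
                    }
                }
            }
            r.close();
        } catch (XMLStreamException e) {
            throw new IllegalStateException("Malformed XML", e);
        }
        return out;
    }

    public static void main(String[] args) {
        // One root ("catalog"), one repeated record element ("book").
        System.out.println(titles(SAMPLE, "catalog", "book"));
    }
}
```

If the real file follows the same shape, the Beam read would name the outer tag as the root element and the repeated tag as the record element, with the record class (XMLFormatter in the question) mapping the inner fields.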