Google Dataflow "No filesystem found for scheme gs"

Date: 2018-12-13 11:45:51

Tags: java exception google-cloud-platform google-cloud-storage google-cloud-dataflow

I am trying to run a Google Dataflow application, but it throws this exception:

java.lang.IllegalArgumentException: No filesystem found for scheme gs
    at org.apache.beam.sdk.io.FileSystems.getFileSystemInternal(FileSystems.java:459)
    at org.apache.beam.sdk.io.FileSystems.matchNewResource(FileSystems.java:529)
    at org.apache.beam.sdk.io.FileBasedSink.convertToFileResourceIfPossible(FileBasedSink.java:213)
    at org.apache.beam.sdk.io.TextIO$TypedWrite.to(TextIO.java:700)
    at org.apache.beam.sdk.io.TextIO$Write.to(TextIO.java:1028)
    at br.com.sulamerica.mecsas.ExportacaoDadosPipeline.main(ExportacaoDadosPipeline.java:52)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:498)
    at org.codehaus.mojo.exec.ExecJavaMojo$1.run(ExecJavaMojo.java:282)
    at java.lang.Thread.run(Thread.java:748)

Here is the relevant part of my pipeline code:

Pipeline.create()
        .apply(PubsubIO.readStrings().fromSubscription(subscription))
        .apply(new KeyExportacaoDadosToEntityTransform())
        .apply(new ListKeyEmpresaSelecionadasTransform())
        .apply(ParDo.of(new DoFn<List<Entity>, String>() {
            @ProcessElement
            public void processElement(ProcessContext c){
                c.output(
                    c.element().stream()
                        .map(e -> e.getString("dscRazaoSocial"))
                        .collect(Collectors.joining("\r\n"))
                );
            }
        }))
        .apply(TextIO.write().to("gs://<my bucket>"))
        .getPipeline()
        .run();

Here is the command used to run the pipeline:

mvn -Pdataflow-runner compile exec:java \
  -Dexec.mainClass=br.com.xpto.foo.ExportacaoDadosPipeline \
  -Dexec.args="--project=<projectID>\
  --stagingLocation=gs://dataflow-xpto/exportacao/staging \
  --output=gs://dataflow-xpto/exportacao/output \
  --runner=DataflowRunner"  

2 Answers:

Answer 0 (score: 1)

I was dealing with the same problem. If you use Maven to build an executable jar, your shade plugin configuration should look like this:

<configuration>
    <transformers>
        <transformer implementation="org.apache.maven.plugins.shade.resource.ServicesResourceTransformer"/>
        <!-- add Main-Class to manifest file -->
        <transformer implementation="org.apache.maven.plugins.shade.resource.ManifestResourceTransformer">
            <mainClass>com.main.Application</mainClass>
        </transformer>
    </transformers>
</configuration>
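
This configuration block goes inside the maven-shade-plugin declaration in pom.xml. The ServicesResourceTransformer matters here because Beam discovers the gs handler through ServiceLoader entries in META-INF/services/org.apache.beam.sdk.io.FileSystemRegistrar; when several jars are shaded into one without that transformer, only one of those descriptor files survives, the GCS registrar entry is lost, and FileSystems.getFileSystemInternal can no longer resolve the gs scheme. As a rough check (just a sketch, assuming the Beam core SDK and the beam-sdks-java-extensions-google-cloud-platform-core module are on the runtime classpath; RegistrarCheck is an illustrative name), you can list the registrars that are actually visible:

import java.util.ServiceLoader;

import org.apache.beam.sdk.io.FileSystemRegistrar;

public class RegistrarCheck {
    public static void main(String[] args) {
        // Prints every FileSystemRegistrar the ServiceLoader can see; if the GCS
        // registrar is missing, any gs:// path fails with
        // "No filesystem found for scheme gs".
        for (FileSystemRegistrar registrar : ServiceLoader.load(FileSystemRegistrar.class)) {
            System.out.println(registrar.getClass().getName());
        }
    }
}

With a correctly shaded jar the output should include org.apache.beam.sdk.extensions.gcp.storage.GcsFileSystemRegistrar.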

Answer 1 (score: 0)

I recently ran into this problem while working on an Apache Beam Java pipeline built with Gradle.

Applying the Gradle Shadow plugin 'com.github.johnrengelman.shadow' fixed it.

Pasting my build.gradle file here for future reference:

buildscript {
    repositories {
        maven {
           url "https://plugins.gradle.org/m2/"
        }
        jcenter()
    }
    dependencies {
        classpath 'com.github.jengelman.gradle.plugins:shadow:5.1.0'
    }
}


plugins {
    id 'java'
    id 'com.github.johnrengelman.shadow' version '5.1.0'
}


sourceCompatibility = 1.8


apply plugin: 'java'
apply plugin: 'com.github.johnrengelman.shadow'

repositories {
    mavenLocal()
    mavenCentral()
    jcenter()
    ivy {
        url 'http://dl.bintray.com/content/johnrengelman/gradle-plugins'
    }
}

dependencies {
// your dependencies here
}

jar {
    manifest {
        attributes "Main-Class": "your_main_class_wth_package"
    }

    from {
        configurations.compile.collect { it.isDirectory() ? it : zipTree(it) }
    }
}

You should see the shadowJar task under the 'shadow' group in IntelliJ's Gradle tasks. Enjoy!
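
One caveat if the error still shows up with this setup: unlike Maven's ServicesResourceTransformer, the Shadow plugin only merges META-INF/services descriptors when mergeServiceFiles() is added to the shadowJar task configuration, and Beam registers its file systems (including gs) through exactly those descriptors. After that, running the shadowJar task and launching the resulting -all.jar from build/libs with java -jar should be enough to confirm that gs:// paths resolve.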