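Google Dataflow: connecting to Datastore with DatastoreOptions gives an error

Time: 2018-08-02 12:30:26

Tags: google-cloud-datastore google-cloud-dataflow

We are trying to connect to the Datastore service from a Dataflow job written in Java, but we are running into an error raised by the Datastore SDK.

We run the job on a local machine from Eclipse, using the DirectRunner.

Code: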

import java.net.SocketTimeoutException;

import org.apache.beam.sdk.options.PipelineOptions;

import com.google.cloud.datastore.Datastore;
import com.google.cloud.datastore.DatastoreOptions;

public class StarterPipeline {

    public interface StarterPipelineOption extends PipelineOptions {

    }

    @SuppressWarnings("serial")
    public static void main(String[] args) throws SocketTimeoutException {

        // This call triggers the exception shown below.
        Datastore datastore = DatastoreOptions.getDefaultInstance().getService();

    }
}

Error:

Exception in thread "main" java.lang.IllegalArgumentException: java.net.URISyntaxException: Illegal character in path at index 45: https://datastore.googleapis.com/v1/projects/<!DOCTYPE html>
at com.google.datastore.v1.client.DatastoreFactory.validateUrl(DatastoreFactory.java:122)
at com.google.datastore.v1.client.DatastoreFactory.buildProjectEndpoint(DatastoreFactory.java:108)
at com.google.datastore.v1.client.DatastoreFactory.newRemoteRpc(DatastoreFactory.java:115)
at com.google.datastore.v1.client.DatastoreFactory.create(DatastoreFactory.java:65)
at com.google.cloud.datastore.spi.v1.HttpDatastoreRpc.<init>(HttpDatastoreRpc.java:71)
at com.google.cloud.datastore.DatastoreOptions$DefaultDatastoreRpcFactory.create(DatastoreOptions.java:61)
at com.google.cloud.datastore.DatastoreOptions$DefaultDatastoreRpcFactory.create(DatastoreOptions.java:55)
at com.google.cloud.ServiceOptions.getRpc(ServiceOptions.java:512)
at com.google.cloud.datastore.DatastoreOptions.getDatastoreRpcV1(DatastoreOptions.java:179)
at com.google.cloud.datastore.DatastoreImpl.<init>(DatastoreImpl.java:56)
at com.google.cloud.datastore.DatastoreOptions$DefaultDatastoreFactory.create(DatastoreOptions.java:51)
at com.google.cloud.datastore.DatastoreOptions$DefaultDatastoreFactory.create(DatastoreOptions.java:45)
at com.google.cloud.ServiceOptions.getService(ServiceOptions.java:499)
at purplle.datapipeline.StarterPipeline.main(StarterPipeline.java:234)
Caused by: java.net.URISyntaxException: Illegal character in path at index 45: https://datastore.googleapis.com/v1/projects/<!DOCTYPE html>
at java.net.URI$Parser.fail(Unknown Source)
at java.net.URI$Parser.checkChars(Unknown Source)
at java.net.URI$Parser.parseHierarchical(Unknown Source)
at java.net.URI$Parser.parse(Unknown Source)
at java.net.URI.<init>(Unknown Source)
at com.google.datastore.v1.client.DatastoreFactory.validateUrl(DatastoreFactory.java:120)
... 13 more
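
The message itself hints at what went wrong: the text after `/projects/` is the start of an HTML page rather than a project ID, so the client apparently resolved the default project ID to an HTML response. A minimal sketch using only `java.net.URI` reproduces the same parse failure. The class name `ProjectEndpointCheck` and the sample ID `my-project` are made up for illustration, and this is not the library's own validation code.

```java
import java.net.URI;
import java.net.URISyntaxException;

final class ProjectEndpointCheck {

    // Base URL copied from the error message above.
    static final String BASE = "https://datastore.googleapis.com/v1/projects/";

    // Returns -1 if BASE + projectId parses as a URI, otherwise the index
    // of the first offending character reported by java.net.URI.
    static int firstIllegalIndex(String projectId) {
        try {
            new URI(BASE + projectId);
            return -1;
        } catch (URISyntaxException e) {
            return e.getIndex();
        }
    }

    public static void main(String[] args) {
        // "my-project" is a hypothetical, well-formed project ID.
        System.out.println(firstIllegalIndex("my-project"));      // prints -1
        // The value that ended up in the URL, per the stack trace.
        System.out.println(firstIllegalIndex("<!DOCTYPE html>")); // prints 45
    }
}
```

Index 45 is exactly the length of the base URL, so the very first character of the "project ID" is the `<` of the HTML document.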

We are using the SDK versions below, which I believe are the latest.

<!-- https://mvnrepository.com/artifact/com.google.cloud/google-cloud-storage -->
<dependency>
    <groupId>com.google.cloud</groupId>
    <artifactId>google-cloud-storage</artifactId>
    <version>1.37.1</version>
</dependency>

<!-- https://mvnrepository.com/artifact/com.google.cloud/google-cloud-datastore -->
<dependency>
    <groupId>com.google.cloud</groupId>
    <artifactId>google-cloud-datastore</artifactId>
    <version>1.37.1</version>
</dependency>

<dependency>
    <groupId>com.google.cloud.dataflow</groupId>
    <artifactId>google-cloud-dataflow-java-sdk-all</artifactId>
    <version>2.5.0</version>
</dependency>

While searching Google for a solution, we found the thread below, which says the issue was fixed in February, but I am still facing it.

https://github.com/GoogleCloudPlatform/google-cloud-java/issues/2440

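1 Answer:

Answer 0: (score: 0)

I contacted Google Cloud support. They said that to run the pipeline locally during development, we have to provide the project ID manually through an environment variable.

Below is the reply from the Google support team.

The error you are receiving comes from [1], which is thrown when "no server or credentials were provided". In your case, you did not specify credentials when constructing the client:

Datastore datastore = DatastoreOptions.getDefaultInstance().getService();

The client library tries to find credentials through the environment variable GOOGLE_APPLICATION_CREDENTIALS, but fails because the job is not running on Compute Engine, Kubernetes Engine, App Engine, or Cloud Functions [3][4 - Running in Compute/App Engine]. I believe your code should work with the DataflowRunner. Please confirm whether this is the case.

To run your code locally, you can manually create and obtain service account credentials [5]. For a detailed example, check [4][6].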


So I downloaded the service account credentials and used them when connecting to the Datastore API, and it works!
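
For anyone hitting the same thing, a small pre-flight check along these lines can flag the missing settings before the pipeline starts. This is only a sketch: the class `LocalDatastoreEnvCheck` is hypothetical, and the project ID variable is assumed to be `GOOGLE_CLOUD_PROJECT`, since the support reply above does not name it. `GOOGLE_APPLICATION_CREDENTIALS` is the variable named in the reply and should point to the downloaded service account key file.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

final class LocalDatastoreEnvCheck {

    // Assumed name of the project ID variable (not stated in the support reply).
    static final String PROJECT_VAR = "GOOGLE_CLOUD_PROJECT";
    // Variable named in the support reply; holds the path to the key file.
    static final String CREDENTIALS_VAR = "GOOGLE_APPLICATION_CREDENTIALS";

    // Returns the names of the variables that are unset or blank.
    static List<String> missing(Map<String, String> env) {
        List<String> result = new ArrayList<>();
        for (String name : new String[] {PROJECT_VAR, CREDENTIALS_VAR}) {
            String value = env.get(name);
            if (value == null || value.trim().isEmpty()) {
                result.add(name);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        List<String> missing = missing(System.getenv());
        if (missing.isEmpty()) {
            System.out.println("Both variables are set for the local run.");
        } else {
            System.out.println("Set these before running with the DirectRunner: " + missing);
        }
    }
}
```

In Eclipse, both variables can be set on the Environment tab of the run configuration. Setting the project ID and credentials explicitly on `DatastoreOptions.newBuilder()` instead of relying on the environment should also work.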