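We are trying to connect to the Datastore service from a Dataflow job written in Java, but we are running into a problem caused by a Datastore SDK error.
We are running the job on a local machine from Eclipse, using the DirectRunner.
Code: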
import java.net.SocketTimeoutException;

import org.apache.beam.sdk.options.PipelineOptions;

import com.google.cloud.datastore.Datastore;
import com.google.cloud.datastore.DatastoreOptions;

public class StarterPipeline {

    public interface StarterPipelineOption extends PipelineOptions {
    }

    @SuppressWarnings("serial")
    public static void main(String[] args) throws SocketTimeoutException {
        // Builds the client from default options: no project ID or credentials are passed explicitly.
        Datastore datastore = DatastoreOptions.getDefaultInstance().getService();
    }
}
Error:
Exception in thread "main" java.lang.IllegalArgumentException: java.net.URISyntaxException: Illegal character in path at index 45: https://datastore.googleapis.com/v1/projects/<!DOCTYPE html>
at com.google.datastore.v1.client.DatastoreFactory.validateUrl(DatastoreFactory.java:122)
at com.google.datastore.v1.client.DatastoreFactory.buildProjectEndpoint(DatastoreFactory.java:108)
at com.google.datastore.v1.client.DatastoreFactory.newRemoteRpc(DatastoreFactory.java:115)
at com.google.datastore.v1.client.DatastoreFactory.create(DatastoreFactory.java:65)
at com.google.cloud.datastore.spi.v1.HttpDatastoreRpc.<init>(HttpDatastoreRpc.java:71)
at com.google.cloud.datastore.DatastoreOptions$DefaultDatastoreRpcFactory.create(DatastoreOptions.java:61)
at com.google.cloud.datastore.DatastoreOptions$DefaultDatastoreRpcFactory.create(DatastoreOptions.java:55)
at com.google.cloud.ServiceOptions.getRpc(ServiceOptions.java:512)
at com.google.cloud.datastore.DatastoreOptions.getDatastoreRpcV1(DatastoreOptions.java:179)
at com.google.cloud.datastore.DatastoreImpl.<init>(DatastoreImpl.java:56)
at com.google.cloud.datastore.DatastoreOptions$DefaultDatastoreFactory.create(DatastoreOptions.java:51)
at com.google.cloud.datastore.DatastoreOptions$DefaultDatastoreFactory.create(DatastoreOptions.java:45)
at com.google.cloud.ServiceOptions.getService(ServiceOptions.java:499)
at purplle.datapipeline.StarterPipeline.main(StarterPipeline.java:234)
Caused by: java.net.URISyntaxException: Illegal character in path at index 45: https://datastore.googleapis.com/v1/projects/<!DOCTYPE html>
at java.net.URI$Parser.fail(Unknown Source)
at java.net.URI$Parser.checkChars(Unknown Source)
at java.net.URI$Parser.parseHierarchical(Unknown Source)
at java.net.URI$Parser.parse(Unknown Source)
at java.net.URI.<init>(Unknown Source)
at com.google.datastore.v1.client.DatastoreFactory.validateUrl(DatastoreFactory.java:120)
... 13 more
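The index in the message can be reproduced with the JDK alone. The sketch below is not the SDK's code. It assumes only what the trace shows: the project ID is appended to `https://datastore.googleapis.com/v1/projects/` and the result is parsed as a `java.net.URI`. The class and method names are hypothetical.

```java
import java.net.URI;
import java.net.URISyntaxException;

// Hypothetical helper: mimics the URL validation seen in the stack trace.
final class ProjectEndpointCheck {
    // Base endpoint copied from the error message.
    static final String BASE = "https://datastore.googleapis.com/v1/projects/";

    /** Returns the index of the first illegal character, or -1 if the URL parses. */
    static int firstIllegalIndex(String projectId) {
        try {
            new URI(BASE + projectId);
            return -1;
        } catch (URISyntaxException e) {
            return e.getIndex();
        }
    }

    public static void main(String[] args) {
        // A plain project ID (placeholder value) parses without error.
        System.out.println(firstIllegalIndex("my-project"));
        // The value from the error message fails at the first character after the base URL.
        System.out.println(firstIllegalIndex("<!DOCTYPE html>"));
    }
}
```

The base URL is 45 characters long, so index 45 is the first character of the project ID. In other words, the project ID that the client resolved was an HTML page rather than a real ID.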
The SDK versions we are using are listed below; I believe they are the latest.
<!-- https://mvnrepository.com/artifact/com.google.cloud/google-cloud-storage -->
<dependency>
<groupId>com.google.cloud</groupId>
<artifactId>google-cloud-storage</artifactId>
<version>1.37.1</version>
</dependency>
<!-- https://mvnrepository.com/artifact/com.google.cloud/google-cloud-datastore -->
<dependency>
<groupId>com.google.cloud</groupId>
<artifactId>google-cloud-datastore</artifactId>
<version>1.37.1</version>
</dependency>
<dependency>
<groupId>com.google.cloud.dataflow</groupId>
<artifactId>google-cloud-dataflow-java-sdk-all</artifactId>
<version>2.5.0</version>
</dependency>
While searching Google for a solution, we found the thread below, which says the issue was fixed in February, but I am still facing it.
https://github.com/GoogleCloudPlatform/google-cloud-java/issues/2440
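Answer 0: (score: 0)
We contacted Google Cloud Support, and they said that to run the pipeline locally during development we have to provide the project ID manually through an environment variable.
Below is the reply from the Google support team.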
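The error you received comes from [1], which is thrown when "no server or credentials were provided". In your case, you did not specify credentials when constructing the client:
Datastore datastore = DatastoreOptions.getDefaultInstance().getService();
The client library tried to find the credentials through the environment variable GOOGLE_APPLICATION_CREDENTIALS, and failed because the job is not running on Compute Engine, Kubernetes Engine, App Engine, or Cloud Functions (see [3] and [4]: running in Compute / App Engine). I believe your code should work with the DataflowRunner. Please confirm whether that is the case.
To run the code locally, you can create and obtain service account credentials manually [5]. For detailed examples, see [4] or [6].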
So I downloaded the service account credentials and used them when connecting to the Datastore API, and it works!
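A small preflight check can catch this before the pipeline starts. The sketch below uses only the JDK and makes assumptions the answer does not spell out: that the project ID is supplied through `GOOGLE_CLOUD_PROJECT` (the support reply does not name the variable), and that `GOOGLE_APPLICATION_CREDENTIALS` holds the path to the downloaded service account key file. The class and method names are hypothetical.

```java
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Hypothetical preflight check for local (DirectRunner) runs.
final class LocalCredentialsPreflight {

    /** Returns a list of problems found in the given environment; empty if none. */
    static List<String> findProblems(Map<String, String> env) {
        List<String> problems = new ArrayList<>();

        // Assumed variable name for the project ID.
        String project = env.get("GOOGLE_CLOUD_PROJECT");
        if (project == null || project.trim().isEmpty()) {
            problems.add("GOOGLE_CLOUD_PROJECT is not set");
        }

        // Path to the service account key file.
        String keyFile = env.get("GOOGLE_APPLICATION_CREDENTIALS");
        if (keyFile == null || keyFile.trim().isEmpty()) {
            problems.add("GOOGLE_APPLICATION_CREDENTIALS is not set");
        } else if (!Files.isReadable(Paths.get(keyFile))) {
            problems.add("GOOGLE_APPLICATION_CREDENTIALS is not a readable file: " + keyFile);
        }
        return problems;
    }

    public static void main(String[] args) {
        List<String> problems = findProblems(System.getenv());
        if (problems.isEmpty()) {
            System.out.println("Environment looks ready for a local run.");
        } else {
            for (String problem : problems) {
                System.out.println("Problem: " + problem);
            }
        }
    }
}
```

In Eclipse, both variables can be set on the Environment tab of the run configuration. An alternative to environment variables is to pass the values in code through `DatastoreOptions.newBuilder()` with `setProjectId(...)` and `setCredentials(...)`.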