我正在服务从Kafka消费并写入BigQuery。所述服务将在Google Container Engine上的Kubernetes中运行。
在大多数情况下,此过程有效,但在尝试使用Google进行身份验证时,大约20%的已启动容器会失败。错误讯息:
java.net.UnknownHostException: accounts.google.com
at java.net.AbstractPlainSocketImpl.connect(AbstractPlainSocketImpl.java:184)
at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:392)
at java.net.Socket.connect(Socket.java:589)
at sun.security.ssl.SSLSocketImpl.connect(SSLSocketImpl.java:668)
at sun.net.NetworkClient.doConnect(NetworkClient.java:175)
at sun.net.www.http.HttpClient.openServer(HttpClient.java:432)
at sun.net.www.http.HttpClient.openServer(HttpClient.java:527)
at sun.net.www.protocol.https.HttpsClient.<init>(HttpsClient.java:264)
at sun.net.www.protocol.https.HttpsClient.New(HttpsClient.java:367)
at sun.net.www.protocol.https.AbstractDelegateHttpsURLConnection.getNewHttpClient(AbstractDelegateHttpsURLConnection.java:191)
at sun.net.www.protocol.http.HttpURLConnection.plainConnect0(HttpURLConnection.java:1138)
at sun.net.www.protocol.http.HttpURLConnection.plainConnect(HttpURLConnection.java:1032)
at sun.net.www.protocol.https.AbstractDelegateHttpsURLConnection.connect(AbstractDelegateHttpsURLConnection.java:177)
at sun.net.www.protocol.http.HttpURLConnection.getOutputStream0(HttpURLConnection.java:1316)
at sun.net.www.protocol.http.HttpURLConnection.getOutputStream(HttpURLConnection.java:1291)
at sun.net.www.protocol.https.HttpsURLConnectionImpl.getOutputStream(HttpsURLConnectionImpl.java:250)
at com.google.api.client.http.javanet.NetHttpRequest.execute(NetHttpRequest.java:77)
at com.google.api.client.http.HttpRequest.execute(HttpRequest.java:981)
at com.google.api.client.auth.oauth2.TokenRequest.executeUnparsed(TokenRequest.java:283)
at com.google.api.client.auth.oauth2.TokenRequest.execute(TokenRequest.java:307)
at com.google.api.client.googleapis.auth.oauth2.GoogleCredential.executeRefreshToken(GoogleCredential.java:384)
at com.google.api.client.auth.oauth2.Credential.refreshToken(Credential.java:489)
at com.google.api.client.auth.oauth2.Credential.intercept(Credential.java:217)
at com.google.api.client.http.HttpRequest.execute(HttpRequest.java:868)
at com.google.api.client.googleapis.services.AbstractGoogleClientRequest.executeUnparsed(AbstractGoogleClientRequest.java:419)
at com.google.api.client.googleapis.services.AbstractGoogleClientRequest.executeUnparsed(AbstractGoogleClientRequest.java:352)
at com.google.api.client.googleapis.services.AbstractGoogleClientRequest.execute(AbstractGoogleClientRequest.java:469)
我试图暂停几秒钟并重试(循环播放3-4x),但这似乎没有任何区别。
我不认为这是我的容器的问题,因为70-80%的容器正常工作。只有这几个似乎无法正确初始化或进入一些不可恢复的状态。
有什么事要做吗?从内部'失败'容器的一些方法,所以kubernetes将重新启动它?
要“失败”kubernetes容器(至少在Java中)只是从进程中返回非零值。对于Java System.exit(2);
。
问题似乎与个别kubernetes服务器主机有关,而不是与necc容器有关。我发现this引用似乎与它有关但我无法弄清楚如何在这些系统上重新启动docker进程。
答案 0 :(得分:1)
我在本地主机上也遇到过这样的错误,并且通常再次运行gcloud container clusters get-credentials
或重新启动修复问题。
我仍然没有弄清楚根本原因是什么,但也许您可以尝试重新启动节点。
失败的pod,它们都在同一个节点上运行吗?