We are trying to connect our application to Stackdriver Profiler, but it fails because of a permissions problem.
We are running a Java application on GKE.
Here is the Dockerfile:
FROM gcr.io/google-appengine/jetty
RUN mkdir -p /opt/cprof && \
wget -q -O- https://storage.googleapis.com/cloud-profiler/java/latest/profiler_java_agent.tar.gz \
| tar xzv -C /opt/cprof
RUN java \
-agentpath:/opt/cprof/profiler_java_agent.so=-cprof_service=gke,-logtostderr,-minloglevel=0,-cprof_service_version=1.0.0,-cprof_gce_metadata_server_retry_sleep_sec=10,-cprof_gce_metadata_server_retry_count=12 \
-jar "$JETTY_HOME/start.jar" --create-startd --add-to-start=gcloud,http2c --approve-all-licenses
ENV JETTY_ARGS -Djava.util.logging.config.file=WEB-INF/flex.logging.properties
ENV DBG_ENABLE true
ADD . $APP_DESTINATION_EXPLODED_WAR
ENV JAVA_USER_OPTS -XX:-OmitStackTraceInFastThrow -XX:+PrintGCDetails -XX:+PrintGCDateStamps -XX:+UseStringDeduplication -XX:+PrintStringDeduplicationStatistics -Xloggc:/tmp/logs.gc
The cluster was created with the following command:
gcloud beta container clusters create $cluster_name --machine-type=n1-highmem-8 --project=$project_id --zone=us-central1-c --scopes="cloud-platform" --num-nodes=2
We configured the Dockerfile following the steps in profiling-java, but the profiler fails with the following message:
Failed to create profile, will retry: 7 (The caller does not have permission)
The full deployment log is below:
12:53:42 starting build "18a3a03c-6dc8-4683-8fae-92ec22aa84a8"
12:53:42
12:53:42 FETCHSOURCE
12:53:42 Fetching storage object: gs://tradeos-test1_cloudbuild/source/1562838709.77-78c301e261714cd2a46391b235d5edc5.tgz#1562838738548166
12:53:42 Copying gs://tradeos-test1_cloudbuild/source/1562838709.77-78c301e261714cd2a46391b235d5edc5.tgz#1562838738548166...
12:53:42 / [0 files][ 0.0 B/239.8 MiB]
- [0 files][ 56.0 MiB/239.8 MiB]
| [0 files][165.0 MiB/239.8 MiB]
/ [1 files][239.8 MiB/239.8 MiB]
12:53:42 Operation completed over 1 objects/239.8 MiB.
12:53:42 BUILD
12:53:42 Already have image (with digest): gcr.io/cloud-builders/docker
12:53:42 Sending build context to Docker daemon 396.6MB
12:53:42 Step 1/8 : FROM gcr.io/google-appengine/jetty
12:53:42 latest: Pulling from google-appengine/jetty
12:53:42 Digest: sha256:7e37b8561b2f25660d1aa492dea4f09a6121fe7f8b7f6b2e9f8c65e1cf33328e
12:53:42 Status: Downloaded newer image for gcr.io/google-appengine/jetty:latest
12:53:42 ---> dac4353b3a0c
12:53:42 Step 2/8 : RUN mkdir -p /opt/cprof && wget -q -O- https://storage.googleapis.com/cloud-profiler/java/latest/profiler_java_agent.tar.gz | tar xzv -C /opt/cprof
12:53:42 ---> Running in 3799f9af9c33
12:53:42 NOTICES
12:53:42 profiler_java_agent.so
12:53:42 Removing intermediate container 3799f9af9c33
12:53:42 ---> 8cd480829757
12:53:42 Step 3/8 : RUN java -agentpath:/opt/cprof/profiler_java_agent.so=-cprof_service=gke,-logtostderr,-minloglevel=0,-cprof_service_version=1.0.0,-cprof_gce_metadata_server_retry_sleep_sec=10,-cprof_gce_metadata_server_retry_count=12 -jar "$JETTY_HOME/start.jar" --create-startd --add-to-start=gcloud,http2c --approve-all-licenses
12:53:42 ---> Running in bad2457cd65c
12:53:42 I0711 09:52:57.100730 7 entry.cc:268] Profiler agent loaded
12:53:42 I0711 09:52:57.105134 7 entry.cc:154] Prepare JVMTI
12:53:42 I0711 09:52:57.301863 7 entry.cc:108] On VM init
12:53:42 I0711 09:52:57.304172 7 cloud_env.cc:136] Project ID is not set via flag or environment, will get from the metadata server
12:53:42 I0711 09:52:57.304981 7 throttler_api.cc:269] Will use profiler service cloudprofiler.googleapis.com to create and upload profiles
12:53:42 I0711 09:52:57.315691 15 throttler_api.cc:202] Initialized deployment: project_id=tradeos-test1, service=gke, service_version=1.0.0, zone_name=us-central1-f
12:53:42 I0711 09:52:57.320428 15 throttler_api.cc:302] Creating a new profile via profiler service
12:53:42 W0711 09:52:57.539041 15 throttler_api.cc:382] Failed to create profile, will retry: 7 (The caller does not have permission)
12:53:42 INFO : All Licenses Approved via Command Line Option
12:53:42 INFO : gcloud initialized in ${jetty.base}/start.d/gcloud.ini
12:53:42 INFO : http2c initialized in ${jetty.base}/start.d/http2c.ini
12:53:42 INFO : Base directory was modified
12:53:42 I0711 09:52:58.506006 7 entry.cc:143] On VM death
12:53:42 I0711 09:53:25.504830 15 throttler_api.cc:302] Creating a new profile via profiler service
12:53:42 I0711 09:53:25.505025 15 worker.cc:177] Exiting the profiling loop
12:53:42 Removing intermediate container bad2457cd65c
12:53:42 ---> e9a969345ed1
12:53:42 Step 4/8 : ENV JETTY_ARGS -Djava.util.logging.config.file=WEB-INF/flex.logging.properties
12:53:42 ---> Running in d3e4b6456694
12:53:42 Removing intermediate container d3e4b6456694
12:53:42 ---> 52deb5efe19d
12:53:42 Step 5/8 : ENV DBG_ENABLE true
12:53:42 ---> Running in 6bd2b94ae0fc
12:53:42 Removing intermediate container 6bd2b94ae0fc
12:53:42 ---> 9800d1e2d074
12:53:42 Step 6/8 : ADD . $APP_DESTINATION_EXPLODED_WAR
12:53:42 ---> e1d7881f4558
12:53:42 Step 7/8 : ENV JAVA_USER_OPTS -XX:-OmitStackTraceInFastThrow -XX:+PrintGCDetails -XX:+PrintGCDateStamps -XX:+UseStringDeduplication -XX:+PrintStringDeduplicationStatistics -Xloggc:/tmp/logs.gc
12:53:42 ---> Running in 81247ccde74f
12:53:42 Removing intermediate container 81247ccde74f
12:53:42 ---> d66dbce64869
12:53:42 Step 8/8 : ENV GCLOUD_PROJECT tradeos-test1
12:54:07 ---> Running in af00f02783b7
12:54:07 Removing intermediate container af00f02783b7
12:54:07 ---> 77826e33ef3c
12:54:07 Successfully built 77826e33ef3c
12:54:07 Successfully tagged gcr.io/tradeos-test1/ram-image:1
12:54:07 PUSH
12:54:07 Pushing gcr.io/tradeos-test1/ram-image:1
12:54:07 The push refers to repository [gcr.io/tradeos-test1/ram-image]
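The service account we use for deployment has roles/cloudprofiler.agent, but it still fails. Any idea which permission we are missing?
Update:
The GKE nodes use the default Compute Engine service account. I added roles/cloudprofiler.agent to it, but the same error persists.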
Answer 0 (score: 1)
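I ran into the same problem. In my case, this particular GKE service had its own service account with permission to access a BigQuery dataset. Its key was mounted in the container at /var/secrets/google/key.json and referenced by the GOOGLE_APPLICATION_CREDENTIALS environment variable, so that the application's BigQuery library could find it at startup. The problem may be that the profiler agent also appears to check the path set in this variable at startup and, if it finds a key there, uses it for all requests to cloudprofiler.googleapis.com.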
Some related information can be found here: https://cloud.google.com/profiler/docs/profiling-external#using_service_accounts
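To see which identity a key file referenced by GOOGLE_APPLICATION_CREDENTIALS belongs to, the file itself can be inspected. This is only a minimal sketch: it assumes the key is a standard service account JSON key with a client_email field, and the function name credential_identity is hypothetical.

```shell
# Print the service account e-mail stored in the key file that
# GOOGLE_APPLICATION_CREDENTIALS points to (hypothetical helper).
credential_identity() {
  key_file="${GOOGLE_APPLICATION_CREDENTIALS:-}"
  if [ -z "$key_file" ]; then
    # No key file configured in this environment.
    echo "no key file set"
    return 0
  fi
  if [ ! -r "$key_file" ]; then
    echo "key file not readable: $key_file" >&2
    return 1
  fi
  # Assumption: the key is JSON containing a "client_email" field.
  sed -n 's/.*"client_email"[[:space:]]*:[[:space:]]*"\([^"]*\)".*/\1/p' "$key_file"
}
```

Running credential_identity inside the container (for example through kubectl exec) prints the account whose permissions would then matter for the profiler calls, if the agent behaves as described above.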
One solution may be to grant that service account the additional permission it needs to access the Profiler service.
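Granting the role named in the question to such a service account could look like the sketch below. The gcloud projects add-iam-policy-binding command and roles/cloudprofiler.agent exist; the wrapper grant_profiler_agent, the DRY_RUN switch, and the example project and e-mail are hypothetical. By default the wrapper only prints the command so that it can be reviewed before anything changes.

```shell
# Build (and optionally run) the IAM binding command for the profiler agent role.
#   $1: project ID
#   $2: service account e-mail
# DRY_RUN defaults to 1, which prints the command instead of running it.
grant_profiler_agent() {
  project_id="$1"
  sa_email="$2"
  # Reuse the positional parameters to hold the command safely.
  set -- gcloud projects add-iam-policy-binding "$project_id" \
    --member="serviceAccount:${sa_email}" \
    --role="roles/cloudprofiler.agent"
  if [ "${DRY_RUN:-1}" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

# Hypothetical usage (prints the command only):
#   grant_profiler_agent example-project bq-reader@example-project.iam.gserviceaccount.com
```

Setting DRY_RUN=0 makes the wrapper execute the command, which requires permission to modify the project's IAM policy.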
If that is not the case, check the permissions of the default service account attached to the GKE nodes. For details, see https://cloud.google.com/kubernetes-engine/docs/how-to/hardening-your-cluster
Each GKE node has an IAM Service Account associated with it. By default, nodes are given the Compute Engine default service account, which you can find by navigating to the IAM section of the Cloud Console.
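To find out which account the nodes run as, the cluster's node configuration can be queried. The sketch below rests on two assumptions: that gcloud container clusters describe reports the field nodeConfig.serviceAccount with the literal value default when the Compute Engine default service account is in use, and that this account's e-mail has the form PROJECT_NUMBER-compute@developer.gserviceaccount.com. The helper name resolve_node_sa and the project number in the usage comment are hypothetical.

```shell
# Map the reported node service account value to an e-mail address.
#   $1: value of nodeConfig.serviceAccount ("default" or an e-mail)
#   $2: numeric project number (not the project ID)
resolve_node_sa() {
  if [ "$1" = "default" ]; then
    # Assumed naming scheme of the Compute Engine default service account.
    echo "$2-compute@developer.gserviceaccount.com"
  else
    echo "$1"
  fi
}

# One way to obtain the first argument (not executed here):
#   gcloud container clusters describe "$cluster_name" --zone=us-central1-c \
#     --format="value(nodeConfig.serviceAccount)"
# Hypothetical usage:
#   resolve_node_sa default 123456789012
```

The resulting e-mail is the member to look for in the IAM section of the Cloud Console when checking whether roles/cloudprofiler.agent is bound.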
Hope this helps someone.