How to retrieve the current worker count in GCP Dataflow using the API

Time: 2019-01-23 16:22:30

Tags: google-cloud-platform google-cloud-dataflow google-apis-explorer

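Does anyone know whether it is possible to get the current worker count for an active job running in GCP Dataflow?

I have not managed to do it with the API provided by Google.

The one thing I was able to get is CurrentVcpuCount, but that is not what I need.

Thanks!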

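2 answers:

Answer 0: (score: 2)

The current number of workers in a Dataflow job is shown in the message logs, under autoscaling. For example, I ran a quick job, and when I displayed the job logs in Cloud Shell I got the following messages: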

INFO:root:2019-01-28T16:42:33.173Z: JOB_MESSAGE_DETAILED: Autoscaling: Raised the number of workers to 0 based on the rate of progress in the currently running step(s).
INFO:root:2019-01-28T16:43:02.166Z: JOB_MESSAGE_DETAILED: Autoscaling: Raised the number of workers to 1 based on the rate of progress in the currently running step(s).
INFO:root:2019-01-28T16:43:05.385Z: JOB_MESSAGE_DETAILED: Workers have started successfully.
INFO:root:2019-01-28T16:43:05.433Z: JOB_MESSAGE_DETAILED: Workers have started successfully.
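
As a rough illustration, the worker count can be pulled out of log lines like these with a regular expression. This is only a sketch: the helper name `last_worker_count` is hypothetical, and it assumes every autoscaling message uses the "number of workers to N" wording seen in the sample above.

```python
import re

# Matches the autoscaling wording seen in the sample log lines above.
# Assumption: other autoscaling messages use the same "number of workers to N" phrasing.
WORKERS_RE = re.compile(r"Autoscaling: .*?number of workers to (\d+)")


def last_worker_count(log_lines):
    """Return the worker count from the most recent autoscaling line, or None."""
    count = None
    for line in log_lines:
        match = WORKERS_RE.search(line)
        if match:
            count = int(match.group(1))
    return count


sample = [
    "INFO:root:2019-01-28T16:42:33.173Z: JOB_MESSAGE_DETAILED: Autoscaling: Raised the number of workers to 0 based on the rate of progress in the currently running step(s).",
    "INFO:root:2019-01-28T16:43:02.166Z: JOB_MESSAGE_DETAILED: Autoscaling: Raised the number of workers to 1 based on the rate of progress in the currently running step(s).",
    "INFO:root:2019-01-28T16:43:05.385Z: JOB_MESSAGE_DETAILED: Workers have started successfully.",
]
print(last_worker_count(sample))
```

Scraping log text is fragile, so this is better suited to a quick check than to anything long-lived.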

Now, you can query these messages with the projects.jobs.messages.list method of the Dataflow API, setting the minimumImportance parameter to JOB_MESSAGE_BASIC.
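
For anyone calling the REST endpoint directly rather than through a client library, the request URL for this method can be sketched as below. The helper name `messages_list_url` is hypothetical and the project and job IDs are placeholders; the request would also need an OAuth 2.0 access token in its Authorization header, which is not shown here.

```python
from urllib.parse import quote, urlencode

# Base URL of the Dataflow REST API, version v1b3.
BASE_URL = "https://dataflow.googleapis.com/v1b3"


def messages_list_url(project_id, job_id, minimum_importance="JOB_MESSAGE_BASIC"):
    """Build the GET URL for projects.jobs.messages.list."""
    path = "/projects/{}/jobs/{}/messages".format(
        quote(project_id, safe=""), quote(job_id, safe=""))
    query = urlencode({"minimumImportance": minimum_importance})
    return BASE_URL + path + "?" + query


# MY-PROJECT-ID and DATAFLOW-JOB-ID are placeholders.
print(messages_list_url("MY-PROJECT-ID", "DATAFLOW-JOB-ID"))
```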

You will receive a response similar to the following:

...
"autoscalingEvents": [
    {...} //other events
    {

      "currentNumWorkers": "1",
      "eventType": "CURRENT_NUM_WORKERS_CHANGED",
      "description": {
          "messageText": "(fcfef6769cff802b): Worker pool started.",
          "messageKey": "POOL_STARTUP_COMPLETED"
      },
      "time": "2019-01-28T16:43:02.130129051Z",
      "workerPool": "Regular"
    },

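To take this further, you can write a Python script that parses the response and reads only the currentNumWorkers parameter from the last element of the autoscalingEvents list, which gives you the most recent (and therefore current) number of workers in the job.

Note that if this parameter is not present, the number of workers is zero.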
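
The parsing step described above could look roughly like this. It is a sketch that works on an already-decoded response dict; the function name `current_num_workers` is hypothetical, and the sample data is modelled on the response fragment above, with the first event invented for illustration.

```python
def current_num_workers(response):
    """Return the worker count from the last autoscaling event, or 0."""
    events = response.get("autoscalingEvents", [])
    if not events:
        return 0
    # The API returns the count as a string; a missing field is treated as zero.
    return int(events[-1].get("currentNumWorkers", 0))


# Sample shaped like the response fragment above (first event is made up).
sample_response = {
    "autoscalingEvents": [
        {"eventType": "CURRENT_NUM_WORKERS_CHANGED", "workerPool": "Regular"},
        {
            "currentNumWorkers": "1",
            "eventType": "CURRENT_NUM_WORKERS_CHANGED",
            "time": "2019-01-28T16:43:02.130129051Z",
            "workerPool": "Regular",
        },
    ]
}
print(current_num_workers(sample_response))
```

Returning an int rather than the raw string makes the value easier to compare or feed into monitoring code.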

EDIT

I wrote a quick Python script that retrieves the current number of workers from the message logs, using the API mentioned above:

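from google.oauth2 import service_account
import googleapiclient.discovery

# Authenticate with a service account key and build the Dataflow API client.
credentials = service_account.Credentials.from_service_account_file(
    filename='PATH-TO-SERVICE-ACCOUNT-KEY/key.json',
    scopes=['https://www.googleapis.com/auth/cloud-platform'])
service = googleapiclient.discovery.build(
    'dataflow', 'v1b3', credentials=credentials)

project_id = "MY-PROJECT-ID"
job_id = "DATAFLOW-JOB-ID"

# Fetch the job's messages; the response also carries the autoscaling events.
messages = service.projects().jobs().messages().list(
    projectId=project_id,
    jobId=job_id
).execute()

try:
    # The last autoscaling event holds the most recent worker count.
    print("Current number of workers is " + messages['autoscalingEvents'][-1]['currentNumWorkers'])
except (KeyError, IndexError):
    # No autoscaling events, or no currentNumWorkers field: zero workers.
    print("Current number of workers is 0")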

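A few notes:

  • The scopes are the permissions required by the service account key you reference (in the from_service_account_file function) in order to make calls to the API. This line is needed to authenticate against the API. You can use any of the ones in this list; to keep things simple, I just used a service account key with project/owner permissions.

  • If you want to know more about the Python API client library, check this documentation and these samples.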
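
One caveat with the script above: it reads a single response, and for a job with many messages the results may be split across pages via the nextPageToken field. The sketch below walks all pages before picking the last event. The function name `last_autoscaling_event` and the `fetch_page` callable are hypothetical; the fake pages exist only so the example runs without credentials.

```python
def last_autoscaling_event(fetch_page):
    """Walk every page of results and return the last autoscaling event, or None.

    fetch_page is a hypothetical callable: it takes a page token (None for the
    first page) and returns one response dict from messages.list.
    """
    last_event = None
    token = None
    while True:
        page = fetch_page(token)
        events = page.get("autoscalingEvents", [])
        if events:
            last_event = events[-1]
        token = page.get("nextPageToken")
        if not token:
            return last_event


# Stand-in pages for demonstration; a real fetch_page would call the API.
fake_pages = {
    None: {"autoscalingEvents": [{"currentNumWorkers": "1"}], "nextPageToken": "p2"},
    "p2": {"autoscalingEvents": [{"currentNumWorkers": "3"}]},
}
event = last_autoscaling_event(lambda token: fake_pages[token])
print(event["currentNumWorkers"])
```

With the client from the script, a real fetch_page would presumably call messages().list with the pageToken parameter set to the token it receives and then execute() the request.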

Answer 1: (score: -2)
