Question

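For the past week, I've been seeing the number of instances on my GAE Flexible Environment drop to 0, with no new instances spinning up. My understanding of the Flexible Environment is that this shouldn't be possible... (https://cloud.google.com/appengine/docs/the-appengine-environments)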

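I'd like to know whether anyone else has seen this, or has solved it before. One hypothesis of mine is that this could be a problem with my health-monitoring endpoint, but when I review the code, nothing has jumped out at me.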
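One way to test the health-endpoint hypothesis is to poll the endpoint directly and record the status code it returns. The following is a minimal sketch using only the standard library; the function name `probe_health` is hypothetical, and the URL in the usage note assumes the monitor is reachable on port 8080, as in monitor.py below.

```python
import urllib.error
import urllib.request


def probe_health(url, timeout=5.0):
    """Return the HTTP status code of a health endpoint, or None if unreachable."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as err:
        # Non-2xx responses (such as the 503s a failing check returns) land here.
        return err.code
    except (urllib.error.URLError, OSError):
        # Connection refused, DNS failure, timeout, and similar.
        return None
```

Calling `probe_health('http://localhost:8080/_ah/health')` on the instance (or against a locally running monitor) and logging the result over time would show whether the check flips to 503 or stops responding before the instances disappear.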

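Until last week this had not been a problem for me, and now it seems I have to redeploy my environment (with no changes) every two days or so to "reset" the instances. It's worth noting that I have two services under the same App Engine project, both running the Flexible version. But I only seem to have this issue with one of the services (which I call the worker service).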

Screenshot from the App Engine UI:

Screenshot from the Logs UI, showing the SIGTERM being sent:

PS: this may be related to the recent Google Compute issues that have cropped up... https://news.ycombinator.com/item?id=18436187

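Edit: adding the yaml file for the "worker" service. Note that I'm using Honcho to add an endpoint that monitors the health of the worker service via Flask. I've added those code samples as well.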

yaml file

service: worker
runtime: python
threadsafe: yes
env: flex
entrypoint: honcho start -f /app/procfile worker monitor

runtime_config:
  python_version: 3

resources:
  cpu: 1
  memory_gb: 4
  disk_size_gb: 10

automatic_scaling:
  min_num_instances: 1
  max_num_instances: 20
  cool_down_period_sec: 120
  cpu_utilization:
    target_utilization: 0.7

Procfile for Honcho

default: gunicorn -b :$PORT main:app
worker: python tasks.py
monitor: python monitor.py /tmp/psq.pid
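The Procfile passes /tmp/psq.pid to the monitor, which implies that tasks.py writes its own PID to that path at startup. tasks.py is not shown here, so the following is only a hypothetical sketch of what that step could look like; the helper name `write_pid_file` is an assumption, not code from the actual worker.

```python
import atexit
import os


def write_pid_file(path):
    """Write the current process ID to `path` and remove the file on normal exit."""
    with open(path, "w") as pidfile:
        pidfile.write(str(os.getpid()))

    def _cleanup():
        # Best-effort removal; the file may already be gone.
        try:
            os.remove(path)
        except FileNotFoundError:
            pass

    # atexit handlers run on normal interpreter exit only. They do not run
    # when the process is killed by a signal Python does not handle, so a
    # stale PID file can be left behind after an unhandled SIGTERM.
    atexit.register(_cleanup)
```

In this sketch the worker would call `write_pid_file('/tmp/psq.pid')` once, before it starts processing tasks.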

monitor.py

import os
import sys

from flask import Flask


# The app checks this file for the PID of the process to monitor.
PID_FILE = None


# Create app to handle health checks and monitor the queue worker. This will
# run alongside the worker, see procfile.
monitor_app = Flask(__name__)


@monitor_app.route('/_ah/health')
def health():
    """
    The health check reads the PID file created by tasks.py main and checks the proc
    filesystem to see if the worker is running.
    """
    if not os.path.exists(PID_FILE):
        return 'Worker pid not found', 503

    with open(PID_FILE, 'r') as pidfile:
        # Strip whitespace so a trailing newline does not break the /proc lookup.
        pid = pidfile.read().strip()

    if not os.path.exists('/proc/{}'.format(pid)):
        return 'Worker not running', 503

    return 'healthy', 200


@monitor_app.route('/')
def index():
    return health()


if __name__ == '__main__':
    PID_FILE = sys.argv[1]
    monitor_app.run('0.0.0.0', 8080)
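When reviewing a check like this, it can help to pull the PID logic out into a plain function that can be tested without Flask. The sketch below is one possible variant, not the code deployed above: the name `worker_status` and the `proc_root` parameter are hypothetical, and the extra "invalid" branch is an added assumption. It guards against one edge case in particular: an empty PID file makes the lookup path resolve to /proc/ itself, which always exists, so the worker would be reported as healthy.

```python
import os


def worker_status(pid_file, proc_root="/proc"):
    """Return (message, http_status) for the worker whose PID is stored in pid_file."""
    try:
        with open(pid_file, "r") as pidfile:
            pid = pidfile.read().strip()
    except FileNotFoundError:
        return "Worker pid not found", 503

    # An empty or non-numeric PID would otherwise resolve to proc_root itself
    # (or an unrelated path) and could be reported as healthy.
    if not pid.isdigit():
        return "Worker pid invalid", 503

    if not os.path.exists(os.path.join(proc_root, pid)):
        return "Worker not running", 503

    return "healthy", 200
```

The Flask view could then simply return `worker_status(PID_FILE)`, and `proc_root` can be pointed at a temporary directory in tests.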

Google App Engine Flexible Environment with 0 instances

0 answers: