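I have a Django web app served by Apache2 with mod_wsgi in Docker containers, running on a Kubernetes cluster in Google Cloud Platform and protected by Identity-Aware Proxy. Everything works fine, but I want to send GCP Stackdriver traces for all requests without writing one for each view in my project. I found middleware that handles this using OpenCensus. I went through this documentation and was able to manually generate traces that were exported to Stackdriver Trace in my project by specifying the StackdriverExporter and passing my project's Google Cloud Platform Project Number as the project_id parameter.
Now, to make this automatic for all requests, I followed the instructions to set up the middleware. In settings.py I added the module to INSTALLED_APPS and MIDDLEWARE, and set up the OPENCENSUS_TRACE dictionary of options. I also added OPENCENSUS_TRACE_PARAMS. This works well with the default exporter, 'opencensus.trace.exporters.print_exporter.PrintExporter': I can see the trace and span information, including the trace ID and all details, in my Apache2 web server logs. However, I want to send these to Stackdriver Trace for analysis.
I tried setting the EXPORTER parameter to opencensus.trace.exporters.stackdriver_exporter.StackdriverExporter, which works when run manually from the shell as long as you supply the project number.
When it is set to use StackdriverExporter, the web page does not respond, the health check starts failing, and eventually the page returns a 502 error telling me to try again in 30 seconds (I believe Identity-Aware Proxy generates this error once it detects the failed health check). The server itself reports no errors, and nothing appears in the Apache2 access or error logs.
There is another dictionary in settings.py named OPENCENSUS_TRACE_PARAMS, which I assume is what determines the project number the exporter uses. The example sets GCP_EXPORTER_PROJECT to None and SERVICE_NAME to 'my_service'.
What options do I need to set to make the exporter send to Stackdriver instead of printing to the logs? Do you have any idea how I can set this up?
settings.py: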
MIDDLEWARE = (
    ...
    'opencensus.trace.ext.django.middleware.OpencensusMiddleware',
)
INSTALLED_APPS = (
    ...
    'opencensus.trace.ext.django',
)
OPENCENSUS_TRACE = {
    'SAMPLER': 'opencensus.trace.samplers.probability.ProbabilitySampler',
    'EXPORTER': 'opencensus.trace.exporters.stackdriver_exporter.StackdriverExporter',  # This one just makes the server hang with no response or error and kills the health check.
    'PROPAGATOR': 'opencensus.trace.propagation.google_cloud_format.GoogleCloudFormatPropagator',
    # 'EXPORTER': 'opencensus.trace.exporters.print_exporter.PrintExporter',  # This one works to print the Trace and Span with IDs and details in the logs.
}
OPENCENSUS_TRACE_PARAMS = {
    'BLACKLIST_PATHS': ['/health'],
    'GCP_EXPORTER_PROJECT': 'my_project_number',  # Should this be None like the example, or Project ID, or Project Number?
    'SAMPLING_RATE': 0.5,
    'SERVICE_NAME': 'my_service',  # Not sure if this is my app name or some other service name.
    'ZIPKIN_EXPORTER_HOST_NAME': 'localhost',  # Are the following even necessary, or are they causing a failure that is not detected by Apache2?
    'ZIPKIN_EXPORTER_PORT': 9411,
    'ZIPKIN_EXPORTER_PROTOCOL': 'http',
    'JAEGER_EXPORTER_HOST_NAME': None,
    'JAEGER_EXPORTER_PORT': None,
    'JAEGER_EXPORTER_AGENT_HOST_NAME': 'localhost',
    'JAEGER_EXPORTER_AGENT_PORT': 6831
}
Here is an example of the Apache2 log when set to use PrintExporter (I formatted it for readability):
[Fri Feb 08 09:00:32.427575 2019]
[wsgi:error]
[pid 1097:tid 139801302882048]
[client 10.48.0.1:43988]
[SpanData(
    name='services.views.my_view',
    context=SpanContext(
        trace_id=e882f23e49e34fc09df621867d753532,
        span_id=None,
        trace_options=TraceOptions(enabled=True),
        tracestate=None
    ),
    span_id='bcbe7b96906a482a',
    parent_span_id=None,
    attributes={
        'http.status_code': '200',
        'http.method': 'GET',
        'http.url': '/',
        'django.user.name': ''
    },
    start_time='2019-02-08T17:00:29.845733Z',
    end_time='2019-02-08T17:00:32.427455Z',
    child_span_count=0,
    stack_trace=None,
    time_events=[],
    links=[],
    status=None,
    same_process_as_parent_span=None,
    span_kind=1
)]
Thanks in advance for any tips, assistance, or troubleshooting advice!
Edit 2019-02-08 6:56 UTC:
I found this in the middleware:
# Initialize the exporter
transport = convert_to_import(settings.params.get(TRANSPORT))
if self._exporter.__name__ == 'GoogleCloudExporter':
    _project_id = settings.params.get(GCP_EXPORTER_PROJECT, None)
    self.exporter = self._exporter(
        project_id=_project_id,
        transport=transport)
elif self._exporter.__name__ == 'ZipkinExporter':
    _service_name = self._get_service_name(settings.params)
    _zipkin_host_name = settings.params.get(
        ZIPKIN_EXPORTER_HOST_NAME, 'localhost')
    _zipkin_port = settings.params.get(
        ZIPKIN_EXPORTER_PORT, 9411)
    _zipkin_protocol = settings.params.get(
        ZIPKIN_EXPORTER_PROTOCOL, 'http')
    self.exporter = self._exporter(
        service_name=_service_name,
        host_name=_zipkin_host_name,
        port=_zipkin_port,
        protocol=_zipkin_protocol,
        transport=transport)
elif self._exporter.__name__ == 'TraceExporter':
    _service_name = self._get_service_name(settings.params)
    _endpoint = settings.params.get(
        OCAGENT_TRACE_EXPORTER_ENDPOINT, None)
    self.exporter = self._exporter(
        service_name=_service_name,
        endpoint=_endpoint,
        transport=transport)
elif self._exporter.__name__ == 'JaegerExporter':
    _service_name = self._get_service_name(settings.params)
    self.exporter = self._exporter(
        service_name=_service_name,
        transport=transport)
else:
    self.exporter = self._exporter(transport=transport)
The exporter is now named StackdriverExporter instead of GoogleCloudExporter, so the first branch above never matches it. I created a class in my app named GoogleCloudExporter that inherits from StackdriverExporter and updated settings.py to use GoogleCloudExporter, but it didn't seem to work. I wonder whether other code still references the old naming scheme, possibly for the transport. I'm searching the source code for clues... At the very least, this tells me I can drop the ZIPKIN and JAEGER options, since which of them are read is determined by the EXPORTER parameter.
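To make the naming issue concrete, the name check from the middleware snippet above can be reproduced in isolation. This is a minimal sketch with stand-in classes, not the real OpenCensus exporters; `build_exporter` is a hypothetical function that only mimics the first and last branches of the dispatch.

```python
# Stand-in classes for illustration only; NOT the real OpenCensus exporters.
class StackdriverExporter:
    def __init__(self, project_id=None, transport=None):
        self.project_id = project_id
        self.transport = transport


class GoogleCloudExporter(StackdriverExporter):
    """Same behaviour, but with the class name the middleware checks for."""


def build_exporter(exporter_cls, params, transport=None):
    """Hypothetical helper mimicking the dispatch on the class name."""
    if exporter_cls.__name__ == 'GoogleCloudExporter':
        return exporter_cls(
            project_id=params.get('GCP_EXPORTER_PROJECT'),
            transport=transport)
    # Any other class name falls through and never receives the project.
    return exporter_cls(transport=transport)


params = {'GCP_EXPORTER_PROJECT': 'my_project_number'}
plain = build_exporter(StackdriverExporter, params)
renamed = build_exporter(GoogleCloudExporter, params)
print(plain.project_id)    # None
print(renamed.project_id)  # my_project_number
```

Under these assumptions, a class named StackdriverExporter is constructed without the project from GCP_EXPORTER_PROJECT, while the renamed subclass receives it. Whether a missing project is what makes the real exporter hang is not something this sketch can show.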
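Edit 2019-02-08 11:58 UTC:
I scrapped Apache2 to isolate the problem and set my Docker image to use Django's built-in web server instead, and it works! When I go to the site, it writes a trace to Stackdriver Trace for each request, and the span name is the module and method being executed.
Somehow Apache2 is not allowed to send these, even though I can send them from the shell when running as root. I've added the Apache2 and mod-wsgi tags to the question because I have a funny feeling this is related to forking child processes in Apache2 and mod_wsgi. Is it that Apache2's child processes are sandboxed and cannot create child processes of their own, or is this a permissions issue? It seems strange, because as far as I know it only calls Python modules, not any external OS binaries. Any other ideas would be greatly appreciated!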
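One way to test the fork and permissions guesses is to print, from inside a view, which process and user the code runs as and which threads are alive, then compare the output under Apache2/mod_wsgi with the output under the built-in server. The sketch below uses only the standard library; `describe_process` is a hypothetical helper name, and it only reports facts, it does not fix anything.

```python
import os
import threading


def describe_process():
    """Collect basic facts about the current process for a log line."""
    info = {
        'pid': os.getpid(),
        'ppid': os.getppid(),
        'threads': sorted(t.name for t in threading.enumerate()),
    }
    # os.getuid() is only available on Unix-like systems, so guard it.
    if hasattr(os, 'getuid'):
        info['uid'] = os.getuid()
    return info


print(describe_process())
```

Since the PrintExporter output ended up in the Apache2 log as wsgi:error entries, a plain print from a view would presumably land there too. A uid other than root, or a missing background thread, would each point in a different direction.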
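Answer 0 (score: 0)
I ran into this problem when using gunicorn with gevent as the worker class. To fix it and get Cloud Trace working, the solution was to monkey patch grpc like this: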
from gevent import monkey
monkey.patch_all()
import grpc.experimental.gevent as grpc_gevent
grpc_gevent.init_gevent()
See https://github.com/grpc/grpc/issues/4629#issuecomment-376962677