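Machine: CentOS 7.2 or Ubuntu 14.04 / 16.xx
Telegraf version: 1.0.1
Python version: 2.7.5
Telegraf supports an INPUT plugin named exec. First, see example 2 in the README doc. I cannot use the JSON format, because it only consumes numeric metric values. According to the docs: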
If using JSON, only numeric values are parsed and turned into floats. Booleans and strings will be ignored.
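To illustrate what that restriction means for EBS volume data, here is a minimal Python sketch. It is not Telegraf's parser: the helper name numeric_fields is hypothetical, and it only handles a flat JSON object. It simply mimics the documented rule that numbers become floats while strings and booleans are dropped.

```python
import json


def numeric_fields(json_text):
    """Keep only the numeric values of a flat JSON object, as floats.

    Hypothetical helper that mimics the documented rule; it is not
    Telegraf's actual JSON parser.
    """
    data = json.loads(json_text)
    return {
        key: float(value)
        for key, value in data.items()
        # bool is a subclass of int in Python, so exclude it explicitly.
        if isinstance(value, (int, float)) and not isinstance(value, bool)
    }


sample = '{"size": 8, "iops": 100, "state": "in-use", "encrypted": false}'
# Only size and iops survive; state and encrypted are ignored.
print(sorted(numeric_fields(sample).items()))
```

Under this rule most of the volume attributes (state, volume type, instance ID, tags) would be lost, which is why the Influx data format is used instead.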
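So the idea is simple: in the exec plugin section you specify a script that emits some meaningful information (in JSON or, in my case, the Influx data format, because some of my metrics have non-numeric values) that you want to capture and display in a cool dashboard, for example a Wavefront Dashboard.
Basically, one can use these metrics, their tags, and the sources they come from to find out all kinds of information about memory, CPU, disk, network, and other meaningful things, and to create alerts from that information when something unwanted happens.
OK, I came up with this Python script: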
#!/usr/bin/python
# sudo pip install boto3 if you don't have it on your machine.
import boto3


def generate(key, value):
    """
    Creates a nicely formatted Key(Value) item for output
    """
    return '{}="{}"'.format(key, value)
    #return '{}={}'.format(key, value)


def main():
    ec2 = boto3.resource('ec2', region_name="us-west-2")
    volumes = ec2.volumes.all()
    for vol in volumes:
        # You don't need to wrap everything in `str` unless it is not a string
        # By default most things will come back as a string
        # unless they are very obviously not (complex, date time, etc)
        # but since we are printing these (and formatting them into strings)
        # the cast to string will be implicit and we don't need to make it
        # explicit
        # vol is already a fully returned volume you are essentially DOUBLING
        # your API calls when you do this
        #iv = ec2.Volume(vol.id)
        output_parts = [
            # Volume level details
            generate('create_time', vol.create_time),
            generate('availability_zone', vol.availability_zone),
            generate('volume_id', vol.volume_id),
            generate('volume_type', vol.volume_type),
            generate('state', vol.state),
            generate('size', vol.size),
            generate('iops', vol.iops),
            generate('encrypted', vol.encrypted),
            generate('snapshot_id', vol.snapshot_id),
            generate('kms_key_id', vol.kms_key_id),
        ]
        for _ in vol.attachments:
            # Will get any attachments and since it is a list
            # we should write this to handle MULTIPLE attachments
            output_parts.extend([
                generate('InstanceId', _.get('InstanceId')),
                generate('InstanceVolumeState', _.get('State')),
                generate('DeleteOnTermination', _.get('DeleteOnTermination')),
                generate('Device', _.get('Device')),
            ])
        # only process when there are tags to process
        if vol.tags:
            for _ in vol.tags:
                # Get all of the tags
                output_parts.extend([
                    generate(_.get('Key'), _.get('Value')),
                ])
        # output everything at once..
        print(','.join(output_parts))


if __name__ == '__main__':
    main()
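This script talks to AWS EC2 EBS volumes and outputs every value it can find (generally what you see in the AWS EC2 EBS volumes console), formatting that information as meaningful CSV, which I redirect to a .csv log file. We don't want to run the Python script all the time (AWS API limits / cost factor).
So, once the .csv file is created, I wrote this small shell script, which I set in the commands section of Telegraf's exec plugin.
The shell script /tmp/aws-vol-info.sh set in Telegraf's exec plugin is: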
#!/bin/bash
cat /tmp/aws-vol-info.csv
The Telegraf configuration file created for the exec plugin (/etc/telegraf/telegraf.d/exec-plugin-aws-info.conf):
#--- https://github.com/influxdata/telegraf/tree/master/plugins/inputs/exec
[[inputs.exec]]
commands = ["/tmp/aws-vol-info.sh"]
## Timeout for each command to complete.
timeout = "5s"
# Data format to consume.
# NOTE json only reads numerical measurements, strings and booleans are ignored.
data_format = "influx"
name_suffix = "_telegraf_execplugin"
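I tweaked the .py file (the generate function of the Python script) to produce the following three output formats (.csv file), and wanted to test how telegraf would handle this data before enabling the config file (/etc/telegraf/telegraf.d/catch-aws-ebs-info.conf) and restarting telegraf.
Format 1: (every value wrapped in double quotes ")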
create_time="2017-01-09 23:24:29.428000+00:00",availability_zone="us-east-2b",volume_id="vol-058e1d47dgh721121",volume_type="gp2",state="in-use",size="8",iops="100",encrypted="False",snapshot_id="snap-06h1h1b91bh662avn",kms_key_id="None",InstanceId="i-0jjb1boop26f42f50",InstanceVolumeState="attached",DeleteOnTermination="True",Device="/dev/sda1",Name="[company-2b-app90] secondary",hostname="company-2b-app90-i-0jjb1boop26f42f50",high_availability="1",mirror="secondary",cluster="company",autoscale="true",role="app"
Testing the telegraf configuration against the telegraf config directory gives me the following error.
Command: $ telegraf --config-directory=/etc/telegraf --test --input-filter=exec
[vagrant@myvagrant ~] $ telegraf --config-directory=/etc/telegraf --test --input-filter=exec
2017/03/10 00:37:48 I! Using config file: /etc/telegraf/telegraf.conf
* Plugin: inputs.exec, Collection 1
2017-03-10T00:37:48Z E! Errors encountered: [ metric parsing error, reason: [invalid field format], buffer: [create_time="2017-01-09 23:24:29.428000+00:00",availability_zone="us-east-2b",volume_id="vol-058e1d47dgh721121",volume_type="gp2",state="in-use",size="8",iops="100",encrypted="False",snapshot_id="snap-06h1h1b91bh662avn",kms_key_id="None",InstanceId="i-0jjb1boop26f42f50",InstanceVolumeState="attached",DeleteOnTermination="True",Device="/dev/sda1",Name="[company-2b-app90] secondary",hostname="company-2b-app90-i-0jjb1boop26f42f50",high_availability="1",mirror="secondary",cluster="company",autoscale="true",role="app"], index: [372]]
[vagrant@myvagrant ~] $
Format 2: (without any " double quotes)
create_time=2017-01-09 23:24:29.428000+00:00,availability_zone=us-east-2b,volume_id=vol-058e1d47dgh721121,volume_type=gp2,state=in-use,size=8,iops=100,encrypted=False,snapshot_id=snap-06h1h1b91bh662avn,kms_key_id=None,InstanceId=i-0jjb1boop26f42f50,InstanceVolumeState=attached,DeleteOnTermination=True,Device=/dev/sda1,Name=[company-2b-app90] secondary,hostname=company-2b-app90-i-0jjb1boop26f42f50,high_availability=1,mirror=secondary,cluster=company,autoscale=true,role=app
Testing Telegraf's exec plugin configuration again fails with a similar parsing error:
2017/03/10 00:45:01 I! Using config file: /etc/telegraf/telegraf.conf
* Plugin: inputs.exec, Collection 1
2017-03-10T00:45:01Z E! Errors encountered: [ metric parsing error, reason: [invalid value], buffer: [create_time=2017-01-09 23:24:29.428000+00:00,availability_zone=us-east-2b,volume_id=vol-058e1d47dgh721121,volume_type=gp2,state=in-use,size=8,iops=100,encrypted=False,snapshot_id=snap-06h1h1b91bh662avn,kms_key_id=None,InstanceId=i-0jjb1boop26f42f50,InstanceVolumeState=attached,DeleteOnTermination=True,Device=/dev/sda1,Name=[company-2b-app90] secondary,hostname=company-2b-app90-i-0jjb1boop26f42f50,high_availability=1,mirror=secondary,cluster=company,autoscale=true,role=app], index: [63]]
Format 3: (this format has no " double quotes and no space characters in the values; spaces are replaced with the _ character)
create_time=2017-01-09_23:24:29.428000+00:00,availability_zone=us-east-2b,volume_id=vol-058e1d47dgh721121,volume_type=gp2,state=in-use,size=8,iops=100,encrypted=False,snapshot_id=snap-06h1h1b91bh662avn,kms_key_id=None,InstanceId=i-0jjb1boop26f42f50,InstanceVolumeState=attached,DeleteOnTermination=True,Device=/dev/sda1,Name=[company-2b-app90]_secondary,hostname=company-2b-app90-i-0jjb1boop26f42f50,high_availability=1,mirror=secondary,cluster=company,autoscale=true,role=app
Still not working; I get a similar parsing error:
[vagrant@myvagrant ~] $ telegraf --config-directory=/etc/telegraf --test --input-filter=exec
2017/03/10 00:50:30 I! Using config file: /etc/telegraf/telegraf.conf
* Plugin: inputs.exec, Collection 1
2017-03-10T00:50:30Z E! Errors encountered: [ metric parsing error, reason: [missing fields], buffer: [create_time=2017-01-09_23:24:29.428000+00:00,availability_zone=us-east-2b,volume_id=vol-058e1d47dgh721121,volume_type=gp2,state=in-use,size=8,iops=100,encrypted=False,snapshot_id=snap-06h1h1b91bh662avn,kms_key_id=None,InstanceId=i-0jjb1boop26f42f50,InstanceVolumeState=attached,DeleteOnTermination=True,Device=/dev/sda1,Name=[company-2b-app90]_secondary,hostname=company-2b-app90-i-0jjb1boop26f42f50,high_availability=1,mirror=secondary,cluster=company,autoscale=true,role=app], index: [476]]
Format 4: if I follow the Influx line protocol as described on this page: https://docs.influxdata.com/influxdb/v1.2/write_protocols/line_protocol_tutorial/
awsebs,Name=[company-2b-app90]_secondary,hostname=company-2b-app90-i-0jjb1boop26f42f50,high_availability=1,mirror=secondary,cluster=company,autoscale=true,role=app create_time=2017-01-09_23:24:29.428000+00:00,availability_zone=us-east-2b,volume_id=vol-058e1d47dgh721121,volume_type=gp2,state=in-use,size=8,iops=100,encrypted=False,snapshot_id=snap-06h1h1b91bh662avn,kms_key_id=None,InstanceId=i-0jjb1boop26f42f50,InstanceVolumeState=attached,DeleteOnTermination=True,Device=/dev/sda1
I get this error:
[vagrant@myvagrant ~] $ telegraf --config-directory=/etc/telegraf --test --input-filter=exec
2017/03/10 02:34:30 I! Using config file: /etc/telegraf/telegraf.conf
* Plugin: inputs.exec, Collection 1
2017-03-10T02:34:30Z E! Errors encountered: [ invalid number]
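How can I get rid of this error and get telegraf working with the exec plugin (running the .sh script)?
Additional information:
The Python script will run once or twice a day (via cron), and telegraf will run every 1 minute (running the exec plugin, which runs the .sh script, which cats the .csv file so that telegraf can consume it in the Influx data format).
Answer 0 (score: 3)
It looks like the rules are very strict; I should have looked more closely.
The output of any program you use must match the INFLUX LINE PROTOCOL format shown below, along with all of the RULES that come with it.
For example: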
weather,location=us-midwest temperature=82 1465839830100400200
  |    -------------------- --------------  |
  |             |             |             |
  |             |             |             |
+-----------+--------+-+---------+-+---------+
|measurement|,tag_set| |field_set| |timestamp|
+-----------+--------+-+---------+-+---------+
You can read more about what measurements, tags, fields, and the optional timestamp are here: https://docs.influxdata.com/influxdb/v1.2/write_protocols/line_protocol_tutorial/
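To make the layout in the diagram concrete, here is a minimal Python sketch that assembles one line from its four parts. The helper name build_line is hypothetical, and it does no escaping or quoting (the key=value strings must already be formatted); it only shows where the comma and the two spaces go.

```python
def build_line(measurement, tag_set, field_set, timestamp=None):
    """Join the parts of one line-protocol line (hypothetical helper).

    tag_set and field_set are lists of already formatted 'key=value'
    strings; no escaping or quoting is applied here.
    """
    if not field_set:
        raise ValueError("at least one field is required per line")
    head = measurement
    if tag_set:
        # A comma, and no space, separates the measurement from the tags.
        head += "," + ",".join(tag_set)
    # A single space separates the tag set from the field set.
    line = head + " " + ",".join(field_set)
    if timestamp is not None:
        # The optional timestamp follows after another space, unquoted.
        line += " " + str(timestamp)
    return line


print(build_line("weather", ["location=us-midwest"], ["temperature=82"],
                 1465839830100400200))
# prints: weather,location=us-midwest temperature=82 1465839830100400200
```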
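Important rules:
1) There must be a , and no space between the measurement and the tag set.
2) There must be a space between the tag set and the field set.
3) For tag keys, tag values, and field keys, always use the backslash character \ to escape any character that needs escaping in the measurement name, or in tag or field names and their values.
4) You cannot escape \ with \.
5) Line Protocol handles emojis with no problem :)
6) The TAG / TAG set (tags, comma separated) is OPTIONAL.
7) The FIELD / FIELD set (fields, comma separated): at least one is required per line.
8) The TIMESTAMP (the last value shown in the format) is OPTIONAL.
9) The VERY IMPORTANT quoting rules are as follows:
a) NEVER double or single quote the timestamp. It's not valid line protocol. '123123131312313' or "1231313213131" won't work, even if the number itself is valid.
b) NEVER single quote field values (even if they are strings!). It's also not valid line protocol, i.e. fieldname='giga' won't work.
c) Do NOT double or single quote measurement names, tag keys, tag values, or field keys. NOTE: this does say tag values!!! So be careful.
d) Do NOT double quote field values that are only floats, integers, or booleans; otherwise InfluxDB will assume that those values are strings.
e) DO double quote field values that are strings.
f) And the MOST IMPORTANT one (it will save you from going BALD): if a FIELD value is set without double quotes, i.e. you think it is an integer or a float in one line (for example, anyone would say that about the fields size or iops), and in some other line of the file that telegraf reads/parses with the exec plugin the same field is set to a non-numeric value (i.e. a string), then you will get the error message Errors encountered: [ invalid number.
So to fix it, the RULE is: if any possible FIELD value for a FIELD key is a string, then you must wrap that field in " in every line. It doesn't matter that in some lines it has a value like 1, 200, or 1.5 (for example, iops can be 1 or 5) while in some other lines the value is a string (iops can be None).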
Error message: Errors encountered: [ invalid number
[vagrant@myvagrant ~] $ telegraf --config-directory=/etc/telegraf --test --input-filter=exec
2017/03/10 11:13:18 I! Using config file: /etc/telegraf/telegraf.conf
* Plugin: inputs.exec, Collection 1
2017-03-10T11:13:18Z E! Errors encountered: [ invalid number metric parsing error, reason: [invalid field format], buffer: [awsebsvol,host=myvagrant ], index: [25]]
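Quoting rules d), e), and f) can be sketched in Python as follows. This is only an illustration under stated assumptions: the helper names format_field_value, string_keys, and format_rows are hypothetical, each row is a plain dict of field key to value, and booleans are treated as strings for simplicity. The point is that a field key is quoted in every line as soon as any line holds a non-numeric value for it.

```python
def format_field_value(value, force_string=False):
    """Format one field value: numbers unquoted, everything else quoted."""
    is_number = isinstance(value, (int, float)) and not isinstance(value, bool)
    if is_number and not force_string:
        return str(value)
    # String field values are double quoted; embedded " must be escaped.
    return '"{}"'.format(str(value).replace('"', '\\"'))


def string_keys(rows):
    """Return the field keys that hold a non-numeric value in any row."""
    keys = set()
    for row in rows:
        for key, value in row.items():
            if isinstance(value, bool) or not isinstance(value, (int, float)):
                keys.add(key)
    return keys


def format_rows(rows):
    """Format every row, quoting a key in all rows if any row needs it."""
    forced = string_keys(rows)
    lines = []
    for row in rows:
        parts = ["{}={}".format(key, format_field_value(row[key], key in forced))
                 for key in sorted(row)]
        lines.append(",".join(parts))
    return lines


# iops is None for one volume, so it is quoted everywhere; size stays numeric.
rows = [{"size": 8, "iops": 100}, {"size": 20, "iops": None}]
for line in format_rows(rows):
    print(line)
# prints:
# iops="100",size=8
# iops="None",size=20
```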
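So, after all this learning, it's clear to me that what I was missing in the first place was the Influx Line Protocol format, and also the RULES!!
Now, the output generated by my Python script should look like this (per the INFLUX LINE PROTOCOL). You can simply change the .sh file and use sed "s/^/awsec2ebs,/" or sed "s/^/awsec2ebs,sourcehost=$(hostname) /" (note the space before the closing sed / character), and then you can put " around any key=value pair. I did change the .py file so that it does not use " for the size and iops fields.
Anyway, if the output is like this: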
awsec2ebs,volume_id=vol-058e1d47dgh721121 create_time="2017-01-09 23:24:29.428000+00:00",availability_zone="us-east-2b",volume_type="gp2",state="in-use",size="8",iops="100",encrypted="False",snapshot_id="snap-06h1h1b91bh662avn",kms_key_id="None",InstanceId="i-0jjb1boop26f42f50",InstanceVolumeState="attached",DeleteOnTermination="True",Device="/dev/sda1",Name="[company-2b-app90] secondary",hostname="company-2b-app90-i-0jjb1boop26f42f50",high_availability="1",mirror="secondary",cluster="company",autoscale="true",role="app"
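In the final working solution above, I created a measurement named awsec2ebs, put a , between this measurement and the tag key volume_id, did NOT use any ' or " quotes for the tag value, and then put a single space character between the tag set and the field set (I only wanted one tag for now; otherwise you can use more tags, comma separated, following the rules).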
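The composition just described (measurement, a comma, the unquoted volume_id tag, a space, then the quoted fields) can also be sketched in Python instead of sed. This is an illustrative sketch, not the method used in this answer: the helper names escape_tag and to_line are hypothetical, and every field is written as a quoted string.

```python
def escape_tag(text):
    """Backslash-escape commas, equals signs and spaces (tags, field keys)."""
    for ch in (",", "=", " "):
        text = text.replace(ch, "\\" + ch)
    return text


def to_line(measurement, pairs, tag_keys=("volume_id",)):
    """Build one line from (key, value) pairs (hypothetical helper).

    Keys listed in tag_keys become unquoted tags; all other pairs become
    double-quoted string fields.
    """
    tags, fields = [], []
    for key, value in pairs:
        if key in tag_keys:
            tags.append("{}={}".format(escape_tag(key), escape_tag(str(value))))
        else:
            quoted = str(value).replace('"', '\\"')
            fields.append('{}="{}"'.format(escape_tag(key), quoted))
    if not fields:
        raise ValueError("at least one field is required per line")
    head = measurement + ("," + ",".join(tags) if tags else "")
    return head + " " + ",".join(fields)


pairs = [
    ("volume_id", "vol-058e1d47dgh721121"),
    ("state", "in-use"),
    ("Name", "[company-2b-app90] secondary"),
]
print(to_line("awsec2ebs", pairs))
# prints: awsec2ebs,volume_id=vol-058e1d47dgh721121 state="in-use",Name="[company-2b-app90] secondary"
```

Passing more keys in tag_keys would produce additional comma-separated tags, subject to the same escaping.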
Finally, I ran the command below, and it worked like a charm!
$ telegraf --config-directory=/etc/telegraf --test --input-filter=exec
2017/03/10 03:33:54 I! Using config file: /etc/telegraf/telegraf.conf
* Plugin: inputs.exec, Collection 1
> awsec2ebs_telegraf_execplugin,volume_id=vol-058e1d47dgh721121,host=myvagrant volume_type="gp2",iops="100",kms_key_id="None",role="app",size="8",encrypted="False",InstanceId="i-0jjb1boop26f42f50",InstanceVolumeState="attached",Name="[company-2b-app90] secondary",snapshot_id="snap-06h1h1b91bh662avn",DeleteOnTermination="True",mirror="secondary",cluster="company",autoscale="true",high_availability="1",create_time="2017-01-09 23:24:29.428000+00:00",availability_zone="us-east-2b",state="in-use",Device="/dev/sda1",hostname="company-2b-app90-i-0jjb1boop26f42f50" 1489116835000000000
[vagrant@myvagrant ~] $ echo $?
0
In the above example, size is the only field that will always be a numeric value, so we don't need to wrap it in ", but that's up to you. Recall the MOST IMPORTANT rule above and the error it produces.
So the final Python file is:
#!/usr/bin/python
#Do `sudo pip install boto3` first
import boto3


def generate(key, value, qs, qe):
    """
    Creates a nicely formatted Key(Value) item for output
    """
    return '{}={}{}{}'.format(key, qs, value, qe)


def main():
    ec2 = boto3.resource('ec2', region_name="us-west-2")
    volumes = ec2.volumes.all()
    for vol in volumes:
        # You don't need to wrap everything in `str` unless it is not a string
        # By default most things will come back as a string
        # unless they are very obviously not (complex, date time, etc)
        # but since we are printing these (and formatting them into strings)
        # the cast to string will be implicit and we don't need to make it
        # explicit
        # vol is already a fully returned volume you are essentially DOUBLING
        # your API calls when you do this
        #iv = ec2.Volume(vol.id)
        output_parts = [
            # Volume level details
            generate('volume_id', vol.volume_id, '"', '"'),
            generate('create_time', vol.create_time, '"', '"'),
            generate('availability_zone', vol.availability_zone, '"', '"'),
            generate('volume_type', vol.volume_type, '"', '"'),
            generate('state', vol.state, '"', '"'),
            generate('size', vol.size, '', ''),
            #The following vol.iops variable can be a number or None so you must wrap it with double quotes otherwise "invalid number" error will come.
            generate('iops', vol.iops, '"', '"'),
            generate('encrypted', vol.encrypted, '"', '"'),
            generate('snapshot_id', vol.snapshot_id, '"', '"'),
            generate('kms_key_id', vol.kms_key_id, '"', '"'),
        ]
        for _ in vol.attachments:
            # Will get any attachments and since it is a list
            # we should write this to handle MULTIPLE attachments
            output_parts.extend([
                generate('InstanceId', _.get('InstanceId'), '"', '"'),
                generate('InstanceVolumeState', _.get('State'), '"', '"'),
                generate('DeleteOnTermination', _.get('DeleteOnTermination'), '"', '"'),
                generate('Device', _.get('Device'), '"', '"'),
            ])
        # only process when there are tags to process
        if vol.tags:
            for _ in vol.tags:
                # Get all of the tags
                output_parts.extend([
                    generate(_.get('Key'), _.get('Value'), '"', '"'),
                ])
        # output everything at once..
        print(','.join(output_parts))


if __name__ == '__main__':
    main()
The final aws-vol-info.sh is:
#!/bin/bash
cat aws-vol-info.csv | sed "s/^/awsebsvol,host=`hostname|head -1|sed "s/[ \t][ \t]*/_/g"` /"
The final telegraf exec plugin config file (/etc/telegraf/telegraf.d/exec-plugin-aws-info.conf; any name ending in .conf will do):
#--- https://github.com/influxdata/telegraf/tree/master/plugins/inputs/exec
[[inputs.exec]]
commands = ["/some/valid/path/where/csvfileexists/aws-vol-info.sh"]
## Timeout for each command to complete.
timeout = "5s"
# Data format to consume.
# NOTE json only reads numerical measurements, strings and booleans are ignored.
data_format = "influx"
name_suffix = "_telegraf_exec"
Run it, and everything works now!