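How do I set up a 1:N workflow with filters in Logstash?

Asked: 2013-06-27 14:57:12

Tags: elasticsearch amqp logstash statsd

I am trying to set up a Logstash worker that takes all messages from one AMQP/RabbitMQ queue, filters out some of them to send to statsd, and also sends every message to Elasticsearch. The implementation below never sends any messages to Elasticsearch.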

input {
  rabbitmq {
    host => "amqp-host"
    queue => "elasticsearch"
    key => "elasticsearch"
    exchange => "elasticsearch"
    type => "all"
    durable => true
    auto_delete => false
    exclusive => false
    format => "json_event"
    debug => false
  }
}

filter {
    grep {
      add_tag => "grepped"
      match => ["@message", "Execution of .*? took .* sec"]
    }

    grok {
        tags => ["grepped"]
        add_tag => "grokked"
        pattern => "Execution of %{DATA:command_name} took %{DATA:response_time} sec"
    }

    mutate {
        tags => ["grepped", "grokked"]
        lowercase => [ "command_name" ]
        add_tag => ["mutated"]
    }
}

output {
  elasticsearch_river {
    type => "all"
    rabbitmq_host => "amqp-host"
    debug => false
    durable => true
    persistent => true
    es_host => "es-host"
    exchange => "logstash-elasticsearch"
    exchange_type => "direct"
    index => "logs-%{+YYYY.MM.dd}"
    index_type => "%{@type}"
    queue => "logstash-elasticsearch" 
  }

 statsd {
    type => "command-filter"
    tags => ["grepped", "grokked", "mutated"]
    host => "some.domain.local"
    port => 1234
    sender => ""
    namespace => ""
    timing => ["prefix.%{command_name}.suffix", "%{response_time}"]
    increment => ["prefix.%{command_name}.suffix"]
  }
}

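Is there a filter, or a way of arranging tags, so that only some messages are filtered for statsd while all messages are still forwarded to ES?

3 answers:

Answer 0 (score: 1)

The clone filter came in handy. Here is the configuration file I ended up with.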

input {
  rabbitmq {
    host => "amqp-host"
    queue => "elasticsearch"
    key => "elasticsearch"
    exchange => "elasticsearch"
    type => "all"
    durable => true
    auto_delete => false
    exclusive => false
    format => "json_event"
    debug => false
  }
}

filter {
    clone {
        exclude_tags => ["cloned"]
        clones => ["statsd", "elastic-search"]
        add_tag => ["cloned"]
    }

    grep {
      type => "statsd"
      add_tag => "grepped"
      match => ["@message", "Execution of .*Command took .* sec"]
    }

    grok {
        type => "statsd"
        tags => ["grepped"]
        add_tag => "grokked"
        pattern => "Execution of %{DATA:command_name}Command took %{DATA:response_time} sec"
    }

    mutate {
        type => "statsd"
        tags => ["grepped", "grokked"]
        lowercase => [ "command_name" ]
        add_tag => ["mutated"]
    }
}

output {
  elasticsearch_river {
    type => "all"
    rabbitmq_host => "amqp-host"
    debug => false
    durable => true
    persistent => true
    es_host => "es-host"
    exchange => "logstash-elasticsearch"
    exchange_type => "direct"
    index => "logs-%{+YYYY.MM.dd}"
    index_type => "%{@type}"
    queue => "logstash-elasticsearch" 
  }

  statsd {
    type => "statsd"
    tags => ["grepped", "grokked", "mutated"]
    host => "some.host.local"
    port => 1234
    sender => ""
    namespace => ""
    timing => ["commands.%{command_name}.responsetime", "%{response_time}"]
    increment => ["commands.%{command_name}.requests"]
  }
}

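Answer 1 (score: 0)

In your case the clone filter is not actually necessary.

I would recommend a few things:

  • Simplify your configuration until the "basics" work. Leave the optional parameters unset for now.
  • Make sure the type values match.
  • Use tags; they are lifesavers.
  • When "grepping", be consistent and always use grok or always use grep. My preference is grok.

A few other tips: use the tags set on a message to decide what to do with it. Here is a snippet from my own configuration as an example: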

filter {
  grok {
     type => "company"
     pattern => ["((status:new))"]
     add_tag => ["company-vprod-status-new", "company-vprod-status-new-%{company}", "company-vprod-status-new-%{candidate_username}"]
     tags => "company-vprod-status-change"
  }
}
output {
 elasticsearch {
    host => "127.0.0.1"
    type => "company"
    index => "logstash-syslog-%{+YYYY.MM.dd}"
 }
 statsd {
    host => "graphite.test.company.com"
    increment => ["vprod.statuses.new.all", "vprod.statuses.new.%{company}.all"]
    tags => ["company-vprod-status-new"]   
 }
}

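Also, pay close attention to your types. If a block's type attribute is set to a value that no event carries, that block will never fire. This is why I prefer tags unless there is a clear reason to use type.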
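
Applied to the configuration in the question, the advice above might look like the following sketch. It assumes the legacy Logstash syntax used throughout this thread (per-plugin tags and type rather than conditionals), keeps the question's input block unchanged, reuses the question's host names and pattern, and leaves out optional parameters. The single "grokked" tag is an illustrative choice, not something taken from the answer.

```
filter {
  # grok alone is enough here: add_tag is applied only when the pattern
  # matches, and non-matching events are not dropped (grok typically tags
  # them _grokparsefailure instead).
  grok {
    pattern => "Execution of %{DATA:command_name} took %{DATA:response_time} sec"
    add_tag => ["grokked"]
  }

  # Only touch events that grok actually parsed.
  mutate {
    tags => ["grokked"]
    lowercase => ["command_name"]
  }
}

output {
  # No type or tags restriction: every event goes to Elasticsearch.
  elasticsearch_river {
    rabbitmq_host => "amqp-host"
    es_host => "es-host"
    index => "logs-%{+YYYY.MM.dd}"
    index_type => "%{@type}"
  }

  # Restricted by tag only, so the type mismatch from the question
  # ("command-filter" vs. "all") cannot silently disable this output.
  statsd {
    tags => ["grokked"]
    host => "some.domain.local"
    port => 1234
    timing => ["prefix.%{command_name}.suffix", "%{response_time}"]
    increment => ["prefix.%{command_name}.suffix"]
  }
}
```

Because nothing in the filter chain drops events, the 1:N fan-out is decided entirely by which outputs carry a tags restriction.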

Answer 2 (score: 0)

You can also add:

drop => false

at the end of the grep section (if you are still using grep).
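
As a minimal sketch of that suggestion, the grep block from the question might become the following. It assumes the legacy grep filter, whose drop option defaults to true, so events that do not match are discarded before they reach any output; that default would explain why nothing arrived in Elasticsearch.

```
filter {
  grep {
    # Keep events that do not match instead of cancelling them,
    # so every message still reaches the Elasticsearch output.
    drop => false
    # The tag is still added only to events that match.
    add_tag => "grepped"
    match => ["@message", "Execution of .*? took .* sec"]
  }
}
```

With drop disabled, the tags restriction on the statsd output is what limits that output to the matching messages.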