Logstash: Received an event that has a different character encoding

Time: 2014-07-01 09:00:39

Tags: logstash

When using logstash I see the following error:

Received an event that has a different character encoding than you configured. {:text=>"2014-06-22T11:49:57.832631+02:00 10.17.22.37 date=2014-06-22 time=11:49:55 device_id=LM150D9L23000422 log_id=0312318759 type=statistics pri=information session_id=\\\"s617nnE2019973-s617nnE3019973\\\" client_name=\\\"[<IP address>]\\\" dst_ip=\\\"<ip address>\\\" from=\\\"machin@machin.fr\\\" to=\\\"truc@machin.fr\\\" polid=\\\"0:1:1\\\" domain=\\\"machin.fr\\\" subject=\\\"\\xF0\\xCC\\xC1\\xD4\\xC9 \\xD4\\xCF\\xCC\\xD8\\xCB\\xCF \\xDA\\xC1 \\xD0\\xD2\\xCF\\xC4\\xC1\\xD6\\xC9!\\\" mailer=\\\"mta\\\" resolved=\\\"OK\\\" direction=\\\"in\\\" virus=\\\"\\\" disposition=\\\"Quarantine\\\" classifier=\\\"FortiGuard AntiSpam\\\" message_length=\\\"1024\\\"", :expected_charset=>"UTF-8", :level=>:warn}

My logstash.conf is:

input {
    file {
        path => "/var/log/fortimail.log"
    }
}

filter {
    grok {
        # grok-parsing for logs
    }
}

output {
    elasticsearch {
        host => "10.0.10.62"
        embedded => true
        cluster => "Mastertest"
        node_name => "MasterNode"
        protocol => "http"
    }
}

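I don't know which codec I should use so that the events are read in the correct format. The problem is in the subject field.

2 Answers:

Answer 0: (score: 5)

This happens because the default charset is UTF-8 and the incoming message contains bytes that are not valid UTF-8 (the \xF0\xCC\xC1... sequence in the subject field).

To fix this, set the charset in the input section by giving the input a codec with the correct charset. For example: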

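file {
    # the file input expects an absolute path
    path => "/var/log/http/access_log"
    type => "apache_access_log"
    # decode incoming lines as ISO-8859-1 instead of the default UTF-8
    codec => plain {
        charset => "ISO-8859-1"
    }
    stat_interval => 60
}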

http://logstash.net/docs/1.3.3/codecs/plain
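
To see why the warning fires and what a charset setting changes, the subject bytes from the log line in the question can be decoded outside Logstash. This is only a minimal Python sketch: the helper name try_decode is made up for illustration, and the guess that the sender used KOI8-R is an assumption based on the byte pattern, not something the log states.

```python
# Subject bytes copied from the log line in the question.
raw = (b"\xF0\xCC\xC1\xD4\xC9 \xD4\xCF\xCC\xD8\xCB\xCF "
       b"\xDA\xC1 \xD0\xD2\xCF\xC4\xC1\xD6\xC9!")


def try_decode(data, charset):
    """Return the decoded text, or None if the bytes are invalid in charset."""
    try:
        return data.decode(charset)
    except UnicodeDecodeError:
        return None


# The bytes are not valid UTF-8, which is what Logstash is warning about.
utf8_text = try_decode(raw, "utf-8")

# ISO-8859-1 maps every byte to a character, so decoding cannot fail,
# but the result is only meaningful if the sender really used ISO-8859-1.
latin1_text = try_decode(raw, "iso-8859-1")

# Assumption: the subject was written in KOI8-R (a Cyrillic charset).
koi8_text = try_decode(raw, "koi8-r")

for name, text in [("utf-8", utf8_text),
                   ("iso-8859-1", latin1_text),
                   ("koi8-r", koi8_text)]:
    print(name, "->", repr(text))
```

If that assumption holds, ISO-8859-1 would silence the warning but store accented Latin characters in place of the original subject, whereas a Cyrillic charset would give readable text. The charset to put in the codec therefore depends on what the sending system actually emits, and the exact charset names accepted should be checked against the plain codec documentation linked above.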

Answer 1: (score: 1)

If you receive the logs from an external server, try using:

input {
   udp {
     port => yourListenPort
     type => inputType
     codec => plain {
       charset => "ISO-8859-1"
     }
   }
}

I had the same error; I used this and it worked!