Question

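I have a bunch of logs that I process with logstash. Between every entry there is an annoying line containing only a double quote, which my pattern cannot parse and which causes a grokparsefailure. What is the best way to ignore or remove it?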

My input is

2016-09-18 00:00:02,013 UTC, idf="639b26a731284b43beac8b26f829bcab", message="24308 * thread http-nio-8443-exec-4
24308 > host: localhost:8443
24308 > user-agent: curl/7.40.0
" 
2016-09-18 00:02:35,555 UTC, idf="7d65da6966ec4c26a685b04ec7bfd851", message="24309 * thread http-nio-8443-exec-1
24309 > host: example.com
24309 > user-agent: Mozilla/5.0+(compatible; UptimeRobot/2.0; http://www.uptimerobot.com/)
24309 > x-forwarded-for: 69.162.124.236
" 
2016-09-18 00:07:35,591 UTC, idf="8998b9c5f2de414182ce3143b4ec39af", message="24310 * thread http-nio-8443-exec-10
24310 > HEAD https://example.com/status

I tried adding another grok filter at the beginning and then removing the field

  grok {
    match => { "message" => "(?<onlyquote>\" )"}
    remove_field => [ "onlyquote" ]
  }

and gsub

  mutate {
    gsub => [ "message" , "\" \n" , "" ]
  }

But both approaches just produce more failures. Any ideas?

Answer 1

Use the drop filter:

if [message] =~ /^"\s*$/ {  # event contains only a double quote (plus optional trailing whitespace)
  drop {}
}

and put it before the grok filter.
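
A minimal sketch of that ordering, assuming each log line reaches the filter stage as its own event with its text in the `message` field. The grok pattern and the field names `timestamp`, `idf`, and `msg` are placeholders for illustration, not the asker's actual pattern. The `\s*` allows for the trailing space that follows the quote in the sample input:

```
filter {
  # Drop events that consist only of a double quote, optionally followed by whitespace.
  if [message] =~ /^"\s*$/ {
    drop { }
  }

  # Hypothetical pattern for the first line of each entry; substitute your own.
  grok {
    match => { "message" => '%{TIMESTAMP_ISO8601:timestamp} UTC, idf="%{DATA:idf}", message="%{GREEDYDATA:msg}' }
  }
}
```

Because the conditional runs first, quote-only events are discarded before grok sees them, so they can no longer be tagged `_grokparsefailure`. Continuation lines such as `24308 > host: localhost:8443` are a separate matter and are not handled by this sketch.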
