Need to shorten a Grok pattern

Time: 2017-11-15 18:10:28

Tags: logstash logstash-grok

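I am using Grok patterns to parse firewall logs, and the patterns work correctly in the Grok Debugger. The log data varies, so I created a pattern to match each variation. My problem is that ELK is producing multiple duplicate fields for the parsed data. I am sure there is a way to shorten my Grok patterns, but at this point I cannot figure out how, so any help would be great. Please see the examples below: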

Sample logs:

Nov 15 12:18:31 removed_ip 2017:11:15-12:18:31 sophie ulogd[23109]: id="2001" severity="info" sys="SecureNet" sub="packetfilter" name="Packet dropped" action="drop" fwrule="60001" initf="eth3" srcmac="removed_mac" dstmac="removed_mac" srcip="removed_ip" dstip="removed_ip" proto="6" length="40" tos="0x00" prec="0x00" ttl="247" srcport="58261" dstport="5315" tcpflags="SYN"
Nov 15 12:33:01 removed_ip 2017:11:15-12:33:01 sophie ulogd[23109]: id="2001" severity="info" sys="SecureNet" sub="packetfilter" name="Packet dropped" action="drop" fwrule="60003" outitf="wlan1" srcmac="removed_mac" srcip="removed_ip" dstip="removed_ip" proto="6" length="40" tos="0x00" prec="0x00" ttl="64" srcport="443" dstport="49824" tcpflags="RST" 
Nov 15 12:20:29 removed_ip 2017:11:15-12:20:29 sophie httpproxy[6835]: id="0001" severity="info" sys="SecureWeb" sub="http" name="http access" action="pass" method="GET" srcip="removed_ip" dstip="removed_ip" user="" group="" ad_domain="" statuscode="200" cached="0" profile="REF_DefaultHTTPProfile (Default Web Filter Profile)" filteraction="REF_DefaultHTTPCFFAction (Default content filter action)" size="371" request="0xd3a9ac00" url="http://removed_ip/icingaweb2/monitoring/tactical?view=compact" referer="http://removed_ip/icingaweb2/dashboard" error="" authtime="0" dnstime="218" cattime="0" avscantime="2584" fullreqtime="15393756" device="0" auth="0" ua="Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/61.0.3163.100 Safari/537.36" exceptions="" overridecategory="1" overridereputation="1" category="105" reputation="trusted" categoryname="Business" country="United States" sandbox="-" content-type="text/xml"
Nov 15 12:30:33 removed_ip 2017:11:15-12:30:33 sophie httpproxy[6835]: id="0001" severity="info" sys="SecureWeb" sub="http" name="http access" action="pass" method="CONNECT" srcip="removed_ip" dstip="removed_ip" user="" group="" ad_domain="" statuscode="200" cached="0" profile="REF_DefaultHTTPProfile (Default Web Filter Profile)" filteraction="REF_DefaultHTTPCFFAction (Default content filter action)" size="11571" request="0xd21c1800" url="https://www.google.com/" referer="" error="" authtime="0" dnstime="1" cattime="97" avscantime="0" fullreqtime="361956728" device="0" auth="0" ua="" exceptions="" category="145" reputation="neutral" categoryname="Search Engines" country="United States" application="google" app-id="182"

Sample Grok patterns:

filter { 
if [type] == "utm"{
grok {
  match => {"message" => "%{SYSLOGTIMESTAMP:timestamp} %{HOSTNAME:hostname} (?<timestamp>%{YEAR}:%{MONTHNUM}:%{MONTHDAY}-%{HOUR}:%{MINUTE}:%{SECOND}) %{HOSTNAME:logsource} %{WORD:program}\[%{INT:pid}]\: id=\"%{INT:id}\" severity=\"%{WORD:severity}\" sys=\"%{WORD:sys}\" sub=\"%{WORD:sub}\" name=\"%{DATA:name}\" action=\"%{DATA:action}\" fwrule=\"%{INT:fwrule}\" initf=\"%{NOTSPACE:initf}\" outitf=\"%{NOTSPACE:outitf}\" mark=\"%{WORD:mark}\" app=\"%{WORD:app}\" srcmac=\"%{MAC:srcmac}\" dstmac=\"%{MAC:dstmac}\" srcip=\"%{IPV4:source_ip}\" dstip=\"%{IPV4:destination_ip}\" proto=\"%{WORD:protocol}\" length=\"%{INT:length}\" tos=\"%{DATA:tos}\" prec=\"%{DATA:prec}\" ttl=\"%{INT:ttl}\" srcport=\"%{INT:srcport}\" dstport=\"%{INT:dstport}\" "}
}
grok {
  match => { "message" => "%{SYSLOGTIMESTAMP:timestamp} %{HOSTNAME:hostname} (?<timestamp>%{YEAR}:%{MONTHNUM}:%{MONTHDAY}-%{HOUR}:%{MINUTE}:%{SECOND}) %{HOSTNAME:logsource} %{WORD:program}\[%{INT:pid}]\: id=\"%{INT:id}\" severity=\"%{WORD:severity}\" sys=\"%{WORD:sys}\" sub=\"%{WORD:sub}\" name=\"%{DATA:name}\" action=\"%{DATA:action}\" fwrule=\"%{INT:fwrule}\" initf=\"%{NOTSPACE:initf}\" outitf=\"%{NOTSPACE:outitf}\" srcmac=\"%{MAC:srcmac}\" dstmac=\"%{MAC:dstmac}\" srcip=\"%{IPV4:source_ip}\" dstip=\"%{IPV4:destination_ip}\" proto=\"%{WORD:protocol}\" length=\"%{INT:length}\" tos=\"%{DATA:tos}\" prec=\"%{DATA:prec}\" ttl=\"%{INT:ttl}\" srcport=\"%{INT:srcport}\" dstport=\"%{INT:dstport}\" "}
 }
grok {
  match => { "message" => "%{SYSLOGTIMESTAMP:timestamp} %{HOSTNAME:hostname} (?<timestamp>%{YEAR}:%{MONTHNUM}:%{MONTHDAY}-%{HOUR}:%{MINUTE}:%{SECOND}) %{HOSTNAME:logsource} %{WORD:program}\[%{INT:pid}]\: id=\"%{INT:id}\" severity=\"%{WORD:severity}\" sys=\"%{WORD:sys}\" sub=\"%{WORD:sub}\" name=\"%{DATA:name}\" action=\"%{DATA:action}\" fwrule=\"%{INT:fwrule}\" initf=\"%{NOTSPACE:initf}\" outitf=\"%{NOTSPACE:outitf}\" srcmac=\"%{MAC:srcmac}\" srcip=\"%{IPV4:source_ip}\" dstip=\"%{IPV4:destination_ip}\" proto=\"%{WORD:protocol}\" length=\"%{INT:length}\" tos=\"%{DATA:tos}\" prec=\"%{DATA:prec}\" ttl=\"%{INT:ttl}\" srcport=\"%{INT:srcport}\" dstport=\"%{INT:dstport}\" "}
 }
grok {
  match => {"message" => "%{SYSLOGTIMESTAMP:timestamp} %{HOSTNAME:hostname} (?<timestamp>%{YEAR}:%{MONTHNUM}:%{MONTHDAY}-%{HOUR}:%{MINUTE}:%{SECOND}) %{HOSTNAME:logsource} %{WORD:program}\[%{INT:pid}]\: id=\"%{INT:id}\" severity=\"%{WORD:severity}\" sys=\"%{WORD:sys}\" sub=\"%{WORD:sub}\" name=\"%{DATA:name}\" action=\"%{DATA:action}\" fwrule=\"%{INT:fwrule}\" outitf=\"%{NOTSPACE:outitf}\" mark=\"%{DATA:mark}\" app=\"%{DATA:app}\" srcmac=\"%{MAC:srcmac}\" srcip=\"%{IPV4:source_ip}\" dstip=\"%{IPV4:destination_ip}\" proto=\"%{WORD:protocol}\" length=\"%{INT:length}\" tos=\"%{DATA:tos}\" prec=\"%{DATA:prec}\" ttl=\"%{INT:ttl}\" srcport=\"%{INT:srcport}\" dstport=\"%{INT:dstport}\" "}
 }
grok {
  match => { "message" => "%{SYSLOGTIMESTAMP:timestamp} %{HOSTNAME:hostname} (?<timestamp>%{YEAR}:%{MONTHNUM}:%{MONTHDAY}-%{HOUR}:%{MINUTE}:%{SECOND}) %{HOSTNAME:logsource} %{WORD:program}\[%{INT:pid}]\: id=\"%{INT:id}\" severity=\"%{WORD:severity}\" sys=\"%{WORD:sys}\" sub=\"%{WORD:sub}\" name=\"%{DATA:name}\" action=\"%{DATA:action}\" fwrule=\"%{INT:fwrule}\" initf=\"%{NOTSPACE:initf}\" srcmac=\"%{MAC:srcmac}\" dstmac=\"%{MAC:dstmac}\" srcip=\"%{IPV4:source_ip}\" dstip=\"%{IPV4:destination_ip}\" proto=\"%{WORD:protocol}\" length=\"%{INT:length}\" tos=\"%{DATA:tos}\" prec=\"%{DATA:prec}\" ttl=\"%{INT:ttl}\" srcport=\"%{INT:srcport}\" dstport=\"%{INT:dstport}\" "}
 }
grok {
  match => { "message" => "%{SYSLOGTIMESTAMP:timestamp} %{HOSTNAME:hostname} (?<timestamp>%{YEAR}:%{MONTHNUM}:%{MONTHDAY}-%{HOUR}:%{MINUTE}:%{SECOND}) %{HOSTNAME:logsource} %{WORD:program}\[%{INT:pid}]\: id=\"%{INT:id}\" severity=\"%{WORD:severity}\" sys=\"%{WORD:sys}\" sub=\"%{WORD:sub}\" name=\"%{DATA:name}\" action=\"%{DATA:action}\" fwrule=\"%{INT:fwrule}\" outitf=\"%{NOTSPACE:outitf}\" srcmac=\"%{MAC:srcmac}\" srcip=\"%{IPV4:source_ip}\" dstip=\"%{IPV4:destination_ip}\" proto=\"%{WORD:protocol}\" length=\"%{INT:length}\" tos=\"%{DATA:tos}\" prec=\"%{DATA:prec}\" ttl=\"%{INT:ttl}\" srcport=\"%{INT:srcport}\" dstport=\"%{INT:dstport}\" "}
 }
grok {
  match => {"message" => "%{SYSLOGTIMESTAMP:timestamp} %{HOSTNAME:hostname} (?<timestamp>%{YEAR}:%{MONTHNUM}:%{MONTHDAY}-%{HOUR}:%{MINUTE}:%{SECOND}) %{HOSTNAME:logsource} %{WORD:program}\[%{INT:pid}]\: id=\"%{INT:id}\" severity=\"%{WORD:severity}\" sys=\"%{WORD:sys}\" sub=\"%{WORD:sub}\" name=\"%{DATA:name}\" action=\"%{DATA:action}\" fwrule=\"%{INT:fwrule}\" initf=\"%{NOTSPACE:initf}\" outitf=\"%{NOTSPACE:outitf}\" srcmac=\"%{MAC:srcmac}\" dstmac=\"%{MAC:dstmac}\" srcip=\"%{IPV4:source_ip}\" dstip=\"%{IPV4:destination_ip}\" proto=\"%{WORD:protocol}\" length=\"%{INT:length}\" tos=\"%{DATA:tos}\" prec=\"%{DATA:prec}\" ttl=\"%{INT:ttl}\" srcport=\"%{INT:srcport}\" dstport=\"%{INT:dstport}\" tcpflags=\"%{DATA:tcpflags}\" \"(,)\" "}
}
grok {
 match => {"message" => "%{SYSLOGTIMESTAMP:timestamp} %{HOSTNAME:hostname} (?<timestamp>%{YEAR}:%{MONTHNUM}:%{MONTHDAY}-%{HOUR}:%{MINUTE}:%{SECOND}) %{HOSTNAME:logsource} %{WORD:program}\[%{INT:pid}]\: id=\"%{INT:id}\" severity=\"%{WORD:severity}\" sys=\"%{WORD:sys}\" sub=\"%{WORD:sub}\" name=\"%{DATA:name}\" action=\"%{DATA:action}\" fwrule=\"%{INT:fwrule}\" initf=\"%{NOTSPACE:initf}\" srcmac=\"%{MAC:srcmac}\" dstmac=\"%{MAC:dstmac}\" srcip=\"%{IPV4:source_ip}\" dstip=\"%{IPV4:destination_ip}\" proto=\"%{WORD:protocol}\" length=\"%{INT:length}\" tos=\"%{DATA:tos}\" prec=\"%{DATA:prec}\" ttl=\"%{INT:ttl}\" srcport=\"%{INT:srcport}\" dstport=\"%{INT:dstport}\" tcpflags=\"%{DATA:tcpflags}\" "}
}
grok {
 match => {"message" => "%{SYSLOGTIMESTAMP:timestamp} %{HOSTNAME:hostname} (?<timestamp>%{YEAR}:%{MONTHNUM}:%{MONTHDAY}-%{HOUR}:%{MINUTE}:%{SECOND}) %{HOSTNAME:logsource} %{WORD:program}\[%{INT:pid}]\: id=\"%{INT:id}\" severity=\"%{WORD:severity}\" sys=\"%{WORD:sys}\" sub=\"%{WORD:sub}\" name=\"%{DATA:name}\" action=\"%{DATA:action}\" fwrule=\"%{INT:fwrule}\" outitf=\"%{NOTSPACE:outitf}\" srcmac=\"%{MAC:srcmac}\" srcip=\"%{IPV4:source_ip}\" dstip=\"%{IPV4:destination_ip}\" proto=\"%{WORD:protocol}\" length=\"%{INT:length}\" tos=\"%{DATA:tos}\" prec=\"%{DATA:prec}\" ttl=\"%{INT:ttl}\" srcport=\"%{INT:srcport}\" dstport=\"%{INT:dstport}\" tcpflags=\"%{DATA:tcpflags}\" "}
}
grok {
 match => {"message" => "%{SYSLOGTIMESTAMP:timestamp} %{HOSTNAME:hostname} (?<timestamp>%{YEAR}:%{MONTHNUM}:%{MONTHDAY}-%{HOUR}:%{MINUTE}:%{SECOND}) %{HOSTNAME:logsource} %{WORD:program}\[%{INT:pid}]\: id=\"%{INT:id}\" severity=\"%{WORD:severity}\" sys=\"%{WORD:sys}\" sub=\"%{WORD:sub}\" name=\"%{DATA:name}\" action=\"%{DATA:action}\" fwrule=\"%{INT:fwrule}\" initf=\"%{NOTSPACE:initf}\" outitf=\"%{NOTSPACE:outitf}\" mark=\"%{WORD:mark}\" app=\"%{WORD:app}\" srcmac=\"%{MAC:srcmac}\" dstmac=\"%{MAC:dstmac}\" srcip=\"%{IPV4:source_ip}\" dstip=\"%{IPV4:destination_ip}\" proto=\"%{WORD:protocol}\" length=\"%{INT:length}\" tos=\"%{DATA:tos}\" prec=\"%{DATA:prec}\" ttl=\"%{INT:ttl}\" srcport=\"%{INT:srcport}\" dstport=\"%{INT:dstport}\" tcpflags=\"%{DATA:tcpflags}\" "}
}
grok {
 match => {"message" => "%{SYSLOGTIMESTAMP:timestamp} %{HOSTNAME:hostname} (?<timestamp>%{YEAR}:%{MONTHNUM}:%{MONTHDAY}-%{HOUR}:%{MINUTE}:%{SECOND}) %{HOSTNAME:logsource} %{WORD:program}\[%{INT:pid}]\: id=\"%{INT:id}\" severity=\"%{WORD:severity}\" sys=\"%{WORD:sys}\" sub=\"%{WORD:sub}\" name=\"%{DATA:name}\" action=\"%{DATA:action}\" fwrule=\"%{INT:fwrule}\" initf=\"%{NOTSPACE:initf}\" outitf=\"%{NOTSPACE:outitf}\" mark=\"%{WORD:mark}\" srcmac=\"%{MAC:srcmac}\" dstmac=\"%{MAC:dstmac}\" srcip=\"%{IPV4:source_ip}\" dstip=\"%{IPV4:destination_ip}\" proto=\"%{WORD:protocol}\" length=\"%{INT:length}\" tos=\"%{DATA:tos}\" prec=\"%{DATA:prec}\" ttl=\"%{INT:ttl}\" srcport=\"%{INT:srcport}\" dstport=\"%{INT:dstport}\" tcpflags=\"%{DATA:tcpflags}\" "}
}
}
if "httpproxy" in [message]{
grok {
match => {"message" => "%{SYSLOGTIMESTAMP:timestamp} %{HOSTNAME:hostname} (?<timestamp>%{YEAR}:%{MONTHNUM}:%{MONTHDAY}-%{HOUR}:%{MINUTE}:%{SECOND}) %{HOSTNAME:logsource} %{WORD:program}\[%{INT:pid}]\: id=\"%{INT:id}\" severity=\"%{WORD:severity}\" sys=\"%{WORD:sys}\" sub=\"%{WORD:sub}\" name=\"%{DATA:name}\" action=\"%{DATA:action}\" method=\"%{WORD:method}\" srcip=\"%{IPV4:source_ip}\" dstip=\"%{IPV4:destination_ip}\" user=\"%{DATA:user}\" group=\"%{DATA:group}\" ad_domain=\"%{DATA:ad_domain}\" statuscode=\"%{INT:statuscode}\" cached=\"%{INT:cached}\" profile=\"%{DATA:profile}\" filteraction=\"%{DATA:filteraction}\" size=\"%{INT:size}\" request=\"%{BASE16FLOAT:request}\" url=\"%{URI:url}\" referer=\"%{DATA:referer}\" error=\"%{DATA:error}\" authtime=\"%{INT:authtime}\" dnstime=\"%{INT:dnstime}\" cattime=\"%{INT:cattime}\" avscantime=\"%{INT:avscantime}\" fullreqtime=\"%{INT:fullreqtime}\" device=\"%{INT:device}\" auth=\"%{INT:auth}\" ua=\"%{DATA:ua}\" exceptions=\"%{DATA:exceptions}\" category=\"%{INT:category}\" reputation=\"%{WORD:reputation}\" categoryname=\"%{DATA:categoryname}\" country=\"%{DATA:country}\" application=\"%{WORD:application}\" app-id=\"%{INT:app-id}\" sandbox=\"%{DATA:sandbox}\" content-type=\"%{DATA:content-type}\" "}
}
grok {
match => {"message" => "%{SYSLOGTIMESTAMP:timestamp} %{HOSTNAME:hostname} (?<timestamp>%{YEAR}:%{MONTHNUM}:%{MONTHDAY}-%{HOUR}:%{MINUTE}:%{SECOND}) %{HOSTNAME:logsource} %{WORD:program}\[%{INT:pid}]\: id=\"%{INT:id}\" severity=\"%{WORD:severity}\" sys=\"%{WORD:sys}\" sub=\"%{WORD:sub}\" name=\"%{DATA:name}\" action=\"%{DATA:action}\" method=\"%{WORD:method}\" srcip=\"%{IPV4:source_ip}\" dstip=\"%{IPV4:destination_ip}\" user=\"%{DATA:user}\" group=\"%{DATA:group}\" ad_domain=\"%{DATA:ad_domain}\" statuscode=\"%{INT:statuscode}\" cached=\"%{INT:cached}\" profile=\"%{DATA:profile}\" filteraction=\"%{DATA:filteraction}\" size=\"%{INT:size}\" request=\"%{BASE16FLOAT:request}\" url=\"%{URI:url}\" referer=\"%{DATA:referer}\" error=\"%{DATA:error}\" authtime=\"%{INT:authtime}\" dnstime=\"%{INT:dnstime}\" cattime=\"%{INT:cattime}\" avscantime=\"%{INT:avscantime}\" fullreqtime=\"%{INT:fullreqtime}\" device=\"%{INT:device}\" auth=\"%{INT:auth}\" ua=\"%{DATA:ua}\" exceptions=\"%{DATA:exceptions}\" overridecategory=\"%{INT:overridecategory}\" overridereputation=\"%{INT:overridereputation}\" category=\"%{INT:category}\" reputation=\"%{WORD:reputation}\" categoryname=\"%{DATA:categoryname}\" country=\"%{DATA:country}\" sandbox=\"%{DATA:sandbox}\" content-type=\"%{DATA:content-type}\" "}
}
grok {
 match => {"message" => "%{SYSLOGTIMESTAMP:timestamp} %{HOSTNAME:hostname} (?<timestamp>%{YEAR}:%{MONTHNUM}:%{MONTHDAY}-%{HOUR}:%{MINUTE}:%{SECOND}) %{HOSTNAME:logsource} %{WORD:program}\[%{INT:pid}]\: id=\"%{INT:id}\" severity=\"%{WORD:severity}\" sys=\"%{WORD:sys}\" sub=\"%{WORD:sub}\" name=\"%{DATA:name}\" action=\"%{DATA:action}\" method=\"%{WORD:method}\" srcip=\"%{IPV4:source_ip}\" dstip=\"%{IPV4:destination_ip}\" user=\"%{DATA:user}\" group=\"%{DATA:group}\" ad_domain=\"%{DATA:ad_domain}\" statuscode=\"%{INT:statuscode}\" cached=\"%{INT:cached}\" profile=\"%{DATA:profile}\" filteraction=\"%{DATA:filteraction}\" size=\"%{INT:size}\" request=\"%{BASE16FLOAT:request}\" url=\"%{URI:url}\" referer=\"%{DATA:referer}\" error=\"%{DATA:error}\" authtime=\"%{INT:authtime}\" dnstime=\"%{INT:dnstime}\" cattime=\"%{INT:cattime}\" avscantime=\"%{INT:avscantime}\" fullreqtime=\"%{INT:fullreqtime}\" device=\"%{INT:device}\" auth=\"%{INT:auth}\" ua=\"%{DATA:ua}\" exceptions=\"%{DATA:exceptions}\" category=\"%{INT:category}\" reputation=\"%{WORD:reputation}\" categoryname=\"%{DATA:categoryname}\" sandbox=\"%{DATA:sandbox}\" content-type=\"%{DATA:content-type}\" "}
}
grok {
match => {"message" => "%{SYSLOGTIMESTAMP:timestamp} %{HOSTNAME:hostname} (?<timestamp>%{YEAR}:%{MONTHNUM}:%{MONTHDAY}-%{HOUR}:%{MINUTE}:%{SECOND}) %{HOSTNAME:logsource} %{WORD:program}\[%{INT:pid}]\: id=\"%{INT:id}\" severity=\"%{WORD:severity}\" sys=\"%{WORD:sys}\" sub=\"%{WORD:sub}\" name=\"%{DATA:name}\" action=\"%{DATA:action}\" method=\"%{WORD:method}\" srcip=\"%{IPV4:source_ip}\" dstip=\"%{IPV4:destination_ip}\" user=\"%{DATA:user}\" group=\"%{DATA:group}\" ad_domain=\"%{DATA:ad_domain}\" statuscode=\"%{INT:statuscode}\" cached=\"%{INT:cached}\" profile=\"%{DATA:profile}\" filteraction=\"%{DATA:filteraction}\" size=\"%{INT:size}\" request=\"%{BASE16FLOAT:request}\" url=\"%{URI:url}\" referer=\"%{DATA:referer}\" error=\"%{DATA:error}\" authtime=\"%{INT:authtime}\" dnstime=\"%{INT:dnstime}\" cattime=\"%{INT:cattime}\" avscantime=\"%{INT:avscantime}\" fullreqtime=\"%{INT:fullreqtime}\" device=\"%{INT:device}\" auth=\"%{INT:auth}\" ua=\"%{DATA:ua}\" exceptions=\"%{DATA:exceptions}\" "}
}
grok {
match => {"message" => "%{SYSLOGTIMESTAMP:timestamp} %{HOSTNAME:hostname} (?<timestamp>%{YEAR}:%{MONTHNUM}:%{MONTHDAY}-%{HOUR}:%{MINUTE}:%{SECOND}) %{HOSTNAME:logsource} %{WORD:program}\[%{INT:pid}]\: id=\"%{INT:id}\" severity=\"%{WORD:severity}\" sys=\"%{WORD:sys}\" sub=\"%{WORD:sub}\" name=\"%{DATA:name}\" action=\"%{DATA:action}\" method=\"%{WORD:method}\" srcip=\"%{IPV4:source_ip}\" dstip=\"%{IPV4:destination_ip}\" user=\"%{DATA:user}\" group=\"%{DATA:group}\" ad_domain=\"%{DATA:ad_domain}\" statuscode=\"%{INT:statuscode}\" cached=\"%{INT:cached}\" profile=\"%{DATA:profile}\" filteraction=\"%{DATA:filteraction}\" size=\"%{INT:size}\" request=\"%{BASE16FLOAT:request}\" url=\"%{URI:url}\" referer=\"%{DATA:referer}\" error=\"%{DATA:error}\" authtime=\"%{INT:authtime}\" dnstime=\"%{INT:dnstime}\" cattime=\"%{INT:cattime}\" avscantime=\"%{INT:avscantime}\" fullreqtime=\"%{INT:fullreqtime}\" device=\"%{INT:device}\" auth=\"%{INT:auth}\" ua=\"%{DATA:ua}\" exceptions=\"%{DATA:exceptions}\" overridecategory=\"%{INT:overridecategory}\" overridereputation=\"%{INT:overridereputation}\" country=\"%{DATA:country}\" "}
}
grok {
match => {"message" => "%{SYSLOGTIMESTAMP:timestamp} %{HOSTNAME:hostname} (?<timestamp>%{YEAR}:%{MONTHNUM}:%{MONTHDAY}-%{HOUR}:%{MINUTE}:%{SECOND}) %{HOSTNAME:logsource} %{WORD:program}\[%{INT:pid}]\: id=\"%{INT:id}\" severity=\"%{WORD:severity}\" sys=\"%{WORD:sys}\" sub=\"%{WORD:sub}\" name=\"%{DATA:name}\" action=\"%{DATA:action}\" method=\"%{WORD:method}\" srcip=\"%{IPV4:source_ip}\" dstip=\"%{IPV4:destination_ip}\" user=\"%{DATA:user}\" group=\"%{DATA:group}\" ad_domain=\"%{DATA:ad_domain}\" statuscode=\"%{INT:statuscode}\" cached=\"%{INT:cached}\" profile=\"%{DATA:profile}\" filteraction=\"%{DATA:filteraction}\" size=\"%{INT:size}\" request=\"%{BASE16FLOAT:request}\" url=\"%{URI:url}\" referer=\"%{DATA:referer}\" error=\"%{DATA:error}\" authtime=\"%{INT:authtime}\" dnstime=\"%{INT:dnstime}\" cattime=\"%{INT:cattime}\" avscantime=\"%{INT:avscantime}\" fullreqtime=\"%{INT:fullreqtime}\" device=\"%{INT:device}\" auth=\"%{INT:auth}\" ua=\"%{DATA:ua}\" exceptions=\"%{DATA:exceptions}\" overridecategory=\"%{INT:overridecategory}\" overridereputation=\"%{INT:overridereputation}\" category=\"%{INT:category}\" reputation=\"%{WORD:reputation}\" categoryname=\"%{DATA:categoryname}\" country=\"%{DATA:country}\" "}
}
grok {
match => {"message" => "%{SYSLOGTIMESTAMP:timestamp} %{HOSTNAME:hostname} (?<timestamp>%{YEAR}:%{MONTHNUM}:%{MONTHDAY}-%{HOUR}:%{MINUTE}:%{SECOND}) %{HOSTNAME:logsource} %{WORD:program}\[%{INT:pid}]\: id=\"%{INT:id}\" severity=\"%{WORD:severity}\" sys=\"%{WORD:sys}\" sub=\"%{WORD:sub}\" name=\"%{DATA:name}\" action=\"%{DATA:action}\" method=\"%{WORD:method}\" srcip=\"%{IPV4:source_ip}\" dstip=\"%{IPV4:destination_ip}\" user=\"%{DATA:user}\" group=\"%{DATA:group}\" ad_domain=\"%{DATA:ad_domain}\" statuscode=\"%{INT:statuscode}\" cached=\"%{INT:cached}\" profile=\"%{DATA:profile}\" filteraction=\"%{DATA:filteraction}\" size=\"%{INT:size}\" request=\"%{BASE16FLOAT:request}\" url=\"%{URI:url}\" referer=\"%{DATA:referer}\" error=\"%{DATA:error}\" authtime=\"%{INT:authtime}\" dnstime=\"%{INT:dnstime}\" cattime=\"%{INT:cattime}\" avscantime=\"%{INT:avscantime}\" fullreqtime=\"%{INT:fullreqtime}\" device=\"%{INT:device}\" auth=\"%{INT:auth}\" ua=\"%{DATA:ua}\" exceptions=\"%{DATA:exceptions}\" category=\"%{INT:category}\" reputation=\"%{WORD:reputation}\" categoryname=\"%{DATA:categoryname}\" country=\"%{DATA:country}\" "}
}
grok {
match => {"message" => "%{SYSLOGTIMESTAMP:timestamp} %{HOSTNAME:hostname} (?<timestamp>%{YEAR}:%{MONTHNUM}:%{MONTHDAY}-%{HOUR}:%{MINUTE}:%{SECOND}) %{HOSTNAME:logsource} %{WORD:program}\[%{INT:pid}]\: id=\"%{INT:id}\" severity=\"%{WORD:severity}\" sys=\"%{WORD:sys}\" sub=\"%{WORD:sub}\" name=\"%{DATA:name}\" action=\"%{DATA:action}\" method=\"%{WORD:method}\" srcip=\"%{IPV4:source_ip}\" dstip=\"%{IPV4:destination_ip}\" user=\"%{DATA:user}\" group=\"%{DATA:group}\" ad_domain=\"%{DATA:ad_domain}\" statuscode=\"%{INT:statuscode}\" cached=\"%{INT:cached}\" profile=\"%{DATA:profile}\" filteraction=\"%{DATA:filteraction}\" size=\"%{INT:size}\" request=\"%{BASE16FLOAT:request}\" url=\"%{URI:url}\" referer=\"%{DATA:referer}\" error=\"%{DATA:error}\" authtime=\"%{INT:authtime}\" dnstime=\"%{INT:dnstime}\" cattime=\"%{INT:cattime}\" avscantime=\"%{INT:avscantime}\" fullreqtime=\"%{INT:fullreqtime}\" device=\"%{INT:device}\" auth=\"%{INT:auth}\" ua=\"%{DATA:ua}\" exceptions=\"%{DATA:exceptions}\" category=\"%{INT:category}\" reputation=\"%{WORD:reputation}\" categoryname=\"%{DATA:categoryname}\" country=\"%{DATA:country}\" application=\"%{WORD:application}\" app-id=\"%{INT:app-id}\" "}
}
}
}

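As you can see, I spent some time trying to account for every variation of log entry the firewall will produce. For the most part the data is the same, but after a certain point it changes. When Logstash parses the data, it produces multiple values for a single field. Here is a screenshot from ELK of the duplicated fields.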

My assumption is that my Grok patterns are too similar. I initially tried break_on_match => false, but it produced the same duplicates.

1 Answer:

Answer 0: (score: 1)

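As @Phonolog said, since you have multiple grok filters, your message is matched against each of them in turn. If more than one filter successfully matches its pattern against your message, you end up with duplicate values in the fields they capture.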
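
To illustrate the point above, one alternative (not the approach recommended further down) is to list the candidate patterns inside a single grok filter. With the default break_on_match => true, grok stops at the first pattern that matches, so each field is captured only once. As far as I know, break_on_match only governs the patterns within one grok filter, which would explain why setting it on separate filters made no difference. The three patterns below are shortened, hypothetical stand-ins for the full ones in the question, and the most specific pattern has to come first:

```
filter {
  grok {
    # Patterns are tried in order; the first one that matches wins
    # (break_on_match defaults to true).
    match => {
      "message" => [
        "fwrule=\"%{INT:fwrule}\" initf=\"%{NOTSPACE:initf}\" outitf=\"%{NOTSPACE:outitf}\"",
        "fwrule=\"%{INT:fwrule}\" initf=\"%{NOTSPACE:initf}\"",
        "fwrule=\"%{INT:fwrule}\" outitf=\"%{NOTSPACE:outitf}\""
      ]
    }
  }
}
```

This removes the duplicate values, but it still needs one pattern per variation of the log line, so the maintenance burden stays the same.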

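Instead of trying to match every possible combination of the key-value pairs at the end of the message, you can capture the part containing the key-value pairs into a single field and then use the kv filter on it.

It would look like this: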

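filter {
  # Parse only the fixed syslog prefix; everything after it goes into kvdata.
  grok {
    match => {"message" => "%{SYSLOGTIMESTAMP:timestamp} %{HOSTNAME:hostname} (?<timestamp>%{YEAR}:%{MONTHNUM}:%{MONTHDAY}-%{HOUR}:%{MINUTE}:%{SECOND}) %{HOSTNAME:logsource} %{WORD:program}\[%{INT:pid}]\: %{GREEDYDATA:kvdata}"}
  }

  # Split kvdata into one field per key, stripping the surrounding quotes.
  kv {
    source => "kvdata"
    trim_value => "\""
  }
}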
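
One further detail, offered as a sketch rather than as part of the original answer: the grok pattern above captures two different timestamps into the same timestamp field, which on its own would leave that field holding two values. Giving the syslog prefix its own field name (syslog_timestamp is an arbitrary choice here) and passing the firewall's own timestamp to the date filter should avoid that. The remove_field option on kv, which drops the intermediate kvdata field once it has been parsed, is also an optional addition:

```
filter {
  grok {
    # Same pattern as above, except the first capture is renamed to syslog_timestamp.
    match => {"message" => "%{SYSLOGTIMESTAMP:syslog_timestamp} %{HOSTNAME:hostname} (?<timestamp>%{YEAR}:%{MONTHNUM}:%{MONTHDAY}-%{HOUR}:%{MINUTE}:%{SECOND}) %{HOSTNAME:logsource} %{WORD:program}\[%{INT:pid}]\: %{GREEDYDATA:kvdata}"}
  }

  # Parse e.g. 2017:11:15-12:18:31 into @timestamp (the date filter's default target).
  date {
    match => ["timestamp", "yyyy:MM:dd-HH:mm:ss"]
  }

  kv {
    source => "kvdata"
    trim_value => "\""
    # Drop the raw key-value string once it has been split into fields.
    remove_field => ["kvdata"]
  }
}
```

With this layout a new key in the firewall output simply shows up as a new field, without any change to the grok pattern.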