Duplicate hits in Kibana index pattern

Posted: 2021-02-14 14:19:37

Tags: elasticsearch logstash kibana elk kibana-7

I'm new to the ELK stack, and I'm not getting the results I expect from the index pattern I created. Here is the problem:

First, I created a Logstash conf file:

input {
    file {
        path => ["/usr/share/logs_data/log_6.log"]
        start_position => "beginning"
    }
}

filter {
    grok {
        match => ["message", "(?<execution_date>\d{4}-\d{2}-\d{2}) (?<execution_time>\d{2}:\d{2}:\d{2})%{GREEDYDATA}/configs/pipelines/(?<crawler_category>([^/])+)/(?<crawler_subcategory>([^-])+)-(?<crawler_name>([^.])+).json"]
    }
}

output {
    elasticsearch {
        hosts => "elasticsearch:9200"
        user => "elastic"
        password => "changeme"
        ecs_compatibility => disabled
        index => "log_6"
    }
}
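For reference, the file input tracks how far it has read each file in a sincedb, and start_position => "beginning" only applies to files it has never seen before. A minimal variant of the input above with an explicit sincedb_path (the path here is just an example) makes that state easier to inspect and keeps it stable across container rebuilds:

input {
    file {
        path => ["/usr/share/logs_data/log_6.log"]
        start_position => "beginning"
        # Pin the offset-tracking state to a known file (example path);
        # by default Logstash generates one under data/plugins/inputs/file/.
        sincedb_path => "/usr/share/logstash/data/log_6.sincedb"
    }
}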

Then I updated the file log_6.log three times:

  • First:

log_6.log

2021-01-13 12:41:17,756 luigi-logger[11977] INFO: *****> Chamando "UpdateDBTables(/home/ubuntu/my_folder/my_software/configs/pipelines/7/subcat7-my_file_7.json)"

In Kibana's Discover page: 1 hit

crawler_category:7 host:fe299a799115 execution_date:Jan 12, 2021 @ 21:00:00.000 @version:1 path:/usr/share/logs_data/log_6.log @timestamp:Feb 14, 2021 @ 11:02:32.968 crawler_subcategory:subcat7 crawler_name:my_file_7 execution_time:12:41:17 message:2021-01-13 12:41:17,756 luigi-logger[11977] INFO: *****> Chamando "UpdateDBTables(/home/ubuntu/my_folder/my_software/configs/pipelines/7/subcat7-my_file_7.json)" _id:uEvZoHcBZfIv8WwqqBlB _type:_doc _index:log_6 _score:0

  • Second:

log_6.log

2021-01-13 12:41:17,756 luigi-logger[11977] INFO: *****> Chamando "UpdateDBTables(/home/ubuntu/my_folder/my_software/configs/pipelines/7/subcat7-my_file_7.json)"
2021-01-13 12:41:17,756 luigi-logger[11977] INFO: *****> Chamando "UpdateDBTables(/home/ubuntu/my_folder/my_software/configs/pipelines/8/subcat7-my_file_8.json)"

In Kibana's Discover page: 3 hits

crawler_category:7 host:fe299a799115 execution_date:Jan 12, 2021 @ 21:00:00.000 @version:1 path:/usr/share/logs_data/log_6.log @timestamp:Feb 14, 2021 @ 11:02:32.968 crawler_subcategory:subcat7 crawler_name:my_file_7 execution_time:12:41:17 message:2021-01-13 12:41:17,756 luigi-logger[11977] INFO: *****> Chamando "UpdateDBTables(/home/ubuntu/my_folder/my_software/configs/pipelines/7/subcat7-my_file_7.json)" _id:uEvZoHcBZfIv8WwqqBlB _type:_doc _index:log_6 _score:0
crawler_category:7 host:fe299a799115 execution_date:Jan 12, 2021 @ 21:00:00.000 @version:1 path:/usr/share/logs_data/log_6.log @timestamp:Feb 14, 2021 @ 11:04:01.510 crawler_subcategory:subcat7 crawler_name:my_file_7 execution_time:12:41:17 message:2021-01-13 12:41:17,756 luigi-logger[11977] INFO: *****> Chamando "UpdateDBTables(/home/ubuntu/my_folder/my_software/configs/pipelines/7/subcat7-my_file_7.json)" _id:ukvaoHcBZfIv8Wwq-hrX _type:_doc _index:log_6 _score:0
crawler_category:8 host:fe299a799115 execution_date:Jan 12, 2021 @ 21:00:00.000 @version:1 path:/usr/share/logs_data/log_6.log @timestamp:Feb 14, 2021 @ 11:04:01.515 crawler_subcategory:subcat7 crawler_name:my_file_8 execution_time:12:41:17 message:2021-01-13 12:41:17,756 luigi-logger[11977] INFO: *****> Chamando "UpdateDBTables(/home/ubuntu/my_folder/my_software/configs/pipelines/8/subcat7-my_file_8.json)" _id:u0vaoHcBZfIv8Wwq-hrd _type:_doc _index:log_6 _score:0

  • Third:

log_6.log

2021-01-13 12:41:17,756 luigi-logger[11977] INFO: *****> Chamando "UpdateDBTables(/home/ubuntu/my_folder/my_software/configs/pipelines/7/subcat7-my_file_7.json)"
2021-01-13 12:41:17,756 luigi-logger[11977] INFO: *****> Chamando "UpdateDBTables(/home/ubuntu/my_folder/my_software/configs/pipelines/8/subcat7-my_file_8.json)"
2021-01-13 12:41:17,756 luigi-logger[11977] INFO: *****> Chamando "UpdateDBTables(/home/ubuntu/my_folder/my_software/configs/pipelines/9/subcat9-my_file_9.json)"

In Kibana's Discover page: 5 hits

crawler_category:7 host:fe299a799115 execution_date:Jan 12, 2021 @ 21:00:00.000 @version:1 path:/usr/share/logs_data/log_6.log @timestamp:Feb 14, 2021 @ 11:02:32.968 crawler_subcategory:subcat7 crawler_name:my_file_7 execution_time:12:41:17 message:2021-01-13 12:41:17,756 luigi-logger[11977] INFO: *****> Chamando "UpdateDBTables(/home/ubuntu/my_folder/my_software/configs/pipelines/7/subcat7-my_file_7.json)" _id:uEvZoHcBZfIv8WwqqBlB _type:_doc _index:log_6 _score:0
crawler_category:7 host:fe299a799115 execution_date:Jan 12, 2021 @ 21:00:00.000 @version:1 path:/usr/share/logs_data/log_6.log @timestamp:Feb 14, 2021 @ 11:04:01.510 crawler_subcategory:subcat7 crawler_name:my_file_7 execution_time:12:41:17 message:2021-01-13 12:41:17,756 luigi-logger[11977] INFO: *****> Chamando "UpdateDBTables(/home/ubuntu/my_folder/my_software/configs/pipelines/7/subcat7-my_file_7.json)" _id:ukvaoHcBZfIv8Wwq-hrX _type:_doc _index:log_6 _score:0
crawler_category:8 host:fe299a799115 execution_date:Jan 12, 2021 @ 21:00:00.000 @version:1 path:/usr/share/logs_data/log_6.log @timestamp:Feb 14, 2021 @ 11:04:01.515 crawler_subcategory:subcat7 crawler_name:my_file_8 execution_time:12:41:17 message:2021-01-13 12:41:17,756 luigi-logger[11977] INFO: *****> Chamando "UpdateDBTables(/home/ubuntu/my_folder/my_software/configs/pipelines/8/subcat7-my_file_8.json)" _id:u0vaoHcBZfIv8Wwq-hrd _type:_doc _index:log_6 _score:0
crawler_category:9 host:fe299a799115 execution_date:Jan 12, 2021 @ 21:00:00.000 @version:1 path:/usr/share/logs_data/log_6.log @timestamp:Feb 14, 2021 @ 11:04:33.595 crawler_subcategory:subcat9 crawler_name:my_file_9 execution_time:12:41:17 message:2021-01-13 12:41:17,756 luigi-logger[11977] INFO: *****> Chamando "UpdateDBTables(/home/ubuntu/my_folder/my_software/configs/pipelines/9/subcat9-my_file_9.json)" _id:PEvboHcBZfIv8WwqeBsm _type:_doc _index:log_6 _score:0
crawler_category:8 host:fe299a799115 execution_date:Jan 12, 2021 @ 21:00:00.000 @version:1 path:/usr/share/logs_data/log_6.log @timestamp:Feb 14, 2021 @ 11:04:33.595 crawler_subcategory:subcat7 crawler_name:my_file_8 execution_time:12:41:17 message:2021-01-13 12:41:17,756 luigi-logger[11977] INFO: *****> Chamando "UpdateDBTables(/home/ubuntu/my_folder/my_software/configs/pipelines/8/subcat7-my_file_8.json)" _id:PUvboHcBZfIv8WwqeBsn _type:_doc _index:log_6 _score:0

Why do I get hits for the old entries every time I update log_6.log with new log lines? For example, why was the second log line read and indexed again after the third update?
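For what it's worth, one common way to keep re-read lines from becoming duplicate documents, whatever the reason for the re-read, is to derive the Elasticsearch _id from the line content with the fingerprint filter (bundled with Logstash 7.x), so indexing the same line twice overwrites one document instead of creating a second. A sketch that would replace the filter and output sections above:

filter {
    grok {
        # Same grok pattern as in the conf above.
        match => ["message", "(?<execution_date>\d{4}-\d{2}-\d{2}) (?<execution_time>\d{2}:\d{2}:\d{2})%{GREEDYDATA}/configs/pipelines/(?<crawler_category>([^/])+)/(?<crawler_subcategory>([^-])+)-(?<crawler_name>([^.])+).json"]
    }
    fingerprint {
        # Identical log lines hash to identical ids.
        source => "message"
        target => "[@metadata][fingerprint]"
        method => "SHA256"
    }
}

output {
    elasticsearch {
        hosts => "elasticsearch:9200"
        user => "elastic"
        password => "changeme"
        ecs_compatibility => disabled
        index => "log_6"
        # Use the content hash as _id so re-reads update instead of duplicate.
        document_id => "%{[@metadata][fingerprint]}"
    }
}

With this in place, the two hits above for the unchanged first line (ids uEvZoHcBZfIv8WwqqBlB and ukvaoHcBZfIv8Wwq-hrX) would collapse into a single document. The trade-off is that genuinely repeated identical lines would also collapse into one document.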

Update: here is the Logstash log:

Attaching to docker-elk_logstash_1
logstash_1       | Using bundled JDK: /usr/share/logstash/jdk
logstash_1       | OpenJDK 64-Bit Server VM warning: Option UseConcMarkSweepGC was deprecated in version 9.0 and will likely be removed in a future release.
logstash_1       | WARNING: An illegal reflective access operation has occurred
logstash_1       | WARNING: Illegal reflective access by org.jruby.ext.openssl.SecurityHelper (file:/tmp/jruby-1/jruby2302152547405294616jopenssl.jar) to field java.security.MessageDigest.provider
logstash_1       | WARNING: Please consider reporting this to the maintainers of org.jruby.ext.openssl.SecurityHelper
logstash_1       | WARNING: Use --illegal-access=warn to enable warnings of further illegal reflective access operations
logstash_1       | WARNING: All illegal access operations will be denied in a future release
logstash_1       | Sending Logstash logs to /usr/share/logstash/logs which is now configured via log4j2.properties
logstash_1       | [2021-02-14T14:02:20,574][INFO ][logstash.runner          ] Starting Logstash {"logstash.version"=>"7.10.2", "jruby.version"=>"jruby 9.2.13.0 (2.5.7) 2020-08-03 9a89c94bcc OpenJDK 64-Bit Server VM 11.0.8+10 on 11.0.8+10 +indy +jit [linux-x86_64]"}
logstash_1       | [2021-02-14T14:02:20,651][INFO ][logstash.setting.writabledirectory] Creating directory {:setting=>"path.queue", :path=>"/usr/share/logstash/data/queue"}
logstash_1       | [2021-02-14T14:02:20,667][INFO ][logstash.setting.writabledirectory] Creating directory {:setting=>"path.dead_letter_queue", :path=>"/usr/share/logstash/data/dead_letter_queue"}
logstash_1       | [2021-02-14T14:02:21,471][INFO ][logstash.agent           ] No persistent UUID file found. Generating new UUID {:uuid=>"0c9439e7-3c67-4b5d-90f4-a99008772803", :path=>"/usr/share/logstash/data/uuid"}
logstash_1       | [2021-02-14T14:02:22,115][WARN ][deprecation.logstash.monitoringextension.pipelineregisterhook] Internal collectors option for Logstash monitoring is deprecated and targeted for removal in the next major version.
logstash_1       | Please configure Metricbeat to monitor Logstash. Documentation can be found at: 
logstash_1       | https://www.elastic.co/guide/en/logstash/current/monitoring-with-metricbeat.html
logstash_1       | [2021-02-14T14:02:23,391][WARN ][deprecation.logstash.outputs.elasticsearch] Relying on default value of `pipeline.ecs_compatibility`, which may change in a future major release of Logstash. To avoid unexpected changes when upgrading Logstash, please explicitly declare your desired ECS Compatibility mode.
logstash_1       | [2021-02-14T14:02:24,743][INFO ][logstash.licensechecker.licensereader] Elasticsearch pool URLs updated {:changes=>{:removed=>[], :added=>[http://elastic:xxxxxx@elasticsearch:9200/]}}
logstash_1       | [2021-02-14T14:02:25,089][WARN ][logstash.licensechecker.licensereader] Restored connection to ES instance {:url=>"http://elastic:xxxxxx@elasticsearch:9200/"}
logstash_1       | [2021-02-14T14:02:25,202][INFO ][logstash.licensechecker.licensereader] ES Output version determined {:es_version=>7}
logstash_1       | [2021-02-14T14:02:25,210][WARN ][logstash.licensechecker.licensereader] Detected a 6.x and above cluster: the `type` event field won't be used to determine the document _type {:es_version=>7}
logstash_1       | [2021-02-14T14:02:25,463][INFO ][logstash.monitoring.internalpipelinesource] Monitoring License OK
logstash_1       | [2021-02-14T14:02:25,465][INFO ][logstash.monitoring.internalpipelinesource] Validated license for monitoring. Enabling monitoring pipeline.
logstash_1       | [2021-02-14T14:02:28,676][INFO ][org.reflections.Reflections] Reflections took 49 ms to scan 1 urls, producing 23 keys and 47 values 
logstash_1       | [2021-02-14T14:02:29,562][WARN ][deprecation.logstash.outputs.elasticsearchmonitoring] Relying on default value of `pipeline.ecs_compatibility`, which may change in a future major release of Logstash. To avoid unexpected changes when upgrading Logstash, please explicitly declare your desired ECS Compatibility mode.
logstash_1       | [2021-02-14T14:02:29,798][INFO ][logstash.outputs.elasticsearch][main] Elasticsearch pool URLs updated {:changes=>{:removed=>[], :added=>[http://elastic:xxxxxx@elasticsearch:9200/]}}
logstash_1       | [2021-02-14T14:02:29,800][INFO ][logstash.outputs.elasticsearchmonitoring][.monitoring-logstash] Elasticsearch pool URLs updated {:changes=>{:removed=>[], :added=>[http://elastic:xxxxxx@elasticsearch:9200/]}}
logstash_1       | [2021-02-14T14:02:29,881][WARN ][logstash.outputs.elasticsearchmonitoring][.monitoring-logstash] Restored connection to ES instance {:url=>"http://elastic:xxxxxx@elasticsearch:9200/"}
logstash_1       | [2021-02-14T14:02:29,892][WARN ][logstash.outputs.elasticsearch][main] Restored connection to ES instance {:url=>"http://elastic:xxxxxx@elasticsearch:9200/"}
logstash_1       | [2021-02-14T14:02:29,923][INFO ][logstash.outputs.elasticsearchmonitoring][.monitoring-logstash] ES Output version determined {:es_version=>7}
logstash_1       | [2021-02-14T14:02:29,924][WARN ][logstash.outputs.elasticsearchmonitoring][.monitoring-logstash] Detected a 6.x and above cluster: the `type` event field won't be used to determine the document _type {:es_version=>7}
logstash_1       | [2021-02-14T14:02:29,930][INFO ][logstash.outputs.elasticsearch][main] ES Output version determined {:es_version=>7}
logstash_1       | [2021-02-14T14:02:29,931][WARN ][logstash.outputs.elasticsearch][main] Detected a 6.x and above cluster: the `type` event field won't be used to determine the document _type {:es_version=>7}
logstash_1       | [2021-02-14T14:02:30,033][INFO ][logstash.outputs.elasticsearchmonitoring][.monitoring-logstash] New Elasticsearch output {:class=>"LogStash::Outputs::ElasticSearchMonitoring", :hosts=>["http://elasticsearch:9200"]}
logstash_1       | [2021-02-14T14:02:30,059][WARN ][logstash.javapipeline    ][.monitoring-logstash] 'pipeline.ordered' is enabled and is likely less efficient, consider disabling if preserving event order is not necessary
logstash_1       | [2021-02-14T14:02:30,069][INFO ][logstash.outputs.elasticsearch][main] New Elasticsearch output {:class=>"LogStash::Outputs::ElasticSearch", :hosts=>["//elasticsearch:9200"]}
logstash_1       | [2021-02-14T14:02:30,254][INFO ][logstash.outputs.elasticsearch][main] Using a default mapping template {:es_version=>7, :ecs_compatibility=>:disabled}
logstash_1       | [2021-02-14T14:02:30,344][INFO ][logstash.outputs.elasticsearch][main] Attempting to install template {:manage_template=>{"index_patterns"=>"logstash-*", "version"=>60001, "settings"=>{"index.refresh_interval"=>"5s", "number_of_shards"=>1}, "mappings"=>{"dynamic_templates"=>[{"message_field"=>{"path_match"=>"message", "match_mapping_type"=>"string", "mapping"=>{"type"=>"text", "norms"=>false}}}, {"string_fields"=>{"match"=>"*", "match_mapping_type"=>"string", "mapping"=>{"type"=>"text", "norms"=>false, "fields"=>{"keyword"=>{"type"=>"keyword", "ignore_above"=>256}}}}}], "properties"=>{"@timestamp"=>{"type"=>"date"}, "@version"=>{"type"=>"keyword"}, "geoip"=>{"dynamic"=>true, "properties"=>{"ip"=>{"type"=>"ip"}, "location"=>{"type"=>"geo_point"}, "latitude"=>{"type"=>"half_float"}, "longitude"=>{"type"=>"half_float"}}}}}}}
logstash_1       | [2021-02-14T14:02:30,367][INFO ][logstash.javapipeline    ][.monitoring-logstash] Starting pipeline {:pipeline_id=>".monitoring-logstash", "pipeline.workers"=>1, "pipeline.batch.size"=>2, "pipeline.batch.delay"=>50, "pipeline.max_inflight"=>2, "pipeline.sources"=>["monitoring pipeline"], :thread=>"#<Thread:0x524db6c0 run>"}
logstash_1       | [2021-02-14T14:02:30,537][INFO ][logstash.javapipeline    ][main] Starting pipeline {:pipeline_id=>"main", "pipeline.workers"=>4, "pipeline.batch.size"=>125, "pipeline.batch.delay"=>50, "pipeline.max_inflight"=>500, "pipeline.sources"=>["/usr/share/logstash/pipeline/logstash.conf"], :thread=>"#<Thread:0x7719a78e@/usr/share/logstash/logstash-core/lib/logstash/pipeline_action/create.rb:54 run>"}
logstash_1       | [2021-02-14T14:02:31,467][INFO ][logstash.javapipeline    ][.monitoring-logstash] Pipeline Java execution initialization time {"seconds"=>1.1}
logstash_1       | [2021-02-14T14:02:31,528][INFO ][logstash.javapipeline    ][main] Pipeline Java execution initialization time {"seconds"=>0.98}
logstash_1       | [2021-02-14T14:02:31,575][INFO ][logstash.javapipeline    ][.monitoring-logstash] Pipeline started {"pipeline.id"=>".monitoring-logstash"}
logstash_1       | [2021-02-14T14:02:31,904][INFO ][logstash.inputs.file     ][main] No sincedb_path set, generating one based on the "path" setting {:sincedb_path=>"/usr/share/logstash/data/plugins/inputs/file/.sincedb_4665540243166de885448baafb9de578", :path=>["/usr/share/logs_data/log_6.log"]}
logstash_1       | [2021-02-14T14:02:31,941][INFO ][logstash.javapipeline    ][main] Pipeline started {"pipeline.id"=>"main"}
logstash_1       | [2021-02-14T14:02:32,064][INFO ][logstash.agent           ] Pipelines running {:count=>2, :running_pipelines=>[:".monitoring-logstash", :main], :non_running_pipelines=>[]}
logstash_1       | [2021-02-14T14:02:32,088][INFO ][filewatch.observingtail  ][main][1564967280b5861f1e98faa762e5f84d80b5a693bf98ecd54f18d1bcfc26ea2d] START, creating Discoverer, Watch with file and sincedb collections
logstash_1       | [2021-02-14T14:02:32,906][INFO ][logstash.agent           ] Successfully started Logstash API endpoint {:port=>9600}

Update 2: also, here is my sincedb file after the three log lines were appended:

18616173 0 2050 486 1613311473.59608 /usr/share/logs_data/log_6.log
18616174 0 2050 324 1613311441.515639

If I understand it correctly, this looks right: 324 is the byte count of the file before the last update, and 486 is the byte count after it.
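For reference, each sincedb row reads: inode, major device number, minor device number, byte offset, last-activity timestamp, and, for a file currently being watched, its path. The offsets are consistent with that reading, assuming each of the log lines above is 162 bytes including its trailing newline:

2 lines read -> 2 x 162 = 324 bytes
3 lines read -> 3 x 162 = 486 bytes

Note also that the two rows carry different inodes (18616173 and 18616174), so from the file input's point of view they describe two distinct files.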

I also checked the log_6 index in the Elastic UI on localhost, under the Index Management section:

(screenshot: Index Management listing for the log_6 index)

It shows 5 as the Docs Count value, which makes me wonder whether the data is reaching Kibana the way it should...
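A quick way to cross-check that count against Discover is the _count API, e.g. from Kibana Dev Tools (this counts every document in the index, whereas Discover only shows hits inside the selected time range):

GET log_6/_count

If the index really holds the five documents listed above, this should return "count": 5.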

0 Answers

No answers yet.