Using CSV

Date: 2017-02-13 17:06:26

Tags: csv parsing logstash

I am having trouble parsing and calculating performance Navigation Timing data that I have in a CSV.

I was able to parse the fields, but I am not sure how to approach the calculations (below) correctly. A few points to keep in mind:

1. The data sets are grouped together by the bolded value (it is the timestamp at which the 21 data points were taken), e.g.: ACMEPage-1486643427973,unloadEventEnd,1486643372422
2. The calculations need to be done with the data points within a group.

I assume some tagging and grouping will need to be done, but I don't have a clear vision of how to implement it. Any help would be greatly appreciated.

Thanks,

--------------- Calculations -----------------

  • Total First Byte Time = responseStart - navigationStart
  • Latency = responseStart - fetchStart
  • DNS / Domain Lookup Time = domainLookupEnd - domainLookupStart
  • Server Connect Time = connectEnd - connectStart
  • Server Response Time = responseStart - requestStart
  • Page Load Time = loadEventStart - navigationStart
  • Transfer / Page Download Time = responseEnd - responseStart
  • DOM Interactive Time = domInteractive - navigationStart
  • DOM Content Loaded Time = domContentLoadedEventEnd - navigationStart
  • DOM Processing to Interactive = domInteractive - domLoading
  • DOM Interactive to Complete = domComplete - domInteractive
  • Onload = loadEventEnd - loadEventStart

------- Data in CSV -----------


ACMEPage-1486643427973,unloadEventEnd,1486643372422
ACMEPage-1486643427973,responseEnd,1486643372533
ACMEPage-1486643427973,responseStart,1486643372416
ACMEPage-1486643427973,domInteractive,1486643373030
ACMEPage-1486643427973,domainLookupEnd,1486643372194
ACMEPage-1486643427973,unloadEventStart,1486643372422
ACMEPage-1486643427973,domComplete,1486643373512
ACMEPage-1486643427973,domContentLoadedEventStart,1486643373030
ACMEPage-1486643427973,domainLookupStart,1486643372194
ACMEPage-1486643427973,redirectEnd,0
ACMEPage-1486643427973,redirectStart,0
ACMEPage-1486643427973,connectEnd,1486643372194
ACMEPage-1486643427973,toJSON,{}
ACMEPage-1486643427973,connectStart,1486643372194
ACMEPage-1486643427973,loadEventStart,1486643373512
ACMEPage-1486643427973,navigationStart,1486643372193
ACMEPage-1486643427973,requestStart,1486643372203
ACMEPage-1486643427973,secureConnectionStart,0
ACMEPage-1486643427973,fetchStart,1486643372194
ACMEPage-1486643427973,domContentLoadedEventEnd,1486643373058
ACMEPage-1486643427973,domLoading,1486643372433
ACMEPage-1486643427973,loadEventEnd,1486643373514
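
Plugging this group into the calculations above as a sanity check: Total First Byte Time = responseStart - navigationStart = 1486643372416 - 1486643372193 = 223 ms, and Page Load Time = loadEventStart - navigationStart = 1486643373512 - 1486643372193 = 1319 ms.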

---------- Output ---------------

 "path" =>    "/Users/philipp/Downloads/build2/logDataPoints_com.concur.automation.cge.ui.admin.ADCLookup_1486643340910.csv",
 "@timestamp" => 2017-02-09T12:29:57.763Z,
"navigationTimer" => "connectStart",
 "@version" => "1",
 "host" => "15mbp-09796.local",
 "elapsed_time" => "1486643372194",
 "pid" => "1486643397763",
 "page" => "ADCLookupDataPage",
"message" => "ADCLookupDataPage-1486643397763,connectStart,1486643372194",
 "type" => "csv"
 }
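
Note that elapsed_time comes out as a string here ("1486643372194"), so it would need to be converted to an integer before any of the math can be done.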

-------------- logstash.conf ----------------

input {
  file {
    type => "csv"
    path => "/Users/path/logDataPoints_com.concur.automation.acme.ui.admin.acme_1486643340910.csv"
    start_position => "beginning"  # read from the beginning of the file
    sincedb_path => "/dev/null"
  }
}

filter {
  csv {
    columns => ["page_id", "navigationTimer", "elapsed_time"]
  }

  # the toJSON rows carry "{}" instead of a timing value, so drop them
  if [elapsed_time] == "{}" {
    drop {}
  }
  else {
    grok {
      match => { "page_id" => "%{WORD:page}-%{INT:pid}" }
      remove_field => [ "page_id" ]
    }
  }

  date {
    match => [ "pid", "UNIX_MS" ]
    target => "@timestamp"
  }
}

output {
  elasticsearch { hosts => ["localhost:9200"] }
  stdout { codec => rubydebug }
}

1 Answer:

Answer 0 (score: 0)

Here is how I ended up handling my data:

- I found it easier to pivot the data so that each "event" or "document" goes along one row, rather than down a column
- Each field needs to be mapped as an integer or a string accordingly
- Once the data was in Kibana correctly, I had problems doing the simple math calculations with the ruby code filter, so I ended up using "scripted fields" in Kibana to do the calculations (example below).
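
For example (a hypothetical scripted field, not necessarily one of the exact fields used), Page Load Time from the question can be computed in Kibana as doc['loadEventStart'].value - doc['navigationStart'].value, once the fields are indexed as integers by the config below.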

    input {
        file {
            type => "csv"
            path => "/Users/philipp/perf_csv_pivot2.csv"
            start_position => "beginning"  # read from the beginning of the file
            sincedb_path => "/dev/null"
        }
    }

    filter {
        csv {
            columns => ["page_id","unloadEventEnd","responseEnd","responseStart","domInteractive","domainLookupEnd","unloadEventStart","domComplete","domContentLoadedEventStart","domainLookupStart","redirectEnd","redirectStart","connectEnd","toJSON","connectStart","loadEventStart","navigationStart","requestStart","secureConnectionStart","fetchStart","domContentLoadedEventEnd","domLoading","loadEventEnd"]
        }

        # split page_id into the page name and the group timestamp
        grok {
            match => { "page_id" => "%{WORD:page}-%{INT:page_ts}" }
            remove_field => [ "page_id", "message", "path" ]
        }

        # every timing field must be an integer for the math in Kibana;
        # toJSON is the lone non-numeric column
        mutate {
            convert => { "unloadEventEnd" => "integer" }
            convert => { "responseEnd" => "integer" }
            convert => { "responseStart" => "integer" }
            convert => { "domInteractive" => "integer" }
            convert => { "domainLookupEnd" => "integer" }
            convert => { "unloadEventStart" => "integer" }
            convert => { "domComplete" => "integer" }
            convert => { "domContentLoadedEventStart" => "integer" }
            convert => { "domainLookupStart" => "integer" }
            convert => { "redirectEnd" => "integer" }
            convert => { "redirectStart" => "integer" }
            convert => { "connectEnd" => "integer" }
            convert => { "toJSON" => "string" }
            convert => { "connectStart" => "integer" }
            convert => { "loadEventStart" => "integer" }
            convert => { "navigationStart" => "integer" }
            convert => { "requestStart" => "integer" }
            convert => { "secureConnectionStart" => "integer" }
            convert => { "fetchStart" => "integer" }
            convert => { "domContentLoadedEventEnd" => "integer" }
            convert => { "domLoading" => "integer" }
            convert => { "loadEventEnd" => "integer" }
        }

        date {
            match => [ "page_ts", "UNIX_MS" ]
            target => "@timestamp"
            remove_field => [ "page_ts", "timestamp", "host", "toJSON" ]
        }
    }

    output {
        elasticsearch { hosts => ["localhost:9200"] }
        stdout { codec => rubydebug }
    }
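For reference, a pivoted input row matching the columns above would look like this (assembled here from the sample group in the question, not taken from the original post):

    ACMEPage-1486643427973,1486643372422,1486643372533,1486643372416,1486643373030,1486643372194,1486643372422,1486643373512,1486643373030,1486643372194,0,0,1486643372194,{},1486643372194,1486643373512,1486643372193,1486643372203,0,1486643372194,1486643373058,1486643372433,1486643373514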

Hope this helps someone,