Apache Drill logRegex plugin

Time: 2019-02-10 22:28:44

Tags: apache-drill

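I am trying to use Apache Drill with the logfile regex format plugin (logRegex), but I cannot get it configured. I used the exact example from https://drill.apache.org/docs/logfile-plugin/, but I get an error when I try to save the configuration in the web UI.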

I tried:

"log" : {
      "type" : "logRegex",
      "extension" : "log",
      "regex" : "(\\d{6})\\s(\\d{2}:\\d{2}:\\d{2})\\s+(\\d+)\\s(\\w+)\\s+(.+)",
      "maxErrors": 10,
      "schema": [
        {
          "fieldName": "eventDate",
          "fieldType": "DATE",
          "format": "yyMMdd"
        },
        {
          "fieldName": "eventTime",
          "fieldType": "TIME",
          "format": "HH:mm:ss"
        },
        {
          "fieldName": "PID",
          "fieldType": "INT"
        },
        {
          "fieldName": "action"
        },
        {
          "fieldName": "query"
        }
      ]
   }

That fragment did not make much sense to me on its own, so I also tried:

{
    "type": "file",
    "enabled": true,
    "connection": "file:///",
    "workspaces": {
      "root": {
        "location": "/user/max/donuts",
        "writable": false,
        "defaultInputFormat": null
       }
    },
    "formats" : {
      "json" : {
        "type" : "json"
      }
    },
"log" : {
      "type" : "logRegex",
      "extension" : "log",
      "regex" : "(\\d{6})\\s(\\d{2}:\\d{2}:\\d{2})\\s+(\\d+)\\s(\\w+)\\s+(.+)",
      "maxErrors": 10,
      "schema": [
        {
          "fieldName": "eventDate",
          "fieldType": "DATE",
          "format": "yyMMdd"
        },
        {
          "fieldName": "eventTime",
          "fieldType": "TIME",
          "format": "HH:mm:ss"
        },
        {
          "fieldName": "PID",
          "fieldType": "INT"
        },
        {
          "fieldName": "action"
        },
        {
          "fieldName": "query"
        }
      ]
   }
  }

Can anyone show me how to configure this plugin correctly?

1 answer:

Answer 0 (score: 1)

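It looks like your JSON config is not valid for Drill. Your "formats" key is closed immediately after the "json" format plugin, so the "log" definition ends up outside "formats", at the top level of the storage plugin. Please double-check the nesting, or try the following: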

{
  "storage":{
    dfs: {
      type: "file",
      connection: "file:///",
      workspaces: {
        "root" : {
          location: "/",
          writable: false,
          allowAccessOutsideWorkspace: false
        },
        "tmp" : {
          location: "/tmp",
          writable: true,
          allowAccessOutsideWorkspace: false
        }
      },
      formats: {
        "log" : {
          "type" : "logRegex",
          "extension" : "log",
          "regex" : "(\\d{6})\\s(\\d{2}:\\d{2}:\\d{2})\\s+(\\d+)\\s(\\w+)\\s+(.+)",
          "maxErrors": 10,
          "schema": [
            {
              "fieldName": "eventDate",
              "fieldType": "DATE",
              "format": "yyMMdd"
            },
            {
              "fieldName": "eventTime",
              "fieldType": "TIME",
              "format": "HH:mm:ss"
            },
            {
              "fieldName": "PID",
              "fieldType": "INT"
            },
            {
              "fieldName": "action"
            },
            {
              "fieldName": "query"
            }
          ]
        }
      }
    }
  }
}
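
The block above uses unquoted keys and a top-level "storage" wrapper, which appears to match the HOCON-style storage-plugins-override.conf file rather than the strict JSON that the web UI's storage page expects. For the web UI, a minimal sketch of the same fix applied to the configuration from the question might look like the following. It keeps the asker's own values (the /user/max/donuts workspace, the regex, the schema) and only moves "log" inside "formats"; whether it saves cleanly is an assumption that depends on a Drill version that ships the logRegex format plugin.

```json
{
  "type": "file",
  "enabled": true,
  "connection": "file:///",
  "workspaces": {
    "root": {
      "location": "/user/max/donuts",
      "writable": false,
      "defaultInputFormat": null
    }
  },
  "formats": {
    "json": {
      "type": "json"
    },
    "log": {
      "type": "logRegex",
      "extension": "log",
      "regex": "(\\d{6})\\s(\\d{2}:\\d{2}:\\d{2})\\s+(\\d+)\\s(\\w+)\\s+(.+)",
      "maxErrors": 10,
      "schema": [
        { "fieldName": "eventDate", "fieldType": "DATE", "format": "yyMMdd" },
        { "fieldName": "eventTime", "fieldType": "TIME", "format": "HH:mm:ss" },
        { "fieldName": "PID", "fieldType": "INT" },
        { "fieldName": "action" },
        { "fieldName": "query" }
      ]
    }
  }
}
```

The only structural difference from the second attempt in the question is that the closing brace of "formats" now comes after the "log" entry instead of before it, so "json" and "log" are siblings inside "formats".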