Uploading Parquet files in Druid

Date: 2018-09-10 13:00:35

Tags: parquet druid

I am new to Druid. I have completed a local setup of Druid and am able to load JSON data files into it.

However, when I try to upload a Parquet file, it throws an unexpected-character exception.

I have installed the Parquet and Avro extensions, and in each case I get the following error:

$> curl -X 'POST' -H 'Content-Type: application/json' -d @examples/wikipedia_hadoop_parquet_job.json http://localhost:8090/druid/indexer/v1/task

<html>
<head>
<meta http-equiv="Content-Type" content="text/html;charset=ISO-8859-1"/>
<title>Error 500 </title>
</head>
<body>
<h2>HTTP ERROR: 500</h2>
<p>Problem accessing /druid/indexer/v1/task. Reason:
<pre>    javax.servlet.ServletException: com.fasterxml.jackson.core.JsonParseException: Unexpected character (&apos;}&apos; (code 125)): was expecting double-quote to start field name
 at [Source: HttpInputOverHTTP@57df7615[c=2032,q=1,[0]=EOF,s=STREAM]; line: 1, column: 493]</pre></p>
<hr /><a href="http://eclipse.org/jetty">Powered by Jetty:// 9.3.19.v20170502</a><hr/>
</body>
</html>
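For reference, the two extensions are enabled through the loadList property in conf/druid/_common/common.runtime.properties (the exact path and any other entries in the list may differ per setup):

# load the Avro and Parquet extensions (names as shipped with Druid's core extensions)
druid.extensions.loadList=["druid-avro-extensions", "druid-parquet-extensions"]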

Below is the JSON config file:

{
  "type": "index_hadoop",
  "spec": {
    "ioConfig": {
      "type": "hadoop",
      "inputSpec": {
        "type": "static",
        "inputFormat": "org.apache.druid.data.input.parquet.DruidParquetInputFormat",
        "paths": "example/wikipedia_list.parquet"
      },
      "metadataUpdateSpec": {
        "type": "postgresql",
        "connectURI": "jdbc:postgresql://localhost/druid",
        "user" : "druid",
        "password" : "asdf",
        "segmentTable": "druid_segments"
      },

    },
    "dataSchema": {
      "dataSource": "wikipedia",
      "parser": {
        "type": "parquet",
        "parseSpec": {
          "format": "timeAndDims",
          "timestampSpec": {
            "column": "timestamp",
            "format": "auto"
          },
          "dimensionsSpec": {
            "dimensions": [
              "page",
              "language",
              "user",
              "unpatrolled"
            ],
            "dimensionExclusions": [],
            "spatialDimensions": []
          }
        }
      },
      "metricsSpec": [{
        "type": "count",
        "name": "count"
      }, {
        "type": "doubleSum",
        "name": "deleted",
        "fieldName": "deleted"
      }, {
        "type": "doubleSum",
        "name": "delta",
        "fieldName": "delta"
      }],
      "granularitySpec": {
        "type": "uniform",
        "segmentGranularity": "DAY",
        "queryGranularity": "NONE",
        "intervals": ["2013-08-30/2013-09-02"]
      }
    },
    "tuningConfig": {
      "type": "hadoop",
      "workingPath": "tmp/working_path",
      "partitionsSpec": {
        "targetPartitionSize": 5000000
      },
      "leaveIntermediate": true
    }
  }
}
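Since the error points at line 1, column 493 of the POSTed body, one way to narrow it down is to syntax-check the file locally before submitting it (a minimal check, assuming a Python interpreter is on the PATH; on failure it prints the offending line and column):

$> python -m json.tool examples/wikipedia_hadoop_parquet_job.json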

I am unable to figure out what the problem is. Let me know if I am missing something.

0 Answers:

No answers