Configuration of Druid Kafka ingestion

Time: 2019-02-05 07:46:53

Tags: apache-kafka druid

I want to set up Kafka ingestion into Druid, but even after configuring common.runtime.properties and adding the druid-kafka-indexing-service extension, it still gives me an error. Can you help me with this? My data is in CSV format.

{
"type": "kafka",
"spec": {
    "dataSchema": {
        "dataSource": "london_crime_by_lsoa",
        "parser": {
            "type": "string",
            "parseSpec": {
                "format": "csv",
                "dimensionsSpec": {
                    "dimensions": [
                        "lsoa_code",
                        "borough",
                        "major_category",
                        "minor_category",
                        {
                            "name": "value",
                            "type": "long"
                        },
                        {
                            "name": "year",
                            "type": "long"
                        },
                        {
                            "name": "month",
                            "type": "long"
                        }
                    ]
                },
                "timestampSpec": {
                    "column": "year",
                    "format": "auto"
                },
                "columns": [
                    "lsoa_code",
                    "borough",
                    "major_category",
                    "minor_category",
                    "value",
                    "year",
                    "month"
                ]
            }
        },
        "metricsSpec": [],
        "granularitySpec": {
            "type": "uniform",
            "segmentGranularity": "year",
            "queryGranularity": "NONE",
            "rollup": false
        }
    },
    "ioConfig": {
        "topic": "london_crime_by_lsoa",
        "taskDuration": "PT10M",
        "useEarliestOffset": "true",
        "consumerProperties": {
            "bootstrap.servers": "localhost:9092"
        }
    },
    "tuningConfig": {
        "type": "kafka",
        "maxRowsPerSegment": 500000
    }
}

}

After running this command:

   curl -XPOST -H'Content-Type: application/json' -d @quickstart/tutorial/crime_supervisor.json http://localhost:8090/druid/indexer/v1/supervisor

I get this error:

{"error":"Instantiation of [simple type, class org.apache.druid.indexing.kafka.supervisor.KafkaSupervisorSpec] value failed: dataSchema"}

1 answer:

Answer 0 (score: 0)

I think there is a problem with the way the spec is laid out in your JSON. You must specify dataSchema (along with ioConfig and tuningConfig) directly at the top level of the JSON, not nested as sub-properties of spec.

Here is the format you should follow:

    {
        "type": "kafka",
        "dataSchema": { ... },
        "ioConfig": { ... },
        "tuningConfig": { ... }
    }
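Applying that advice to the spec from the question gives a supervisor file like the sketch below. All field values are copied from the question; the only other adjustment is writing useEarliestOffset as a boolean rather than the string "true", which matches its documented type:

```json
{
    "type": "kafka",
    "dataSchema": {
        "dataSource": "london_crime_by_lsoa",
        "parser": {
            "type": "string",
            "parseSpec": {
                "format": "csv",
                "timestampSpec": {
                    "column": "year",
                    "format": "auto"
                },
                "dimensionsSpec": {
                    "dimensions": [
                        "lsoa_code",
                        "borough",
                        "major_category",
                        "minor_category",
                        { "name": "value", "type": "long" },
                        { "name": "year", "type": "long" },
                        { "name": "month", "type": "long" }
                    ]
                },
                "columns": [
                    "lsoa_code",
                    "borough",
                    "major_category",
                    "minor_category",
                    "value",
                    "year",
                    "month"
                ]
            }
        },
        "metricsSpec": [],
        "granularitySpec": {
            "type": "uniform",
            "segmentGranularity": "year",
            "queryGranularity": "NONE",
            "rollup": false
        }
    },
    "ioConfig": {
        "topic": "london_crime_by_lsoa",
        "taskDuration": "PT10M",
        "useEarliestOffset": true,
        "consumerProperties": {
            "bootstrap.servers": "localhost:9092"
        }
    },
    "tuningConfig": {
        "type": "kafka",
        "maxRowsPerSegment": 500000
    }
}
```

After saving the corrected file, resubmit it with the same curl command from the question; a successful submission should return a small JSON body containing the supervisor's id.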