Logstash: building grouped term arrays from a relational database

Date: 2018-06-21 16:46:53

Tags: mysql elasticsearch logstash

I have a table in MySQL that I want to import into Elasticsearch.

As an example, the data looks like this:

team   buyer
====   ======
one    Q76876
one    Q66567
one    T99898
two    Q45456
two    S77676

I want to use Logstash to import it into Elasticsearch and create an index whose documents look like this:

{
  "id": "one",
  "team": "one",
  "buyers": ["Q76876", "Q66567", "T99898"]
},
{
  "id": "two",
  "team": "two",
  "buyers": ["Q45456", "S77676"]
}

How should I write the .conf file to achieve this?

1 answer:

Answer 0: (score: 1)

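Logstash reads events from its input and, unless you apply filters, writes them to the index unchanged. Your case looks simple: if you write the SQL query so that it returns the rows already grouped by team, Logstash does not have to do any grouping itself. You only need to point the jdbc input at the database and the query, and send the output to an Elasticsearch index.

For example:

The MySQL query would look something like this (I'm not good at MySQL, so this is only meant to give you the idea; please verify that it works):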

SELECT team as id, 
       team, 
       GROUP_CONCAT(DISTINCT buyer SEPARATOR ', ') as buyers
FROM tablename GROUP BY team

This returns the following (note that buyers comes back as a single comma-separated string, not an array):

+-----+------+------------------------+
| id  | team |         buyers         |
+-----+------+------------------------+
| one | one  | Q76876, Q66567, T99898 |
| two | two  | Q45456, S77676         |
+-----+------+------------------------+

The Logstash configuration would look like this:

input {
  jdbc {
     jdbc_driver_library => "${DATABASE_DRIVER_PATH}"
     jdbc_driver_class => "${DATABASE_DRIVER_CLASS}" # driver class name, not the jar path (for example com.mysql.jdbc.Driver)
     jdbc_connection_string => "${CONNECTIONSTRING}"
     jdbc_user => "${DATABASE_USERNAME}"
     jdbc_password => "${DATABASE_PASSWORD}"
     statement_filepath => "${LOGSTASH_SQL_FILEPATH}" #this will be the sql written above
  }
}

filter {
}

output {
    elasticsearch {
        action => "index"       
        hosts => ["${ELASTICSEARCH_HOST}"]
        user => "${ELASTICSEARCH_USER}"
        password => "${ELASTICSEARCH_PASSWORD}"
        index => "${INDEX_NAME}"       
        document_type => "doc"                      
        document_id => "%{id}"       
    }
    stdout { codec => rubydebug }
    stdout { codec => dots }
}
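
One caveat with the configuration above: GROUP_CONCAT yields one string per team ("Q76876, Q66567, T99898"), so with an empty filter block the buyers field would be indexed as a string rather than the array the question asks for. A minimal sketch of a filter that could replace the empty one, assuming the separator is exactly the ', ' used in the query and that the jdbc input exposes the column under the name buyers:

```
filter {
  mutate {
    # Turn the GROUP_CONCAT string into an array of buyer IDs.
    # The separator must match the SEPARATOR used in the SQL query (', ').
    split => { "buyers" => ", " }
  }
}
```

With this in place, the rubydebug output on stdout should show buyers as a list, which is a quick way to check the result before looking at the index. Keep in mind that MySQL truncates GROUP_CONCAT results at group_concat_max_len (1024 bytes by default), so teams with many buyers may need that setting raised.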