我正在使用logstash jdbc来保持mysql和elasticsearch之间的同步。它适用于一张桌子。但是现在我想为多个表做这件事。我是否需要在终端中打开多个
logstash agent -f /Users/logstash/logstash-jdbc.conf
每个都有一个select查询,或者我们有更好的方法,这样我们可以更新多个表。
我的配置文件
input {
jdbc {
jdbc_driver_library => "/Users/logstash/mysql-connector-java-5.1.39-bin.jar"
jdbc_driver_class => "com.mysql.jdbc.Driver"
jdbc_connection_string => "jdbc:mysql://localhost:3306/database_name"
jdbc_user => "root"
jdbc_password => "password"
schedule => "* * * * *"
statement => "select * from table1"
}
}
output {
elasticsearch {
index => "testdb"
document_type => "table1"
document_id => "%{table_id}"
hosts => "localhost:9200"
}
}
答案 0 :(得分:29)
您绝对可以拥有一个包含多个jdbc
输入的配置,然后在index
输出中对document_type
和elasticsearch
进行参数化,具体取决于事件来自哪个表
input {
jdbc {
jdbc_driver_library => "/Users/logstash/mysql-connector-java-5.1.39-bin.jar"
jdbc_driver_class => "com.mysql.jdbc.Driver"
jdbc_connection_string => "jdbc:mysql://localhost:3306/database_name"
jdbc_user => "root"
jdbc_password => "password"
schedule => "* * * * *"
statement => "select * from table1"
type => "table1"
}
jdbc {
jdbc_driver_library => "/Users/logstash/mysql-connector-java-5.1.39-bin.jar"
jdbc_driver_class => "com.mysql.jdbc.Driver"
jdbc_connection_string => "jdbc:mysql://localhost:3306/database_name"
jdbc_user => "root"
jdbc_password => "password"
schedule => "* * * * *"
statement => "select * from table2"
type => "table2"
}
# add more jdbc inputs to suit your needs
}
output {
elasticsearch {
index => "testdb"
document_type => "%{type}" # <- use the type from each input
hosts => "localhost:9200"
}
}
答案 1 :(得分:5)
这不会创建重复数据。和兼容的logstash 6x。
# YOUR_DATABASE_NAME : test
# FIRST_TABLE : place
# SECOND_TABLE : things
# SET_DATA_INDEX : test_index_1, test_index_2
input {
jdbc {
# The path to our downloaded jdbc driver
jdbc_driver_library => "/mysql-connector-java-5.1.44-bin.jar"
jdbc_driver_class => "com.mysql.jdbc.Driver"
# Postgres jdbc connection string to our database, YOUR_DATABASE_NAME
jdbc_connection_string => "jdbc:mysql://localhost:3306/test"
# The user we wish to execute our statement as
jdbc_user => "root"
jdbc_password => ""
schedule => "* * * * *"
statement => "SELECT @slno:=@slno+1 aut_es_1, es_qry_tbl.* FROM (SELECT * FROM `place`) es_qry_tbl, (SELECT @slno:=0) es_tbl"
type => "place"
add_field => { "queryFunctionName" => "getAllDataFromFirstTable" }
use_column_value => true
tracking_column => "aut_es_1"
}
jdbc {
# The path to our downloaded jdbc driver
jdbc_driver_library => "/mysql-connector-java-5.1.44-bin.jar"
jdbc_driver_class => "com.mysql.jdbc.Driver"
# Postgres jdbc connection string to our database, YOUR_DATABASE_NAME
jdbc_connection_string => "jdbc:mysql://localhost:3306/test"
# The user we wish to execute our statement as
jdbc_user => "root"
jdbc_password => ""
schedule => "* * * * *"
statement => "SELECT @slno:=@slno+1 aut_es_2, es_qry_tbl.* FROM (SELECT * FROM `things`) es_qry_tbl, (SELECT @slno:=0) es_tbl"
type => "things"
add_field => { "queryFunctionName" => "getAllDataFromSecondTable" }
use_column_value => true
tracking_column => "aut_es_2"
}
}
# install uuid plugin 'bin/logstash-plugin install logstash-filter-uuid'
# The uuid filter allows you to generate a UUID and add it as a field to each processed event.
filter {
mutate {
add_field => {
"[@metadata][document_id]" => "%{aut_es_1}%{aut_es_2}"
}
}
uuid {
target => "uuid"
overwrite => true
}
}
output {
stdout {codec => rubydebug}
if [type] == "place" {
elasticsearch {
hosts => "localhost:9200"
index => "test_index_1_12"
#document_id => "%{aut_es_1}"
document_id => "%{[@metadata][document_id]}"
}
}
if [type] == "things" {
elasticsearch {
hosts => "localhost:9200"
index => "test_index_2_13"
document_id => "%{[@metadata][document_id]}"
# document_id => "%{aut_es_2}"
# you can set document_id . otherwise ES will genrate unique id.
}
}
}
答案 2 :(得分:0)
如果您需要在同一进程中运行多个管道,Logstash提供了一种方法,可以通过名为 pipelines.yml 的配置文件并使用多个管道
如果您当前的配置中的事件流不共享相同的输入/过滤器和输出,并且使用标签和条件相互分隔,则使用多个管道特别有用。