我正在尝试将数据从Postgresql服务器迁移到elasticsearch。 postgres数据为JSONB格式。在上河时,出现以下错误。
[INFO ][logstash.agent ] Successfully started Logstash API endpoint {:port=>9600} [2019-01-07T14:22:34,625][INFO ][logstash.inputs.jdbc ] (0.128981s) SELECT to_json(details) from inventory.retailer_products1 limit 1 [2019-01-07T14:22:35,099][WARN ][logstash.inputs.jdbc ] Exception when executing JDBC query {:exception=>#<Sequel::DatabaseError: Java::OrgLogstash::MissingConverterException: Missing Converter handling for full class name=org.postgresql.util.PGobject, simple name=PGobject>} [2019-01-07T14:22:36,568][INFO ][logstash.pipeline ] Pipeline has terminated {:pipeline_id=>"main", :thread=>"#<Thread:0x6067806f run>"}
我认为logstash无法识别JSON数据类型。
下面是我的logstash conf文件
input {
jdbc {
jdbc_connection_string => "jdbc:postgresql://localhost:5432/mydb"
jdbc_user => "postgres"
jdbc_password => "password"
jdbc_validate_connection => true
jdbc_driver_library => "/home/dell5/Downloads/postgresql-9.4.1208.jar"
jdbc_driver_class => "org.postgresql.Driver"
statement => "SELECT to_json(details) from inventory.retailer_products1 limit 1"
}
}
filter{
json{
source => "to_json"
}
}
output {
elasticsearch {
index => "products-retailer"
document_type => "mapping-retailer"
hosts => "localhost"
}
stdout{}
}
为此我定义的映射如下
{
"products-retailer": {
"mappings": {
"mapping-retailer": {
"dynamic": "false",
"properties": {
"category": {
"type": "keyword"
},
"id": {
"type": "keyword"
},
"products": {
"type": "nested",
"properties": {
"barcode": {
"type": "text"
},
"batchno": {
"type": "text"
},
"desc": {
"type": "text"
},
"expirydate": {
"type": "date",
"format": "YYYY-MM-DD"
},
"imageurl": {
"type": "text"
},
"manufaturedate": {
"type": "date",
"format": "YYYY-MM-DD"
},
"mrp": {
"type": "text"
},
"name": {
"type": "text",
"fields": {
"ngrams": {
"type": "text",
"analyzer": "autocomplete"
}
}
},
"openingstock": {
"type": "text"
},
"price": {
"type": "text"
},
"purchaseprice": {
"type": "text"
},
"sku": {
"type": "text"
},
"unit": {
"type": "text"
}
}
},
"retailerid": {
"type": "keyword"
},
"subcategory": {
"type": "keyword"
}
}
}
}
}
}
postgres列中的示例数据如下。它嵌套了我在elasticsearch映射中定义的json。
{
"id": "",
"Category": "Bread and Biscuits",
"products": {
"MRP": "45",
"SKU": "BREAD-1",
"Desc": "Brown Bread",
"Name": "Brown Bread",
"Unit": "Packets",
"Brand": "Britannia",
"Price": "40",
"BarCode": "1234567890",
"BatchNo": "456789",
"ImageUrl": "buscuits.jpeg",
"ExpiryDate": "2019-06-01",
"OpeningStock": "56789",
"PurchasePrice": "30",
"ManufactureDate": "2018-11-01"
},
"RetailerId": "1",
"SubCategory": "Bread"
}
请建议我在这里缺少什么,如果这样做是正确的方法。
我正在使用Elasticsearch 6.5.1。 PostgreSQL 9.5。
答案 0 :(得分:0)
我今天遇到了相同的错误,看来Logstash无法转换Postgres json或jsonb数据字段(PgObject)。
我将对象字段转换为TEXT数据类型,它停止对我大喊大叫并开始提取数据。至于索引是否正确,还有待观察。
如果我有更好的主意,那么我将更新我的回答,如果这是正确的方法。
答案 1 :(得分:0)
PGObject不具有转换来自to_json
方法的json的功能。使用内部转换将jsonobject转换为这样的文本。
从清单.retailer_products1限制1中选择to_json(details):: text。
现在您可以在logstash中解析json字符串。