I'm using Logstash to keep my Elasticsearch in sync with HBase through an API.
Here is my config file:
input {
  elasticsearch {
    hosts => ["<elasticsearch_ip>"]
    index => "<some_name>"
    type => "<some_name>"
    query => '{ "query": {
      "bool": {
        "must_not": [
          {"term": {"synced": true}}
        ]
      }
    } }'
  }
}
filter {
  mutate {
    add_field => { "synced" => true }
  }
}
output {
  if [type] == "<some_name>" {
    http {
      format => "json"
      http_method => "post"
      url => "http://<api-ip>/<endpoint>"
    }
    elasticsearch {
      hosts => [<elasticsearch-ip>]
      action => "update"
      index => "<some_name>"
      document_type => "<some_name>"
      document_id => "%{document_id}"
    }
  }
}
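I want to add a synced field to the documents so that I don't index them into HBase twice. The problem is that %{document_id} is not replaced with the document's actual _id. I think there is no such field, because I tried adding it to the document body with add_field => { "document_id" => "%{document_id}" } and it was not substituted there either. I also tried %{_id} and %{id}, with no luck. What am I doing wrong?
Note: Have I heard of Watcher? Sure, I actually implemented this with it first. But have you heard of its price?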
Answer 0 (score: 0)
You need to set the docinfo flag to true in your elasticsearch input:
elasticsearch {
  hosts => ["<elasticsearch_ip>"]
  index => "<some_name>"
  type => "<some_name>"
  docinfo => true
  query => '{ "query": {
    "bool": {
      "must_not": [
        {"term": {"synced": true}}
      ]
    }
  } }'
}
Then you can use [@metadata][_id] to reference the document's _id in your elasticsearch output.
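As a rough sketch of how the output section from the question might look once the input has docinfo => true: the path [@metadata][_id] assumes the input plugin's default docinfo target of @metadata. Plugin versions running with ECS compatibility enabled use a different default target, so check the path against your version. Everything else is copied from the question's config, with the placeholders left as they were.

```
output {
  if [type] == "<some_name>" {
    http {
      format => "json"
      http_method => "post"
      url => "http://<api-ip>/<endpoint>"
    }
    elasticsearch {
      hosts => ["<elasticsearch_ip>"]
      action => "update"
      index => "<some_name>"
      document_type => "<some_name>"
      # Assumption: docinfo => true with the default target, so the source
      # document's _id lives under @metadata and not in the event body.
      document_id => "%{[@metadata][_id]}"
    }
  }
}
```

Fields under @metadata are not written out with the event, so using the id this way does not add an extra field to the document that is posted to the API or updated in Elasticsearch.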