I am evaluating Esper as a system for lossless processing of billing data. The system is expected to handle around 20,000 events per second and run around 400 statements with continuous aggregation (without keeping the events in memory). To reach the expected throughput I started sending events from multiple threads, and found that Esper frequently loses data.

A simple example that demonstrates the data loss:
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.stream.Collectors;

import com.espertech.esper.client.Configuration;
import com.espertech.esper.client.EPAdministrator;
import com.espertech.esper.client.EPRuntime;
import com.espertech.esper.client.EPServiceProvider;
import com.espertech.esper.client.EPServiceProviderManager;
import com.espertech.esper.client.EPStatement;

public class Example {

    public static void main(String[] args) throws Exception {
        new Example().run();
    }

    public void run() throws Exception {
        // use the default configuration
        Configuration config = new Configuration();
        EPServiceProvider epService = EPServiceProviderManager.getDefaultProvider(config);
        EPAdministrator epAdministrator = epService.getEPAdministrator();

        // simple schema
        epAdministrator.getConfiguration().addEventType(LogLine.class);

        // event type used to terminate a context partition
        createEPL(epAdministrator, "create schema TerminateEvent()");

        // start a context partition on a LogLine event, terminate it on a TerminateEvent
        createEPL(epAdministrator, "create context InitCtx start LogLine end TerminateEvent");

        // statement that counts events and sums bytes per context partition
        EPStatement statement = createEPL(epAdministrator, "context InitCtx select context.id as partition_id, count(*), sum(bytes) from LogLine output last when terminated");

        // register a listener that prints all property values of the new events
        statement.addListener((newEvents, oldEvents) -> {
            String resultEvents = Arrays.stream(newEvents).map((event) -> {
                return Arrays.stream(event.getEventType().getPropertyNames()).map((prop) -> {
                    return prop + "=" + event.get(prop);
                }).collect(Collectors.joining(", "));
            }).collect(Collectors.joining("]; ["));
            System.out.println("=== results: [" + resultEvents + "]");
        });

        // use 4 threads for sending data
        ExecutorService myexecutor = Executors.newFixedThreadPool(4);
        List<CompletableFuture<Void>> listOfTasks = new ArrayList<>();

        // get the data to be processed
        List<LogLine> list = getData();
        for (int i = 1; i <= list.size(); i++) {
            // send each log line concurrently
            final LogLine logLine = list.get(i - 1);
            CompletableFuture<Void> task = CompletableFuture.runAsync(() -> {
                epService.getEPRuntime().sendEvent(logLine);
                System.out.println("== data " + logLine + " was sent");
            }, myexecutor);
            listOfTasks.add(task);
            if (i % 4 == 0) {
                // terminate the context partition after every 4 events
                sendTerminateEvent(listOfTasks, epService.getEPRuntime());
            }
        }
        // terminate the context partition at the end of the run
        sendTerminateEvent(listOfTasks, epService.getEPRuntime());

        // shut down all services
        myexecutor.shutdown();
        epService.destroy();
    }

    private void sendTerminateEvent(List<CompletableFuture<Void>> listOfTasks, EPRuntime epRuntime) throws Exception {
        // wait for all submitted tasks to finish
        CompletableFuture[] array = listOfTasks.toArray(new CompletableFuture[listOfTasks.size()]);
        CompletableFuture.allOf(array).get(1, TimeUnit.MINUTES);
        listOfTasks.clear();
        System.out.println("== sending terminate event.");
        // send the partition-terminating event
        epRuntime.sendEvent(Collections.emptyMap(), "TerminateEvent");
    }

    private List<LogLine> getData() {
        List<LogLine> dataEventsList = new ArrayList<>();
        dataEventsList.add(new LogLine(0, 1));
        dataEventsList.add(new LogLine(0, 2));
        dataEventsList.add(new LogLine(0, 3));
        dataEventsList.add(new LogLine(0, 4));
        dataEventsList.add(new LogLine(0, 5));
        dataEventsList.add(new LogLine(1, 1));
        dataEventsList.add(new LogLine(1, 2));
        dataEventsList.add(new LogLine(1, 3));
        dataEventsList.add(new LogLine(1, 4));
        dataEventsList.add(new LogLine(1, 5));
        return dataEventsList;
    }

    private EPStatement createEPL(EPAdministrator admin, String statement) {
        System.out.println("creating EPL: " + statement);
        return admin.createEPL(statement);
    }

    public static class LogLine {
        int account_id;
        int bytes;

        public LogLine(int account_id, int bytes) {
            this.account_id = account_id;
            this.bytes = bytes;
        }

        public int getAccount_id() {
            return account_id;
        }

        public int getBytes() {
            return bytes;
        }

        @Override
        public String toString() {
            return "[account_id=" + account_id + ", bytes=" + bytes + "]";
        }
    }
}
Execution output:
creating EPL: create schema TerminateEvent()
creating EPL: create context InitCtx start LogLine end TerminateEvent
creating EPL: context InitCtx select context.id as partition_id, count(*), sum(bytes) from LogLine output last when terminated
== data [account_id=0, bytes=3] was sent
== data [account_id=0, bytes=1] was sent
== data [account_id=0, bytes=4] was sent
== data [account_id=0, bytes=2] was sent
== sending terminate event.
=== results: [partition_id=0, count(*)=4, sum(bytes)=10]
== data [account_id=1, bytes=2] was sent
== data [account_id=1, bytes=3] was sent
== data [account_id=0, bytes=5] was sent
== data [account_id=1, bytes=1] was sent
== sending terminate event.
=== results: [partition_id=1, count(*)=2, sum(bytes)=6]
== data [account_id=1, bytes=5] was sent
== data [account_id=1, bytes=4] was sent
== sending terminate event.
=== results: [partition_id=2, count(*)=1, sum(bytes)=4]
The first partition produces the correct result; the next two partitions output invalid results:
// OK
actual [partition_id=0, count(*)=4, sum(bytes)=10]
expected [partition_id=0, count(*)=4, sum(bytes)=10]
// LOSS
actual [partition_id=1, count(*)=2, sum(bytes)=6]
expected [partition_id=1, count(*)=4, sum(bytes)=11]
// LOSS
actual [partition_id=2, count(*)=1, sum(bytes)=4]
expected [partition_id=2, count(*)=2, sum(bytes)=9]
What is wrong with this example code?

Enabling a prioritized execution order does not help either:
creating EPL: create schema TerminateEvent()
creating EPL: @Priority(1) create context InitCtx start LogLine end TerminateEvent
creating EPL: @Priority(0) context InitCtx select context.id as partition_id, count(*), sum(bytes) from LogLine output last when terminated
== data [account_id=0, bytes=3] was sent
== data [account_id=0, bytes=4] was sent
== data [account_id=0, bytes=1] was sent
== data [account_id=0, bytes=2] was sent
== sending terminate event.
=== results: [partition_id=0, count(*)=4, sum(bytes)=10]
== data [account_id=1, bytes=2] was sent
== data [account_id=1, bytes=3] was sent
== data [account_id=0, bytes=5] was sent
== data [account_id=1, bytes=1] was sent
== sending terminate event.
=== results: [partition_id=1, count(*)=2, sum(bytes)=6]
== data [account_id=1, bytes=5] was sent
== data [account_id=1, bytes=4] was sent
== sending terminate event.
=== results: [partition_id=2, count(*)=1, sum(bytes)=4]
Answer 0 (score: 0)
This question is a more complex duplicate of Esper data loss when inbound threading is enabled.

When Esper EPL requires ordered execution, you must develop your code so that events are processed in an ordered fashion. Esper cannot magically enforce some ordering; the JVM can pause any thread at any time. You must design the code correctly.
For example, assume you have two types of events. Assume the A events can be processed in parallel, while the B events must be processed in the order given in the example below.
Assume events come in, and you want B to be processed after A1 and A2 but before A3 and A4:
A1 A2 B1 A3 A4
If you simply add all A and B events to a queue backed by a thread pool of, say, 5 threads, B may be processed first, in the middle, or last. Since the JVM does not enforce an order, every run can produce a different result. Esper cannot enforce an order either, because your application drives Esper and not the other way around.
What you can do, for example, is add the first group of A events (A1, A2) to the queue. When B comes in, wait for the queue to drain, then add B to the queue and wait for B to complete. Then add the next group of A events (A3, A4) to the queue. This way you get ordered processing with respect to A and B, while all A events are still processed in parallel.
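The gating scheme described above could be sketched as follows. This is a hypothetical helper class, not part of the Esper API; the `Runnable`s stand in for sending A and B events to the engine:

```java
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;

// Hypothetical sketch: A events run in parallel on a pool; before a B event
// is processed, the pool is drained so B observes every earlier A.
public class GatedProcessor {
    private final ExecutorService pool = Executors.newFixedThreadPool(4);
    private final List<Future<?>> inFlight = new CopyOnWriteArrayList<>();

    public void submitA(Runnable aEvent) {
        // A events may interleave freely with each other
        inFlight.add(pool.submit(aEvent));
    }

    public void processB(Runnable bEvent) throws Exception {
        // wait for all previously submitted A events to finish
        for (Future<?> f : inFlight) {
            f.get(1, TimeUnit.MINUTES);
        }
        inFlight.clear();
        // B runs on the caller thread, strictly after the drain
        bEvent.run();
    }

    public void shutdown() {
        pool.shutdown();
    }
}
```

The same drain-then-send idea is what the question's `sendTerminateEvent` attempts with `CompletableFuture.allOf`.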
CORRECTION:
I now see that you have only one event type and not A + B. In that case, make sure you are running the latest version. Also make sure that "create context" does not get a lower priority, otherwise the context partition gets created last. I have run your code about 10 times and do not see the invalid output with 7.1.0. I am on JDK 1.8.0_121 (Oracle).
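Since there is only one event stream here, one way to "process events in an ordered fashion" is to give up the racing sender threads and funnel every `sendEvent` call through a single worker thread, so a TerminateEvent can never overtake a LogLine submitted before it. The sketch below is a hypothetical helper, not part of the Esper API; the `Consumer` stands in for `epService.getEPRuntime()::sendEvent`:

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.function.Consumer;

// Hypothetical sketch: a single-threaded executor preserves submission
// order, so events reach the engine exactly in the order send() was called.
public class SerializedSender {
    private final ExecutorService single = Executors.newSingleThreadExecutor();
    private final Consumer<Object> sink;

    public SerializedSender(Consumer<Object> sink) {
        this.sink = sink;
    }

    public void send(Object event) {
        // one worker thread => submission order == processing order
        single.execute(() -> sink.accept(event));
    }

    public void shutdownAndWait() throws InterruptedException {
        single.shutdown();
        single.awaitTermination(1, TimeUnit.MINUTES);
    }
}
```

This trades away parallelism for ordering; to regain throughput you would serialize only the events whose relative order matters (e.g. per partition) rather than the whole stream.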