Question

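I have an application that uses Camel routes to do some basic ETL. Each route is configured to take data from one table, apply some transformations, and safely put it into the same table in a different schema. So there is a one-to-one relationship between Camel routes and tables.

Say I have these two routes: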

from("direct:table_1").routeId(table1Route)
    .setBody(constant("SELECT * FROM table_1"))
    .to("jdbc:source_schema").split(body()).streaming()
    .process("someProcessor")
    .to("sql:INSERT INTO table_1 ... ?dataSource=target_schema");

from("direct:table_2").routeId(table2Route)
    .setBody(constant("SELECT * FROM table_2"))
    .to("jdbc:source_schema").split(body()).streaming()
    .process("someProcessor")
    .to("sql:INSERT INTO table_2 ... ?dataSource=target_schema");

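When I send a "start processing" message to the direct:table_1 and direct:table_2 endpoints, everything runs fine and the data is moved into the target schema.

However, looking at the logs, I can see that the table 2 records only start moving after the table 1 records have finished. This is a definite no-go for my application, because some tables are very large and moving one table at a time takes far too long.

My question is: what am I doing wrong, and how can I fix it so that the data movement happens in parallel?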

Answer 1

I would try something like this:

from("direct:start").multicast().parallelProcessing().to("seda:table1", "seda:table2");

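Basically, I have:

Used multicast to send to multiple recipients, with parallelProcessing to try to send to both endpoints in parallel.
Replaced your direct endpoints with seda endpoints. A direct endpoint runs the target route synchronously in the caller's thread, whereas seda hands the message to a queue that is consumed on a separate thread, so seda is useful if you don't need a synchronous endpoint.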
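To make the difference between a synchronous hand-off and a thread-backed one concrete, here is a minimal plain-Java sketch. It does not use Camel at all; the class name HandoffSketch and the method overlapped are made up for illustration. It submits two jobs and checks whether they were ever in flight at the same time, first on the caller's thread (roughly how direct behaves) and then on worker threads (roughly how seda consumers behave).

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.CyclicBarrier;
import java.util.concurrent.Executor;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicBoolean;

// Hypothetical illustration class; not part of Camel.
class HandoffSketch {

    // Submits two jobs and reports whether both were in flight at the same time.
    static boolean overlapped(Executor executor) throws InterruptedException {
        CyclicBarrier barrier = new CyclicBarrier(2);   // trips only if both jobs wait together
        CountDownLatch finished = new CountDownLatch(2);
        AtomicBoolean met = new AtomicBoolean(false);

        Runnable job = () -> {
            try {
                barrier.await(1, TimeUnit.SECONDS);
                met.set(true);                          // both jobs reached the barrier
            } catch (Exception e) {
                // Timed out or barrier broken: the other job was not running alongside.
            } finally {
                finished.countDown();
            }
        };

        executor.execute(job);
        executor.execute(job);
        finished.await();
        return met.get();
    }

    public static void main(String[] args) throws InterruptedException {
        Executor callerThread = Runnable::run;          // runs each job inline, like a synchronous call
        ExecutorService workers = Executors.newFixedThreadPool(2);
        try {
            System.out.println("caller-thread handoff overlapped: " + overlapped(callerThread));
            System.out.println("worker-thread handoff overlapped: " + overlapped(workers));
        } finally {
            workers.shutdown();
        }
    }
}
```

With the inline executor the second job cannot start until the first one returns, which mirrors what you see in your logs; with two worker threads both jobs run side by side.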

You can also try the .threads() syntax for multithreading.

If you want to compute the table endpoints at runtime, you can replace .multicast() with .recipientList().
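For completeness, something still has to consume from the seda queues. Below is a rough sketch of how the pieces might be wired together, assuming the two direct:table_N routes from the question live in the same CamelContext and that camel-core with the direct and seda components is on the classpath. The class name ParallelEtlRoutes and the direct:start entry point are placeholders I made up, so adapt them to your setup.

```java
import org.apache.camel.builder.RouteBuilder;

// Hypothetical class name; add it to the same CamelContext as the existing routes.
public class ParallelEtlRoutes extends RouteBuilder {

    @Override
    public void configure() {
        // Entry point (placeholder name): fan the trigger out to one SEDA queue per table.
        from("direct:start")
            .multicast().parallelProcessing()
            .to("seda:table1", "seda:table2");

        // Each SEDA endpoint is consumed on its own thread, so these two bridges
        // run independently and call the existing per-table routes from the question.
        from("seda:table1").to("direct:table_1");
        from("seda:table2").to("direct:table_2");
    }
}
```

Bridging to the existing direct routes keeps the ETL logic in one place; each table's route then runs on the thread of its own seda consumer instead of on the thread that sent the trigger message.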

Answer 2

Or, if you are using XML, it can be achieved like this:

<routeContext id="xxxRoute" xmlns="http://camel.apache.org/schema/spring">
    <route id="xxxRouteId">
        <from uri="activemq:queue:{{xxx.queue}}" />
        <multicast parallelProcessing="true">
            <pipeline>
                <to uri="file://?fileExist=Append"></to>
            </pipeline>
            <pipeline>
                <to uri="sql:{{sql.xxxx.insertQuery}}"></to>
            </pipeline>
        </multicast>
    </route>
</routeContext>

Make Camel routes run in parallel
