Reading a CSV file concurrently with Spring Integration

Time: 2014-11-27 13:34:48

Tags: spring-integration

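I want to process a CSV file concurrently using Spring Integration. Each row should be converted into a separate message. So, assuming the CSV file has 10K rows, I want to start 10 threads and have each row handed to one of those threads. It would be great if someone could show me an example.

Thanks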

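2 answers:

Answer 0: (score: 1)

Starting with Spring Integration 4.0, the <splitter> supports an Iterator as the payload to split. So you can convert the inbound File to a LineIterator and, if the output-channel of the <splitter> is an ExecutorChannel, process the message for each line in parallel: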

<splitter input-channel="splitChannel" output-channel="executorChannel"
          expression="T(org.apache.commons.io.FileUtils).lineIterator(payload)"/>
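
As a rough, framework-free illustration of what handing each line to a pool of threads amounts to, here is a minimal plain-JDK sketch. It does not use Spring Integration at all: ParallelLineSketch, processLines, and the upper-casing "work" are hypothetical stand-ins, and the pool size of 10 simply mirrors the number in the question.

```java
import java.io.BufferedReader;
import java.io.StringReader;
import java.util.Iterator;
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

class ParallelLineSketch {

    // Hypothetical helper: reads lines sequentially from the iterator and hands
    // each one to a fixed pool of worker threads. Result order is not guaranteed.
    static List<String> processLines(Iterator<String> lines, int threads) {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        List<String> results = new CopyOnWriteArrayList<>();
        try {
            while (lines.hasNext()) {
                final String line = lines.next();
                // Stand-in for the real per-line work (assumption for illustration).
                pool.execute(() -> results.add(line.toUpperCase()));
            }
        } finally {
            pool.shutdown(); // stop accepting tasks; queued ones still run
        }
        try {
            if (!pool.awaitTermination(1, TimeUnit.MINUTES)) {
                throw new IllegalStateException("Workers did not finish in time");
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("Interrupted while waiting for workers", e);
        }
        return results;
    }

    public static void main(String[] args) {
        // In-memory CSV content instead of a real file, to keep the sketch self-contained.
        BufferedReader reader = new BufferedReader(new StringReader("a,1\nb,2\nc,3"));
        List<String> out = processLines(reader.lines().iterator(), 10);
        System.out.println(out.size() + " lines processed");
    }
}
```

In the Spring Integration setup above, the splitter plays the role of the loop and the ExecutorChannel plays the role of the pool.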

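Answer 1: (score: 1)

I am using Spring Integration 4.1.0 and tried your suggestion, but it does not seem to work for me. I have investigated this a bit today and am now leaning toward it being a Spring Integration 4.1.0 bug.

See whether my explanation makes sense.

If you try this example, you will see that it works (note that this does NOT use your SpEL example):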

<?xml version="1.0" encoding="UTF-8"?>
<beans xmlns="http://www.springframework.org/schema/beans" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:context="http://www.springframework.org/schema/context" xmlns:int="http://www.springframework.org/schema/integration" xmlns:int-file="http://www.springframework.org/schema/integration/file" xmlns:int-stream="http://www.springframework.org/schema/integration/stream" xmlns:task="http://www.springframework.org/schema/task"
    xsi:schemaLocation="http://www.springframework.org/schema/jms http://www.springframework.org/schema/jms/spring-jms.xsd
        http://www.springframework.org/schema/integration http://www.springframework.org/schema/integration/spring-integration.xsd
        http://www.springframework.org/schema/integration/file http://www.springframework.org/schema/integration/file/spring-integration-file.xsd
        http://www.springframework.org/schema/integration/stream http://www.springframework.org/schema/integration/stream/spring-integration-stream.xsd
        http://www.springframework.org/schema/beans http://www.springframework.org/schema/beans/spring-beans.xsd
        http://www.springframework.org/schema/context http://www.springframework.org/schema/context/spring-context.xsd
        http://www.springframework.org/schema/task http://www.springframework.org/schema/task/spring-task.xsd">

    <int:inbound-channel-adapter id="exchangeReplayFileAdapter" ref="exchangeReplayFileReadingMessageSource" method="receive" auto-startup="true" channel="channel1">
        <int:poller fixed-delay="10000000" />
    </int:inbound-channel-adapter>

    <bean id="exchangeReplayFileReadingMessageSource" class="org.springframework.integration.file.FileReadingMessageSource">
        <property name="directory" value="/tmp/inputdir" />
    </bean>

    <int:channel id="channel1">
        <int:dispatcher task-executor="taskExecutor" />
    </int:channel>

    <int:splitter input-channel="channel1" output-channel="channel2">
        <bean class="com.xxx.common.util.springintegration.FileSplitter" />
    </int:splitter>

    <int:channel id="channel2"></int:channel>
    <int-stream:stdout-channel-adapter channel="channel2"></int-stream:stdout-channel-adapter>

    <task:executor id="taskExecutor" pool-size="1" />
</beans>

With this Splitter implementation...

package com.xxx.common.util.springintegration;

import java.io.File;
import java.io.IOException;

import org.slf4j.Logger;
import org.slf4j.LoggerFactory;
import org.springframework.integration.splitter.AbstractMessageSplitter;
import org.springframework.integration.transformer.MessageTransformationException;
import org.springframework.messaging.Message;
import org.springframework.util.Assert;

public class FileSplitter extends AbstractMessageSplitter {
    private static final Logger log = LoggerFactory.getLogger(FileSplitter.class);

    public Object splitMessage(Message<?> message) {
        if (log.isDebugEnabled()) {
            log.debug(message.toString());
        }
        try {

            Object payload = message.getPayload();
            Assert.isInstanceOf(File.class, payload, "Expected java.io.File in the message payload");
            return org.apache.commons.io.FileUtils.lineIterator((File) payload);
        } catch (IOException e) {
            String msg = "Unable to transform file: " + e.getMessage();
            log.error(msg);
            throw new MessageTransformationException(msg, e);
        }
    }

}

Using your SpEL example:

<int:splitter input-channel="exchangeReplayFiles" output-channel="exchangeSpringQueueChannel"  
    expression="T(org.apache.commons.io.FileUtils).lineIterator(payload)"/>

What the parser creates internally is this (note the List.class type passed to the ExpressionEvaluatingMessageProcessor constructor):

/**
 * A Message Splitter implementation that evaluates the specified SpEL
 * expression. The result of evaluation will typically be a Collection or
 * Array. If the result is not a Collection or Array, then the single Object
 * will be returned as the payload of a single reply Message.
 *
 * @author Mark Fisher
 * @author Gary Russell
 * @since 2.0
 */
public class ExpressionEvaluatingSplitter extends AbstractMessageProcessingSplitter {

    @SuppressWarnings({"unchecked", "rawtypes"})
    public ExpressionEvaluatingSplitter(Expression expression) {
        super(new ExpressionEvaluatingMessageProcessor(expression, List.class));
    }

}

The ExpressionEvaluatingMessageProcessor class:

/**
 * A {@link MessageProcessor} implementation that evaluates a SpEL expression
 * with the Message itself as the root object within the evaluation context.
 *
 * @author Mark Fisher
 * @author Artem Bilan
 * @since 2.0
 */
public class ExpressionEvaluatingMessageProcessor<T> extends AbstractMessageProcessor<T> {

    private final Expression expression;

    private final Class<T> expectedType;


  ...
    /**
     * Create an {@link ExpressionEvaluatingMessageProcessor} for the given expression
     * and expected type for its evaluation result.
     * @param expression The expression.
     * @param expectedType The expected type.
     */
    public ExpressionEvaluatingMessageProcessor(Expression expression, Class<T> expectedType) {
        Assert.notNull(expression, "The expression must not be null");
        try {
            this.expression = expression;
            this.expectedType = expectedType;
        }
        catch (ParseException e) {
            throw new IllegalArgumentException("Failed to parse expression.", e);
        }
    }

    /**
     * Processes the Message by evaluating the expression with that Message as the
     * root object. The expression evaluation result Object will be returned.
     * @param message The message.
     * @return The result of processing the message.
     */
    @Override
    public T processMessage(Message<?> message) {
        return this.evaluateExpression(this.expression, message, this.expectedType);
    }
...

}

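What comes back from the provided example ends up being an ArrayList (which implements the Collection interface) containing a single LineIterator element.

ExpressionEvaluatingSplitter is a subclass of AbstractMessageSplitter, and it does not override the handleRequestMessage(Message<?> message) method.
That method looks like this: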

public abstract class AbstractMessageSplitter extends AbstractReplyProducingMessageHandler {
    protected final Object handleRequestMessage(Message<?> message) {
        Object result = this.splitMessage(message);
        // return null if 'null'
        if (result == null) {
            return null;
        }

        Iterator<Object> iterator;
        final int sequenceSize;
        if (result instanceof Collection) {
            Collection<Object> items = (Collection<Object>) result;
            sequenceSize = items.size();
            iterator = items.iterator();
        }
        else if (result.getClass().isArray()) {
            Object[] items = (Object[]) result;
            sequenceSize = items.length;
            iterator = Arrays.asList(items).iterator();
        }
        else if (result instanceof Iterable<?>) {
            sequenceSize = 0;
            iterator = ((Iterable<Object>) result).iterator();
        }
        else if (result instanceof Iterator<?>) {
            sequenceSize = 0;
            iterator = (Iterator<Object>) result;
        }
        else {
            sequenceSize = 1;
            iterator = Collections.singleton(result).iterator();
        }

        if (!iterator.hasNext()) {
            return null;
        }

        final MessageHeaders headers = message.getHeaders();
        final Object correlationId = headers.getId();
        final AtomicInteger sequenceNumber = new AtomicInteger(1);

        return new FunctionIterator<Object, AbstractIntegrationMessageBuilder<?>>(iterator,
                new Function<Object, AbstractIntegrationMessageBuilder<?>>() {
                    @Override
                    public AbstractIntegrationMessageBuilder<?> apply(Object object) {
                        return createBuilder(object, headers, correlationId, sequenceNumber.getAndIncrement(),
                                sequenceSize);
                    }
                });
    }

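Since an ArrayList is indeed a Collection, the code takes the first branch and never reaches the logic that would use the LineIterator itself as the iterator, so next() is never called on the LineIterator in the produceOutput(...) method.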
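
To make the effect of that branch ordering concrete, here is a minimal plain-JDK sketch. It copies only the instanceof ordering from the method shown above (the array and Iterable branches are omitted for brevity) and is not the framework code; SplitDispatchSketch and emittedPayloads are hypothetical names, and a List<String> iterator stands in for the LineIterator.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collection;
import java.util.Collections;
import java.util.Iterator;
import java.util.List;

class SplitDispatchSketch {

    // Returns the payloads that would be emitted for a given split result,
    // following the same Collection-before-Iterator check order as above.
    static List<Object> emittedPayloads(Object result) {
        Iterator<?> iterator;
        if (result instanceof Collection) {
            iterator = ((Collection<?>) result).iterator();
        } else if (result instanceof Iterator) {
            iterator = (Iterator<?>) result;
        } else {
            iterator = Collections.singleton(result).iterator();
        }
        List<Object> payloads = new ArrayList<>();
        while (iterator.hasNext()) {
            payloads.add(iterator.next());
        }
        return payloads;
    }

    public static void main(String[] args) {
        List<String> lines = Arrays.asList("a,1", "b,2", "c,3");

        // A bare Iterator is walked element by element: one payload per line.
        System.out.println(emittedPayloads(lines.iterator()).size());

        // An Iterator wrapped in a List hits the Collection branch first:
        // a single payload is emitted, and that payload is the Iterator itself.
        System.out.println(emittedPayloads(Collections.singletonList(lines.iterator())).size());
    }
}
```

The second call corresponds to the situation described here: once the expression result has been converted to a List, the downstream flow receives one message whose payload is the iterator rather than one message per line.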

So why is the LineIterator coerced into an ArrayList? I believe there is a flaw in ExpressionEvaluatingSplitter, because it always does this:

public ExpressionEvaluatingSplitter(Expression expression) {
    super(new ExpressionEvaluatingMessageProcessor(expression, List.class));
}

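I think that in Spring Integration 4 it should now look at the type the expression evaluates to (either a List or an Iterator) and then call super (this might need some rework, since the type would have to be decided before calling super, which the JVM does not allow).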
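
On the super(...) constraint: Java does not allow statements before the super call, but it does allow a static method call inside the super argument list. The sketch below shows only that language mechanism under heavy assumptions: BaseSplitterSketch, TypeAwareSplitterSketch, and chooseExpectedType are hypothetical, and the string check is a placeholder rule, not how a real fix would determine the type (the result type of a SpEL expression is generally not known until it is evaluated).

```java
import java.util.Iterator;
import java.util.List;

// Hypothetical stand-in for a base class that needs the expected type up front.
class BaseSplitterSketch {
    final Class<?> expectedType;

    BaseSplitterSketch(Class<?> expectedType) {
        this.expectedType = expectedType;
    }
}

class TypeAwareSplitterSketch extends BaseSplitterSketch {

    TypeAwareSplitterSketch(String expression) {
        // No statement may precede super(...), but a static helper can be
        // invoked as part of its argument list.
        super(chooseExpectedType(expression));
    }

    // Placeholder decision rule, for illustration only (assumption).
    static Class<?> chooseExpectedType(String expression) {
        return expression.contains("lineIterator") ? Iterator.class : List.class;
    }

    public static void main(String[] args) {
        TypeAwareSplitterSketch s = new TypeAwareSplitterSketch(
                "T(org.apache.commons.io.FileUtils).lineIterator(payload)");
        System.out.println(s.expectedType.getSimpleName());
    }
}
```

Another option, also only a guess, might be to request Object.class so that no conversion to List happens and the existing instanceof checks see the raw result.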

What do you think?