Spring Integration inbound-channel-adapter to read large files line by line

Date: 2014-11-21 15:26:40

Tags: java spring spring-batch spring-integration

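I'm currently using Spring Integration 4.1.0 with Spring 4.1.2. I need to read a file line by line and use each line read as a message. Basically, I want to allow a "replay" for one of our message sources, but the messages are not saved in individual files; they are all saved in a single file. I have no transactional requirements for this use case. My requirements are similar to this post, except that the file resides on the same server as the one running the JVM: spring integration - read a remote file line by line

As I see it, I have the following options:

1. Use int-file:inbound-channel-adapter to read the file and then "split" it, so that one message becomes multiple messages. Sample config file: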

    <?xml version="1.0" encoding="UTF-8"?>
    <beans xmlns="http://www.springframework.org/schema/beans" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:context="http://www.springframework.org/schema/context" xmlns:int="http://www.springframework.org/schema/integration" xmlns:int-jms="http://www.springframework.org/schema/integration/jms" xmlns:int-file="http://www.springframework.org/schema/integration/file" xmlns:task="http://www.springframework.org/schema/task"
        xsi:schemaLocation="http://www.springframework.org/schema/jms http://www.springframework.org/schema/jms/spring-jms.xsd
            http://www.springframework.org/schema/integration http://www.springframework.org/schema/integration/spring-integration.xsd
            http://www.springframework.org/schema/integration/file http://www.springframework.org/schema/integration/file/spring-integration-file.xsd
            http://www.springframework.org/schema/integration/jms http://www.springframework.org/schema/integration/jms/spring-integration-jms.xsd
            http://www.springframework.org/schema/beans http://www.springframework.org/schema/beans/spring-beans.xsd
            http://www.springframework.org/schema/context http://www.springframework.org/schema/context/spring-context.xsd
            http://www.springframework.org/schema/task http://www.springframework.org/schema/task/spring-task.xsd">

        <int-file:inbound-channel-adapter id="filereader" directory="/tmp" filename-pattern="myfile.txt" channel="channel1"/>
        <int-file:file-to-string-transformer input-channel="channel1" output-channel="channel2"/>
        <int:channel id="channel1"/>
        <int:splitter input-channel="channel2" output-channel="nullChannel"/>
        <int:channel id="channel2"/>
    </beans>

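The problem is that the file is very large. With the technique above, the entire file is first read into memory and only then split, so the JVM runs out of heap space. The steps actually required are: read a line and convert it to a message, send the message, remove the message from memory, repeat.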
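
The read/send/discard loop described above can be sketched in plain Java, independent of Spring. This is only an illustration under assumptions: `LineByLineReader` and `forEachLine` are hypothetical names, the file is assumed to be UTF-8, and Java 8 is assumed for `Consumer` and the lambda.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Arrays;
import java.util.function.Consumer;

// Hypothetical helper, not part of Spring Integration.
final class LineByLineReader {

    /**
     * Reads the file one line at a time and hands each line to the sink.
     * Returns the number of lines delivered.
     */
    static long forEachLine(Path file, Consumer<String> sink) throws IOException {
        long count = 0;
        // try-with-resources closes the reader even if the sink throws
        try (BufferedReader reader = Files.newBufferedReader(file, StandardCharsets.UTF_8)) {
            String line;
            while ((line = reader.readLine()) != null) {
                sink.accept(line); // the "send message" step
                count++;
                // 'line' is overwritten on the next iteration, so the previous
                // line can be garbage-collected once the sink has let go of it
            }
        }
        return count;
    }

    public static void main(String[] args) throws IOException {
        Path file = Files.createTempFile("replay", ".txt");
        Files.write(file, Arrays.asList("msg-1", "msg-2", "msg-3"), StandardCharsets.UTF_8);
        long sent = forEachLine(file, line -> System.out.println("sending: " + line));
        System.out.println(sent + " lines sent");
        Files.delete(file);
    }
}
```

Only the reader's internal buffer and the current line are held at any time, which is the property the splitter configuration above lacks.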

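2. Use int-file:tail-inbound-channel-adapter with end="false" (which basically means read from the beginning of the file). Start and stop this adapter as needed for each file (changing the file name before each start). Sample config file: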

    <?xml version="1.0" encoding="UTF-8"?>
    <beans xmlns="http://www.springframework.org/schema/beans" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:context="http://www.springframework.org/schema/context" xmlns:int="http://www.springframework.org/schema/integration" xmlns:int-jms="http://www.springframework.org/schema/integration/jms" xmlns:int-file="http://www.springframework.org/schema/integration/file" xmlns:task="http://www.springframework.org/schema/task"
        xsi:schemaLocation="http://www.springframework.org/schema/jms http://www.springframework.org/schema/jms/spring-jms.xsd
            http://www.springframework.org/schema/integration http://www.springframework.org/schema/integration/spring-integration.xsd
            http://www.springframework.org/schema/integration/file http://www.springframework.org/schema/integration/file/spring-integration-file.xsd
            http://www.springframework.org/schema/integration/jms http://www.springframework.org/schema/integration/jms/spring-integration-jms.xsd
            http://www.springframework.org/schema/beans http://www.springframework.org/schema/beans/spring-beans.xsd
            http://www.springframework.org/schema/context http://www.springframework.org/schema/context/spring-context.xsd
            http://www.springframework.org/schema/task http://www.springframework.org/schema/task/spring-task.xsd">
    
        <int-file:tail-inbound-channel-adapter id="apache"
            channel="exchangeSpringQueueChannel"
            task-executor="exchangeFileReplayTaskExecutor"
            file="C:\p2-test.txt"
            delay="1"
            end="false"
            reopen="true"
            file-delay="10000" />
    
        <int:channel id="exchangeSpringQueueChannel" />
        <task:executor id="exchangeFileReplayTaskExecutor" pool-size="1" />
    </beans>
    
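3. Call from Spring Integration into Spring Batch and use an ItemReader to process the file. This certainly allows more fine-grained control over the whole process, but it takes a fair amount of work to set up the job repository and so on (and I don't care about the job history, so I'd either tell the job not to log status and/or use the in-memory MapJobRepository).

4. Create my own FileLineByLineInboundChannelAdapter by extending MessageProducerSupport. Much of the code can be borrowed from ApacheCommonsFileTailingMessageProducer (see also http://forum.spring.io/forum/spring-projects/integration/119897-custom-upd-inbound-channel-adapter). Below is a sample, but it needs some work to move the reading into its own Thread so that the stop() command is honored while reading line by line.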

        package com.xxx.exchgateway.common.util.springintegration;
    
        import java.io.BufferedReader;
        import java.io.File;
        import java.io.FileInputStream;
        import java.io.FileNotFoundException;
        import java.io.IOException;
        import java.io.InputStreamReader;
        import org.apache.commons.io.IOUtils;
        import org.springframework.core.task.SimpleAsyncTaskExecutor;
        import org.springframework.core.task.TaskExecutor;
        import org.springframework.integration.core.MessageSource;
        import org.springframework.integration.endpoint.MessageProducerSupport;
        import org.springframework.integration.file.FileHeaders;
        import org.springframework.messaging.Message;
        import org.springframework.util.Assert;
    
        /**
         * A lot of the logic for this class came from
         * {@link org.springframework.integration.file.tail.ApacheCommonsFileTailingMessageProducer}.
         * See <a href="http://forum.spring.io/forum/spring-projects/integration/119897-custom-upd-inbound-channel-adapter">this forum thread</a>.
         */
        public class FileLineByLineInboundChannelAdapter extends MessageProducerSupport implements MessageSource<String> {
            private volatile File file;
    
            /**
             * The name of the file you wish to tail.
             * @param file The absolute path of the file.
             */
            public void setFile(File file) {
                Assert.notNull(file, "'file' cannot be null");
                this.file = file;
            }
    
            protected File getFile() {
                if (this.file == null) {
                    throw new IllegalStateException("No 'file' has been provided");
                }
                return this.file;
            }
    
            @Override
            public String getComponentType() {
                return "file:line-by-line-inbound-channel-adapter";
            }
    
            private void readFile() {
                FileInputStream fstream;
                try {
                    fstream = new FileInputStream(getFile());
    
                    BufferedReader br = new BufferedReader(new InputStreamReader(fstream));
    
                    String strLine;
    
                    // Read File Line By Line, make sure we honor if someone manually sets the isRunning=false (via clicking the stop() method in JMX)
                    while ((strLine = br.readLine()) != null && isRunning()) {
                        send(strLine);
                    }
    
                    //Close the input stream
                    IOUtils.closeQuietly(br);
                    IOUtils.closeQuietly(fstream);
                } catch (FileNotFoundException e) {
                    // TODO Auto-generated catch block
                    e.printStackTrace();
                } catch (IOException e) {
                    // TODO Auto-generated catch block
                    e.printStackTrace();
                }
            }
    
            @Override
            protected void doStart() {
                super.doStart();
    
                // TODO this needs to be moved into its own thread since isRunning() will return "false" until this method has completed
                // and we want to honor the stop() command while we read line-by-line
                readFile();
            }
    
            protected void send(String line) {
                Message<?> message = this.getMessageBuilderFactory().withPayload(line).setHeader(FileHeaders.FILENAME, this.file.getAbsolutePath()).build();
                super.sendMessage(message);
            }
    
            @Override
            public Message<String> receive() {
                // TODO Auto-generated method stub
                return null;
            }
        }
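
The TODO in doStart() (read on a separate thread so that stop() can interrupt the loop) might look roughly like the following. It is a sketch in plain Java rather than a finished adapter: `StoppableLineReader`, `awaitDone` and the UTF-8 charset are assumptions made for illustration, and inside Spring Integration one would more likely hand the work to an injected TaskExecutor and call sendMessage() instead of a `Consumer`.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.function.Consumer;

// Hypothetical stand-alone class; it only illustrates the threading.
// One-shot: after stop() or awaitDone() it cannot be started again.
final class StoppableLineReader {
    private final Path file;
    private final Consumer<String> sink;
    private final ExecutorService executor = Executors.newSingleThreadExecutor();
    private volatile boolean running;

    StoppableLineReader(Path file, Consumer<String> sink) {
        this.file = file;
        this.sink = sink;
    }

    /** Returns immediately; the file is read on the executor thread. */
    void start() {
        running = true; // set before the reader thread first checks it
        executor.execute(this::readFile);
    }

    /** Asks the reader to stop once the line currently being sent is done. */
    void stop() {
        running = false;
        executor.shutdown();
    }

    /** Waits for the reader thread to finish; returns true if it ended in time. */
    boolean awaitDone(long timeout, TimeUnit unit) throws InterruptedException {
        executor.shutdown(); // already-submitted work still runs to completion
        return executor.awaitTermination(timeout, unit);
    }

    private void readFile() {
        try (BufferedReader reader = Files.newBufferedReader(file, StandardCharsets.UTF_8)) {
            String line;
            // check the flag first so no further line is consumed after stop()
            while (running && (line = reader.readLine()) != null) {
                sink.accept(line);
            }
        } catch (IOException e) {
            // surfaces through the thread's uncaught-exception handler
            throw new UncheckedIOException(e);
        } finally {
            running = false;
        }
    }
}
```

Because start() returns before any line is read, the flag is already true when the loop first tests it, and a later stop() takes effect at the next line boundary.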
    

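It seems to me that my use case is not outside the typical range of things people might want to do, so I'm surprised that I can't find an out-of-the-box solution for it. I have searched quite a bit and looked at a lot of examples, and unfortunately have yet to find something that suits my needs.

I assume I may have missed something obvious that the framework already offers (though perhaps this falls into the blurry line between Spring Integration and Spring Batch). Can someone let me know whether I'm totally off base with my idea, whether there is a simple solution I've missed, or offer alternative suggestions?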

2 Answers:

Answer 0 (score: 3)

Spring Integration 4.x has a nice new feature: using an Iterator as the message:

Spring Integration Reference

  

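Starting with version 4.1, the AbstractMessageSplitter supports the Iterator type for the value to split.

This allows an Iterator to be sent as the message without reading the whole file into memory.

Here is a simple example of a Spring context that splits CSV files into one message per line: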

<int-file:inbound-channel-adapter 
        directory="${inputFileDirectory:/tmp}"
        channel="inputFiles"/>

<int:channel id="inputFiles">
    <int:dispatcher task-executor="executor"/>
</int:channel>

<int:splitter 
    input-channel="inputFiles" 
    output-channel="output">
    <bean 
        class="FileSplitter" 
        p:commentPrefix="${commentPrefix:#}" />
</int:splitter>

<task:executor 
    id="executor" 
    pool-size="${poolSize:8}" 
    queue-capacity="${queueCapacity:0}" 
    rejection-policy="CALLER_RUNS" />

<int:channel id="output"/>

Here is the splitter implementation:

import java.io.BufferedReader;
import java.io.File;
import java.io.FileReader;
import java.io.IOException;
import java.util.Iterator;

import org.slf4j.Logger;
import org.slf4j.LoggerFactory;
import org.springframework.integration.splitter.AbstractMessageSplitter;
import org.springframework.integration.transformer.MessageTransformationException;
import org.springframework.messaging.Message;
import org.springframework.util.Assert;

public class FileSplitter extends AbstractMessageSplitter {
    private static final Logger log = LoggerFactory.getLogger(FileSplitter.class);

    private String commentPrefix = "#";

    public Object splitMessage(Message<?> message) {
        if(log.isDebugEnabled()) {
            log.debug(message.toString());
        }
        try {

            Object payload = message.getPayload();
            Assert.isInstanceOf(File.class, payload, "Expected java.io.File in the message payload"); 

            return new BufferedReaderFileIterator((File) payload);
        } 
        catch (IOException e) {
            String msg = "Unable to transform file: " + e.getMessage();
            log.error(msg);
            throw new MessageTransformationException(msg, e);
        }
    }

    public void setCommentPrefix(String commentPrefix) {
        this.commentPrefix = commentPrefix;
    }

    public class BufferedReaderFileIterator implements Iterator<String> {

        private File file;
        private BufferedReader bufferedReader;
        private String line;

        public BufferedReaderFileIterator(File file) throws IOException {
            this.file = file;
            this.bufferedReader = new BufferedReader(new FileReader(file));
            readNextLine();
        }

        @Override
        public boolean hasNext() {
            return line != null;
        }

        @Override
        public String next() {
            try {
                String res = this.line;
                readNextLine();
                return res;
            } 
            catch (IOException e) {
                log.error("Error reading file", e);
                throw new RuntimeException(e);
            }   
        }

        void readNextLine() throws IOException {
            do {
                line = bufferedReader.readLine();
            }
            while(line != null && line.trim().startsWith(commentPrefix));

            if(log.isTraceEnabled()) {
                log.trace("Read next line: {}", line);
            }

            if(line == null) {
                close();
            }
        }

        void close() throws IOException {
            bufferedReader.close();
            file.delete();
        }

        @Override
        public void remove() {
            throw new UnsupportedOperationException();
        }

    }

}

Note the Iterator object returned from the splitMessage() handler method.

Answer 1 (score: 0)
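
As a side note, on Java 8 or later (an assumption; the question does not state a Java version) the JDK itself can supply a lazy Iterator over a file's lines. The sketch below uses hypothetical names (`LazyLines`, `firstDataLines`) and assumes UTF-8. It only shows that such an iterator pulls lines on demand; it is not a drop-in replacement for the splitter above, where closing the stream after iteration would still need to be handled.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.stream.Stream;

// Hypothetical helper for illustration only.
final class LazyLines {

    /** Returns at most 'limit' non-comment lines, reading no further than needed. */
    static List<String> firstDataLines(Path file, String commentPrefix, int limit) throws IOException {
        List<String> result = new ArrayList<>();
        // Files.lines is lazy: lines are read as the iterator is advanced
        try (Stream<String> lines = Files.lines(file, StandardCharsets.UTF_8)) {
            Iterator<String> it = lines
                    .filter(line -> !line.trim().startsWith(commentPrefix))
                    .iterator();
            while (result.size() < limit && it.hasNext()) {
                result.add(it.next());
            }
        } // closing the stream closes the underlying file
        return result;
    }
}
```

For example, `LazyLines.firstDataLines(path, "#", 2)` stops pulling from the file once two data lines have been collected.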

I had this requirement too. I copy the file to another folder and read the data from the file.

fileCopyApplicationContext.xml

<?xml version="1.0" encoding="UTF-8"?>
<beans xmlns="http://www.springframework.org/schema/beans"
    xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:int="http://www.springframework.org/schema/integration"
    xmlns:file="http://www.springframework.org/schema/integration/file"
    xmlns:context="http://www.springframework.org/schema/context" xmlns:p="http://www.springframework.org/schema/p"
    xsi:schemaLocation="http://www.springframework.org/schema/beans
            http://www.springframework.org/schema/beans/spring-beans.xsd
            http://www.springframework.org/schema/integration
            http://www.springframework.org/schema/integration/spring-integration.xsd
            http://www.springframework.org/schema/integration/file
            http://www.springframework.org/schema/integration/file/spring-integration-file.xsd
            http://www.springframework.org/schema/context 
            http://www.springframework.org/schema/context/spring-context.xsd">

    <context:property-placeholder />

    <file:inbound-channel-adapter id="filesIn"
        directory="E:/usmandata/logs/input/" filter="onlyPropertyFiles"
        auto-startup="true">
        <int:poller id="poller" fixed-delay="500" />
    </file:inbound-channel-adapter>



    <int:service-activator input-channel="filesIn"
        output-channel="filesOut" ref="handler" />

    <file:outbound-channel-adapter id="filesOut"
        directory="E:/usmandata/logs/output/" />




    <bean id="handler" class="com.javarticles.spring.integration.file.FileHandler" />
    <bean id="onlyPropertyFiles"
        class="org.springframework.integration.file.config.FileListFilterFactoryBean"
        p:filenamePattern="*.log" />
</beans>

FileHandler.java

package com.javarticles.spring.integration.file;

import java.io.File;
import java.io.IOException;
import java.io.RandomAccessFile;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;

public class FileHandler {
    public File handleFile(File input) throws IOException {
       // System.out.println("Copying file: " + input.getAbsolutePath());


        RandomAccessFile file = new RandomAccessFile(input,"r");

        FileChannel channel = file.getChannel();

        //System.out.println("File size is: " + channel.size());

        ByteBuffer buffer = ByteBuffer.allocate((int) channel.size());

        channel.read(buffer);

        buffer.flip();//Restore buffer to position 0 to read it

        System.out.println("Reading content and printing ... ");

        for (int i = 0; i < channel.size(); i++) {
            System.out.print((char) buffer.get());
        }

        channel.close();
        file.close();
        return input;
    }
}
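
Note that handleFile() above allocates a buffer as large as the whole file, so it runs into the same heap limit the question describes (and the int cast cannot represent sizes above 2 GB). A possible variant reads through a fixed-size buffer instead. The names `ChunkedFileReader` and `countNewlines` are hypothetical, and counting newline bytes merely stands in for whatever per-chunk processing is needed.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

// Hypothetical helper for illustration only.
final class ChunkedFileReader {

    /** Counts '\n' bytes while holding at most chunkSize bytes in memory. */
    static long countNewlines(Path file, int chunkSize) throws IOException {
        if (chunkSize <= 0) {
            throw new IllegalArgumentException("chunkSize must be positive");
        }
        long newlines = 0;
        ByteBuffer buffer = ByteBuffer.allocate(chunkSize);
        try (FileChannel channel = FileChannel.open(file, StandardOpenOption.READ)) {
            while (channel.read(buffer) != -1) {
                buffer.flip(); // switch the buffer from writing to reading
                while (buffer.hasRemaining()) {
                    if (buffer.get() == '\n') {
                        newlines++;
                    }
                }
                buffer.clear(); // reuse the same buffer for the next chunk
            }
        }
        return newlines;
    }
}
```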

SpringIntegrationFileCopyExample.java

package com.javarticles.spring.integration.file;

import java.io.File;
import java.io.FileInputStream;
import java.io.IOException;
import java.util.Properties;

import org.springframework.context.support.ClassPathXmlApplicationContext;

public class SpringIntegrationFileCopyExample {

    public static void main(String[] args) throws InterruptedException, IOException {
        ClassPathXmlApplicationContext context = new ClassPathXmlApplicationContext(
                "fileCopyApplicationContext.xml");

    }

}