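Spring Batch: ignore any line that does not match the specified patterns

Time: 2014-06-17 00:13:24

Tags: java annotations spring-batch

I have a requirement to read a file that contains several different types of input lines, like this: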

*JAMBEG,APP=000007,123456
AC,654321,"ABCD12121212121212",23423423423424234,ABCDD,23423423423424234,2424,XYZ,ABC,TREX,000000002
AC,654321,"ABCD12121212121213",23423423423424234,ABCDD,23423423423424234,2424,XYZ,ABC, TREX,000000002
...
AC,654321,"ABCD12121212121214",23423423423424234,ABCDD,23423423423424234,2424,XYZ,ABC, TREX,000000002
*JAMEND,APP=000007,123456
EOF

I only need the header line and the records that follow it. Lines starting with TREX, *JAMEND, or EOF should be ignored.

Here is my line mapper:

public LineMapper<Customer> lineMapper(){

    DelimitedLineTokenizer lineTokenizerHeader = new DelimitedLineTokenizer();
    lineTokenizerHeader.setNames(new String[]{"association","companyNumber","fileDate"});
    lineTokenizerHeader.setIncludedFields(new int[]{0,1,2});
    lineTokenizerHeader.setStrict(false);

    DelimitedLineTokenizer lineTokenizerBody = new DelimitedLineTokenizer();
    lineTokenizerBody.setNames(new String[]{"type","acNumber","orderNumber"});
    lineTokenizerBody.setIncludedFields(new int[]{0,1,2});
    lineTokenizerBody.setStrict(false);


    HashMap<String, DelimitedLineTokenizer> tokenizers = new HashMap<String, DelimitedLineTokenizer>();
    tokenizers.put("*BEG*", lineTokenizerHeader);
    tokenizers.put("AC*", lineTokenizerBody);

    BeanWrapperFieldSetMapper<Customer> beanWrapperFieldSetMapper = new BeanWrapperFieldSetMapper<Customer>();
    beanWrapperFieldSetMapper.setTargetType(Customer.class);
    beanWrapperFieldSetMapper.setStrict(false);

    HashMap<String, BeanWrapperFieldSetMapper<Customer>> fieldSetMappers = new HashMap<String, BeanWrapperFieldSetMapper<Customer>>();
    fieldSetMappers.put("*BEG*", beanWrapperFieldSetMapper);
    fieldSetMappers.put("AC*", beanWrapperFieldSetMapper);

    PatternMatchingCompositeLineMapper patternMatchingCompositeLineMapper = new PatternMatchingCompositeLineMapper();
    patternMatchingCompositeLineMapper.setTokenizers(tokenizers);
    patternMatchingCompositeLineMapper.setFieldSetMappers(fieldSetMappers);

    return patternMatchingCompositeLineMapper;
}

My obvious mistake is that I have no mapping for the TREX, *JAMEND, and EOF patterns, so the following exception is thrown:

  

2014-06-16 16:49:34,746 [main] DEBUG org.springframework.batch.core.step.item.FaultTolerantChunkProvider - Parsing error at line: 1 in resource=[class path resource [0000123456.csv]], input=[EOF]: org.springframework.batch.item.file.FlatFileParseException
2014-06-16 16:49:34,746 [main] DEBUG org.springframework.batch.core.step.item.FaultTolerantChunkProvider - Skipping failed input
org.springframework.batch.item.file.FlatFileParseException: Parsing error at line: 5 in resource=[class path resource [0000123456.csv]], input=[EOF]
    at org.springframework.batch.item.file.FlatFileItemReader.doRead(FlatFileItemReader.java:183)
    at org.springframework.batch.item.support.AbstractItemCountingItemStreamItemReader.read(AbstractItemCountingItemStreamItemReader.java:83)
    at org.springframework.batch.core.step.item.SimpleChunkProvider.doRead(SimpleChunkProvider.java:91)
    at org.springframework.batch.core.step.item.FaultTolerantChunkProvider.read(FaultTolerantChunkProvider.java:87)
    at org.springframework.batch.core.step.item.SimpleChunkProvider$1.doInIteration(SimpleChunkProvider.java:114)
    at org.springframework.batch.repeat.support.RepeatTemplate.getNextResult(RepeatTemplate.java:368)
    at org.springframework.batch.repeat.support.RepeatTemplate.executeInternal(RepeatTemplate.java:215)
    at org.springframework.batch.repeat.support.RepeatTemplate.iterate(RepeatTemplate.java:144)
    at org.springframework.batch.core.step.item.SimpleChunkProvider.provide(SimpleChunkProvider.java:108)
    at org.springframework.batch.core.step.item.ChunkOrientedTasklet.execute(ChunkOrientedTasklet.java:69)
    at org.springframework.batch.core.step.tasklet.TaskletStep$ChunkTransactionCallback.doInTransaction(TaskletStep.java:402)
    at org.springframework.batch.core.step.tasklet.TaskletStep$ChunkTransactionCallback.doInTransaction(TaskletStep.java:326)
    at org.springframework.transaction.support.TransactionTemplate.execute(TransactionTemplate.java:130)
    at org.springframework.batch.core.step.tasklet.TaskletStep$2.doInChunkContext(TaskletStep.java:267)
    at org.springframework.batch.core.scope.context.StepContextRepeatCallback.doInIteration(StepContextRepeatCallback.java:77)
    at org.springframework.batch.repeat.support.RepeatTemplate.getNextResult(RepeatTemplate.java:368)
    at org.springframework.batch.repeat.support.RepeatTemplate.executeInternal(RepeatTemplate.java:215)
    at org.springframework.batch.repeat.support.RepeatTemplate.iterate(RepeatTemplate.java:144)
    at org.springframework.batch.core.step.tasklet.TaskletStep.doExecute(TaskletStep.java:253)
    at org.springframework.batch.core.step.AbstractStep.execute(AbstractStep.java:198)
    at org.springframework.batch.core.job.SimpleStepHandler.handleStep(SimpleStepHandler.java:148)
    at org.springframework.batch.core.job.AbstractJob.handleStep(AbstractJob.java:386)
    at org.springframework.batch.core.job.SimpleJob.doExecute(SimpleJob.java:135)
    at org.springframework.batch.core.job.AbstractJob.execute(AbstractJob.java:304)
    at org.springframework.batch.core.launch.support.SimpleJobLauncher$1.run(SimpleJobLauncher.java:135)
    at org.springframework.core.task.SyncTaskExecutor.execute(SyncTaskExecutor.java:48)
    at org.springframework.batch.core.launch.support.SimpleJobLauncher.run(SimpleJobLauncher.java:128)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
    at java.lang.reflect.Method.invoke(Method.java:597)
    at org.springframework.aop.support.AopUtils.invokeJoinpointUsingReflection(AopUtils.java:318)
    at org.springframework.aop.framework.ReflectiveMethodInvocation.invokeJoinpoint(ReflectiveMethodInvocation.java:183)
    at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:150)
    at org.springframework.batch.core.configuration.annotation.SimpleBatchConfiguration$PassthruAdvice.invoke(SimpleBatchConfiguration.java:117)
    at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:172)
    at org.springframework.aop.framework.JdkDynamicAopProxy.invoke(JdkDynamicAopProxy.java:202)
    at com.sun.proxy.$Proxy17.run(Unknown Source)
    at com.chofac.pm.batch.CustomerFileToDBJobTest.testLaunchJob(CustomerFileToDBJobTest.java:48)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
    at java.lang.reflect.Method.invoke(Method.java:597)
    at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:45)
    at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:15)
    at org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:42)
    at org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:20)
    at org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:28)
    at org.springframework.test.context.junit4.statements.RunBeforeTestMethodCallbacks.evaluate(RunBeforeTestMethodCallbacks.java:74)
    at org.springframework.test.context.junit4.statements.RunAfterTestMethodCallbacks.evaluate(RunAfterTestMethodCallbacks.java:83)
    at org.springframework.test.context.junit4.statements.SpringRepeat.evaluate(SpringRepeat.java:72)
    at org.springframework.test.context.junit4.SpringJUnit4ClassRunner.runChild(SpringJUnit4ClassRunner.java:231)
    at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:47)
    at org.junit.runners.ParentRunner$3.run(ParentRunner.java:231)
    at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:60)
    at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:229)
    at org.junit.runners.ParentRunner.access$000(ParentRunner.java:50)
    at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:222)
    at org.springframework.test.context.junit4.statements.RunBeforeTestClassCallbacks.evaluate(RunBeforeTestClassCallbacks.java:61)
    at org.springframework.test.context.junit4.statements.RunAfterTestClassCallbacks.evaluate(RunAfterTestClassCallbacks.java:71)
    at org.junit.runners.ParentRunner.run(ParentRunner.java:300)
    at org.springframework.test.context.junit4.SpringJUnit4ClassRunner.run(SpringJUnit4ClassRunner.java:174)
    at org.eclipse.jdt.internal.junit4.runner.JUnit4TestReference.run(JUnit4TestReference.java:50)
    at org.eclipse.jdt.internal.junit.runner.TestExecution.run(TestExecution.java:38)
    at org.eclipse.jdt.internal.junit.runner.RemoteTestRunner.runTests(RemoteTestRunner.java:467)
    at org.eclipse.jdt.internal.junit.runner.RemoteTestRunner.runTests(RemoteTestRunner.java:683)
    at org.eclipse.jdt.internal.junit.runner.RemoteTestRunner.run(RemoteTestRunner.java:390)
    at org.eclipse.jdt.internal.junit.runner.RemoteTestRunner.main(RemoteTestRunner.java:197)
Caused by: java.lang.IllegalStateException: Could not find a matching pattern for key=[EOF]
    at org.springframework.batch.support.PatternMatcher.match(PatternMatcher.java:226)
    at org.springframework.batch.item.file.mapping.PatternMatchingCompositeLineMapper.mapLine(PatternMatchingCompositeLineMapper.java:62)
    at org.springframework.batch.item.file.FlatFileItemReader.doRead(FlatFileItemReader.java:180)
    ... 67 more
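
The root cause at the bottom of the trace (no pattern for the key EOF) can be reproduced without Spring Batch. The sketch below is a simplified stand-in for wildcard matching, assuming '*' matches any run of characters and '?' a single character. GlobDemo, toRegex, and matchesAny are hypothetical names made up for this illustration, not Spring Batch API.

```java
import java.util.Arrays;
import java.util.List;
import java.util.regex.Pattern;

// Hypothetical helper, not part of Spring Batch: a simplified stand-in for
// wildcard matching ('*' = any run of characters, '?' = one character).
final class GlobDemo {

    // Translate a wildcard pattern into an equivalent regular expression.
    static Pattern toRegex(String glob) {
        StringBuilder regex = new StringBuilder();
        for (char c : glob.toCharArray()) {
            if (c == '*') {
                regex.append(".*");
            } else if (c == '?') {
                regex.append('.');
            } else {
                // Quote everything else so it is matched literally.
                regex.append(Pattern.quote(String.valueOf(c)));
            }
        }
        return Pattern.compile(regex.toString());
    }

    // True when at least one of the configured patterns covers the whole line.
    static boolean matchesAny(List<String> globs, String line) {
        for (String glob : globs) {
            if (toRegex(glob).matcher(line).matches()) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        // The two keys registered in the line mapper from the question.
        List<String> configured = Arrays.asList("*BEG*", "AC*");
        String[] lines = {
            "*JAMBEG,APP=000007,123456",
            "AC,654321,\"ABCD12121212121212\",23423423423424234",
            "*JAMEND,APP=000007,123456",
            "EOF"
        };
        for (String line : lines) {
            System.out.println(matchesAny(configured, line) + "  " + line);
        }
    }
}
```

Run as is, the first two lines report true and the last two report false, which corresponds to the lines the mapper has no tokenizer for.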

I looked at many examples, and this one came closest. I changed my step accordingly, but the problem is still the same.

@Bean
public Step step(){
    return stepBuilders.get("step")
    .<Customer,Customer>chunk(1)
    .reader(CustomerAUFileReader())
    .faultTolerant()
    .skipLimit(3)
    .skip(Exception.class)
    .processor(CustomerRecordProcessor())
    .writer(CustomerDBWriter())
    .listener(logProcessListener())
    .build();
}

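I also followed the Spring.io docs here on skipping records (5.1.5 Configuring Skip Logic), but that did not work either.

Please let me know the ideal way to solve this. Is there a simple way to specify that records which do not match any of the given patterns should be skipped? Please advise. Thanks.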

---

I have a pattern mapper for '*', which I map to a dummy class. I return null in the process phase, but it throws a NullPointerException.

Stack trace:

2014-06-17 10:03:01,690 [main] DEBUG org.springframework.batch.core.step.item.FaultTolerantChunkProcessor - Skipping after failed process
org.springframework.batch.core.listener.StepListenerFailedException: Error in afterProcess.
    at org.springframework.batch.core.listener.MulticasterBatchListener.afterProcess(MulticasterBatchListener.java:136)
    at org.springframework.batch.core.step.item.SimpleChunkProcessor.doProcess(SimpleChunkProcessor.java:127)
    at org.springframework.batch.core.step.item.FaultTolerantChunkProcessor$1.doWithRetry(FaultTolerantChunkProcessor.java:225)
    at org.springframework.retry.support.RetryTemplate.doExecute(RetryTemplate.java:263)
    at org.springframework.retry.support.RetryTemplate.execute(RetryTemplate.java:193)
    at org.springframework.batch.core.step.item.BatchRetryTemplate.execute(BatchRetryTemplate.java:217)
    at org.springframework.batch.core.step.item.FaultTolerantChunkProcessor.transform(FaultTolerantChunkProcessor.java:290)
    at org.springframework.batch.core.step.item.SimpleChunkProcessor.process(SimpleChunkProcessor.java:192)
    at org.springframework.batch.core.step.item.ChunkOrientedTasklet.execute(ChunkOrientedTasklet.java:75)
    at org.springframework.batch.core.step.tasklet.TaskletStep$ChunkTransactionCallback.doInTransaction(TaskletStep.java:402)
    at org.springframework.batch.core.step.tasklet.TaskletStep$ChunkTransactionCallback.doInTransaction(TaskletStep.java:326)
    at org.springframework.transaction.support.TransactionTemplate.execute(TransactionTemplate.java:130)
    at org.springframework.batch.core.step.tasklet.TaskletStep$2.doInChunkContext(TaskletStep.java:267)
    at org.springframework.batch.core.scope.context.StepContextRepeatCallback.doInIteration(StepContextRepeatCallback.java:77)
    at org.springframework.batch.repeat.support.RepeatTemplate.getNextResult(RepeatTemplate.java:368)
    at org.springframework.batch.repeat.support.RepeatTemplate.executeInternal(RepeatTemplate.java:215)
    at org.springframework.batch.repeat.support.RepeatTemplate.iterate(RepeatTemplate.java:144)
    at org.springframework.batch.core.step.tasklet.TaskletStep.doExecute(TaskletStep.java:253)
    at org.springframework.batch.core.step.AbstractStep.execute(AbstractStep.java:198)
    at org.springframework.batch.core.job.SimpleStepHandler.handleStep(SimpleStepHandler.java:148)
    at org.springframework.batch.core.job.AbstractJob.handleStep(AbstractJob.java:386)
    at org.springframework.batch.core.job.SimpleJob.doExecute(SimpleJob.java:135)
    at org.springframework.batch.core.job.AbstractJob.execute(AbstractJob.java:304)
    at org.springframework.batch.core.launch.support.SimpleJobLauncher$1.run(SimpleJobLauncher.java:135)
    at org.springframework.core.task.SyncTaskExecutor.execute(SyncTaskExecutor.java:48)
    at org.springframework.batch.core.launch.support.SimpleJobLauncher.run(SimpleJobLauncher.java:128)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
    at java.lang.reflect.Method.invoke(Method.java:597)
    at org.springframework.aop.support.AopUtils.invokeJoinpointUsingReflection(AopUtils.java:318)
    at org.springframework.aop.framework.ReflectiveMethodInvocation.invokeJoinpoint(ReflectiveMethodInvocation.java:183)
    at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:150)
    at org.springframework.batch.core.configuration.annotation.SimpleBatchConfiguration$PassthruAdvice.invoke(SimpleBatchConfiguration.java:117)
    at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:172)
    at org.springframework.aop.framework.JdkDynamicAopProxy.invoke(JdkDynamicAopProxy.java:202)
    at com.sun.proxy.$Proxy17.run(Unknown Source)
    at com.chofac.pl.batch.CustomerFileToDBJobTest.testLaunchJob(CustomerFileToDBJobTest.java:48)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
    at java.lang.reflect.Method.invoke(Method.java:597)
    at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:45)
    at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:15)
    at org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:42)
    at org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:20)
    at org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:28)
    at org.springframework.test.context.junit4.statements.RunBeforeTestMethodCallbacks.evaluate(RunBeforeTestMethodCallbacks.java:74)
    at org.springframework.test.context.junit4.statements.RunAfterTestMethodCallbacks.evaluate(RunAfterTestMethodCallbacks.java:83)
    at org.springframework.test.context.junit4.statements.SpringRepeat.evaluate(SpringRepeat.java:72)
    at org.springframework.test.context.junit4.SpringJUnit4ClassRunner.runChild(SpringJUnit4ClassRunner.java:231)
    at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:47)
    at org.junit.runners.ParentRunner$3.run(ParentRunner.java:231)
    at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:60)
    at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:229)
    at org.junit.runners.ParentRunner.access$000(ParentRunner.java:50)
    at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:222)
    at org.springframework.test.context.junit4.statements.RunBeforeTestClassCallbacks.evaluate(RunBeforeTestClassCallbacks.java:61)
    at org.springframework.test.context.junit4.statements.RunAfterTestClassCallbacks.evaluate(RunAfterTestClassCallbacks.java:71)
    at org.junit.runners.ParentRunner.run(ParentRunner.java:300)
    at org.springframework.test.context.junit4.SpringJUnit4ClassRunner.run(SpringJUnit4ClassRunner.java:174)
    at org.eclipse.jdt.internal.junit4.runner.JUnit4TestReference.run(JUnit4TestReference.java:50)
    at org.eclipse.jdt.internal.junit.runner.TestExecution.run(TestExecution.java:38)
    at org.eclipse.jdt.internal.junit.runner.RemoteTestRunner.runTests(RemoteTestRunner.java:467)
    at org.eclipse.jdt.internal.junit.runner.RemoteTestRunner.runTests(RemoteTestRunner.java:683)
    at org.eclipse.jdt.internal.junit.runner.RemoteTestRunner.run(RemoteTestRunner.java:390)
    at org.eclipse.jdt.internal.junit.runner.RemoteTestRunner.main(RemoteTestRunner.java:197)
Caused by: java.lang.NullPointerException
    at com.chofac.pl.batch.CustomerItemProcessListener.afterProcess(CustomerItemProcessListener.java:13)
    at org.springframework.batch.core.listener.CompositeItemProcessListener.afterProcess(CompositeItemProcessListener.java:60)
    at org.springframework.batch.core.listener.MulticasterBatchListener.afterProcess(MulticasterBatchListener.java:133)

5 answers:

Answer 0 (score: 5)

One approach:

Have the ItemReader read just one line and return it as is. The item handed over by the reader is then a plain String.

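Write a simple ItemProcessor that essentially does the work of the LineMapper based on the pattern: if the item matches the pattern, convert the input string to a Customer and return it. If the pattern does not match, simply return null, and the item is skipped.

Pseudocode for the item processor:

class CustomPatternMatchingItemProcessor implements ItemProcessor<String, Customer> {

    // Regular expression a line must match to be kept
    private String pattern;

    public void setPattern(String pattern) {
        this.pattern = pattern;
    }

    @Override
    public Customer process(String s) {
        if (s.matches(pattern)) {
            // construct the Customer object based on s (field mapping omitted)
            Customer customer = new Customer();
            return customer;
        } else {
            // returning null filters the item out
            return null;
        }
    }
}

Even cleaner: have one processor do the mapping from String to Customer, and another processor do the regex-based validation of the string.

Use CompositeItemProcessor to chain your processors. That gives a better separation of concerns between the processors.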
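
A minimal sketch of the two-processor split, under stated assumptions. To keep it free of framework dependencies, LineProcessor is a local stand-in with the same shape as Spring Batch's ItemProcessor, and Chain imitates what a composite processor does: it stops as soon as a step returns null. RegexFilter, FirstThreeFields, Chain, and ChainDemo are hypothetical names, and the regex is only a guess at what "header or AC record" means for the sample file.

```java
import java.util.Arrays;
import java.util.List;

// Local stand-in shaped like Spring Batch's ItemProcessor (assumption: declared
// here only so the sketch compiles without the framework on the classpath).
interface LineProcessor<I, O> {
    O process(I item);
}

// Step 1: validation only. Returns the line unchanged, or null to filter it out.
final class RegexFilter implements LineProcessor<String, String> {
    private final String regex;

    RegexFilter(String regex) {
        this.regex = regex;
    }

    @Override
    public String process(String line) {
        return line.matches(regex) ? line : null;
    }
}

// Step 2: mapping only. Keeps at most the first three comma-separated fields.
final class FirstThreeFields implements LineProcessor<String, List<String>> {
    @Override
    public List<String> process(String line) {
        List<String> all = Arrays.asList(line.split(","));
        return all.subList(0, Math.min(3, all.size()));
    }
}

// Chains two steps; a null from the first step short-circuits the second.
final class Chain<A, B, C> implements LineProcessor<A, C> {
    private final LineProcessor<A, B> first;
    private final LineProcessor<B, C> second;

    Chain(LineProcessor<A, B> first, LineProcessor<B, C> second) {
        this.first = first;
        this.second = second;
    }

    @Override
    public C process(A item) {
        B intermediate = first.process(item);
        return intermediate == null ? null : second.process(intermediate);
    }
}

final class ChainDemo {
    // Keep only lines that start with "*JAMBEG," or "AC,".
    static LineProcessor<String, List<String>> build() {
        return new Chain<>(new RegexFilter("^(\\*JAMBEG|AC),.*"), new FirstThreeFields());
    }

    public static void main(String[] args) {
        LineProcessor<String, List<String>> processor = build();
        for (String line : new String[] {"*JAMBEG,APP=000007,123456", "EOF"}) {
            System.out.println(line + " -> " + processor.process(line));
        }
    }
}
```

In a real job the same roles would be played by two ItemProcessor implementations wired into a CompositeItemProcessor; the point of the split is that the filter can be tested and reused independently of the mapping.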

Answer 1 (score: 1)

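Your intent is not to skip objects because of an error, but to skip records based on logic. I think your best option is to bind a mapper to '*' that returns a custom object (such as SkippableRecordBean) instead of a Customer, and to filter out the unwanted beans in an ItemProcessor.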
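
A rough sketch of that idea in plain Java, with the framework left out so that it runs on its own. FileRecord, CustomerRecord, CatchAllDemo, and their methods are hypothetical names for this illustration; mapLine plays the role of the line mapper with a final catch-all branch, and process plays the role of the ItemProcessor, where returning null means the item is filtered out.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Common supertype so one mapper can return either kind of record.
interface FileRecord {}

// Stand-in for the real Customer bean (assumption: only the record type is kept).
final class CustomerRecord implements FileRecord {
    final String type;

    CustomerRecord(String type) {
        this.type = type;
    }
}

// Marker returned for every line that only the catch-all '*' pattern covers.
final class SkippableRecordBean implements FileRecord {
    static final SkippableRecordBean INSTANCE = new SkippableRecordBean();

    private SkippableRecordBean() {}
}

final class CatchAllDemo {

    // Line mapper role: specific prefixes first, catch-all last, never throws.
    static FileRecord mapLine(String line) {
        if (line.startsWith("*JAMBEG,") || line.startsWith("AC,")) {
            return new CustomerRecord(line.substring(0, line.indexOf(',')));
        }
        return SkippableRecordBean.INSTANCE; // the '*' branch
    }

    // Processor role: null tells the step to drop the item before the writer.
    static CustomerRecord process(FileRecord record) {
        return record instanceof CustomerRecord ? (CustomerRecord) record : null;
    }

    // Runs every line through both roles and collects the surviving record types.
    static List<String> keptTypes(List<String> lines) {
        List<String> kept = new ArrayList<>();
        for (String line : lines) {
            CustomerRecord customer = process(mapLine(line));
            if (customer != null) {
                kept.add(customer.type);
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        List<String> lines = Arrays.asList(
            "*JAMBEG,APP=000007,123456",
            "AC,654321,X",
            "*JAMEND,APP=000007,123456",
            "EOF");
        System.out.println(keptTypes(lines));
    }
}
```

Because every line maps to something, the reader never raises a parse error, so no skip limit is consumed by lines that are merely uninteresting.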

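Answer 2 (score: 1)

I had a similar problem: the files I am sent always contain some lines that do not fit the expected pattern. Since I simply want to ignore those lines, I log them and then skip them.

You can implement org.springframework.batch.repeat.exception.ExceptionHandler like this: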

class LocalExceptionHandler implements ExceptionHandler {

    @Override
    public void handleException(RepeatContext rc, Throwable throwable) throws Throwable {
        if (throwable instanceof FlatFileParseException) {
            FlatFileParseException fe = (FlatFileParseException)throwable;
            log.error("!!!! FlatFileParseException, line # is: " + fe.getLineNumber());
            log.error("!!!! FlatFileParseException, input is: " + fe.getInput());
        }
        log.error("!!!! Message : " + throwable.getMessage());
        log.error("!!!! Cause : " + throwable.getCause());      
    }
}

Then add it to your step builder:

faultTolerantStepBuilder.exceptionHandler(new LocalExceptionHandler());
faultTolerantStepBuilder.skipLimit(100);

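Answer 3 (score: 0)

We started returning a dummy object from the LineMapper instead of null, because null causes the reader to skip reading the remaining lines. The write method of the ItemWriter still receives this dummy object, so you need to check for it before processing. We ignore the dummy records in the ItemWriter.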

Answer 4 (score: 0)

public class FileVerificationSkipper implements SkipPolicy {

    @Autowired
    private Environment environment;

    @Override
    public boolean shouldSkip(Throwable exception, int skipCount) throws SkipLimitExceededException {
        int skipErrorsRecords = Integer.valueOf(environment.getProperty("max.error.record.count"));
        if (exception instanceof FileNotFoundException) {
            return false;
        } else if (exception instanceof FlatFileParseException && (skipErrorsRecords < 0 || skipCount <= (skipErrorsRecords-1))) {
            FlatFileParseException ffpe = (FlatFileParseException) exception;
            StringBuilder errorMessage = new StringBuilder();
            errorMessage.append((skipCount+1)+", An error occurred while processing the " + ffpe.getLineNumber() + " record of the file. The faulty record is:\n");
            errorMessage.append(ffpe.getInput() + "\n");
            return true;
        } else {
            return false;
        }
    }
}

The step builder is:

@Bean
public Step step1() throws Exception{
    return stepBuilderFactory.get("step1")
            .<User, User> chunk(50)
            .reader(reader())
            .faultTolerant()
            .skipPolicy(fileVerificationSkipper())
            .processor(processor())
            .writer(writer())
            .build();
}