FlatFileItemReader tab delimiter not working

Time: 2018-08-08 04:39:09

Tags: java spring spring-batch

I checked out this project from Spring: https://github.com/spring-guides/gs-batch-processing

Source: https://spring.io/guides/gs/batch-processing/

I replaced every ',' in 'sample-data.csv' with a tab character:

Jill    Doe
Joe Doe
Justin  Doe
Jane    Doe
John    Doe

Then I added the new delimiter to the reader:

@Bean
public FlatFileItemReader<Person> reader() {
    return new FlatFileItemReaderBuilder<Person>()
        .name("personItemReader")
        .resource(new ClassPathResource("sample-data.csv"))
        .delimited()
        .delimiter(DelimitedLineTokenizer.DELIMITER_TAB) // NEW DELIMITER
        .names(new String[]{"firstName", "lastName"})
        .fieldSetMapper(new BeanWrapperFieldSetMapper<Person>() {{
            setTargetType(Person.class);
        }})
        .build();
}

On startup I get this error:

Caused by: org.springframework.batch.item.file.transform.IncorrectTokenCountException: Incorrect number of tokens found in record: expected 2 actual 1
    at org.springframework.batch.item.file.transform.AbstractLineTokenizer.tokenize(AbstractLineTokenizer.java:142) ~[spring-batch-infrastructure-4.0.1.RELEASE.jar:4.0.1.RELEASE]
    at org.springframework.batch.item.file.mapping.DefaultLineMapper.mapLine(DefaultLineMapper.java:43) ~[spring-batch-infrastructure-4.0.1.RELEASE.jar:4.0.1.RELEASE]
    at org.springframework.batch.item.file.FlatFileItemReader.doRead(FlatFileItemReader.java:180) ~[spring-batch-infrastructure-4.0.1.RELEASE.jar:4.0.1.RELEASE]
    ... 50 common frames omitted

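I tried '@' as the delimiter and it works. For some reason I cannot get it to work with the tab delimiter...

Of course, in my real project the input file is tab-delimited...

Is there a solution for this?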

2 Answers:

Answer 0: (score: 2)

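You cannot set a tab delimiter this way. The tab character ('\t') is whitespace and contains no actual text, so the static DelimitedBuilder class in FlatFileItemReaderBuilder.java ignores it when it builds the DelimitedLineTokenizer. Any non-whitespace delimiter can be set with the code given in the question.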
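The difference between "has length" and "has text" can be sketched in plain Java. The class and method names below are hypothetical and only approximate the documented behavior of Spring's StringUtils.hasText and StringUtils.hasLength; they are not the library's own code.

```java
// Sketch only: approximates the documented behavior of Spring's
// StringUtils.hasText / StringUtils.hasLength without depending on Spring.
class HasTextSketch {

    // True when the string has at least one character; whitespace counts.
    static boolean hasLength(String s) {
        return s != null && !s.isEmpty();
    }

    // True only when the string has at least one non-whitespace character.
    static boolean hasText(String s) {
        if (s == null) {
            return false;
        }
        for (int i = 0; i < s.length(); i++) {
            if (!Character.isWhitespace(s.charAt(i))) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        System.out.println(hasText("@"));    // true: '@' is not whitespace
        System.out.println(hasText("\t"));   // false: a tab is whitespace only
        System.out.println(hasLength("\t")); // true: the string is not empty
    }
}
```

A guard based on a has-text style check therefore accepts '@' but silently skips a tab, which matches the behavior described in the question.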

FlatFileItemReaderBuilder source code

This is how the LineTokenizer instance is built in FlatFileItemReaderBuilder.java.

public DelimitedLineTokenizer build() {
    Assert.notNull(this.fieldSetFactory, "A FieldSetFactory is required.");
    Assert.notEmpty(this.names, "A list of field names is required");

    DelimitedLineTokenizer tokenizer = new DelimitedLineTokenizer();

    tokenizer.setNames(this.names.toArray(new String[this.names.size()]));

    // hasText returns false for a tab, so the tab delimiter is ignored.
    if (StringUtils.hasText(this.delimiter)) {
        tokenizer.setDelimiter(this.delimiter);
    }
    // more code
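To see why the skipped setter produces "expected 2 actual 1", here is a simplified plain-Java sketch. It assumes the tokenizer keeps its default comma delimiter when the guard rejects the requested one. The class and helper names are hypothetical, and the real DelimitedLineTokenizer also handles quoting, which this sketch leaves out.

```java
// Simplified sketch, not Spring Batch code: shows the token count that results
// when a whitespace-only delimiter is dropped and the comma default stays in place.
class TokenCountSketch {

    // Hypothetical helper mirroring the builder's guard: the requested delimiter
    // is used only if it contains a non-whitespace character.
    static String effectiveDelimiter(String requested) {
        boolean hasText = requested != null && !requested.trim().isEmpty();
        return hasText ? requested : ",";
    }

    // Hypothetical helper: splits on a literal delimiter, keeping empty fields.
    static String[] split(String line, String delimiter) {
        return line.split(java.util.regex.Pattern.quote(delimiter), -1);
    }

    public static void main(String[] args) {
        // Tab is rejected, so the line is split on ',' and stays in one piece.
        System.out.println(split("Jill\tDoe", effectiveDelimiter("\t")).length); // 1

        // '@' passes the guard, so the line splits into two fields.
        System.out.println(split("Jill@Doe", effectiveDelimiter("@")).length);   // 2
    }
}
```

With two field names configured and only one token found, the tokenizer reports the mismatch seen in the stack trace.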

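So, to fix this, you need to provide a bean of type DelimitedLineTokenizer that is explicitly configured with the tab delimiter.

Use the following code in your Spring configuration class to set the tab delimiter: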

@Bean
public FlatFileItemReader<Person> reader() {
    return new FlatFileItemReaderBuilder<Person>().name("personItemReader")
            .resource(new ClassPathResource("sample-data.csv"))
            .lineMapper(lineMapper()).build();
}

@Bean
public DefaultLineMapper<Person> lineMapper() {
    DefaultLineMapper<Person> lineMapper = new DefaultLineMapper<>();
    lineMapper.setLineTokenizer(lineTokenizer());
    lineMapper.setFieldSetMapper(new BeanWrapperFieldSetMapper<Person>() {{
        setTargetType(Person.class);
    }});
    return lineMapper;
}

@Bean
public DelimitedLineTokenizer lineTokenizer() {
    DelimitedLineTokenizer tokenizer = new DelimitedLineTokenizer(DelimitedLineTokenizer.DELIMITER_TAB);
    tokenizer.setNames(new String[] { "firstName", "lastName" });
    return tokenizer;
}

Answer 1: (score: 2)

A simpler approach:

@Bean
public FlatFileItemReader<Person> reader() {
    return new FlatFileItemReaderBuilder<Person>()
            .name("personItemReader")
            .resource(new ClassPathResource("sample-data.csv"))
            .lineTokenizer(new DelimitedLineTokenizer(DelimitedLineTokenizer.DELIMITER_TAB) {{
                setNames(new String[]{"firstName", "lastName"});
            }})
            .fieldSetMapper(new BeanWrapperFieldSetMapper<Person>() {{
                setTargetType(Person.class);
            }})
            .build();
}