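I need to fetch data from a REST JSON API (every 24 hours), which returns a large amount of data in an array (say, 500 MB). To do this, I am considering Spring Batch. I would process the JSON in chunks (say, 1000 records per chunk) and then bulk index each chunk into Elasticsearch. The problem is that all the Spring Batch support classes (JobLauncher, JobExplorer, JobRepository, ...) are overkill for me. I think all I need is a retryable Runnable that can use a JsonBufferedReaderFactory and a FlatFileItemReader: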
    @Bean
    public FlatFileItemReader<SomeObj> ffir() {
        FlatFileItemReader<SomeObj> ffir = new FlatFileItemReader<>();
        // Custom factory (shown below) that returns one JSON object per "line"
        ffir.setBufferedReaderFactory(new JsonBufferedReaderFactory());
        ffir.setLineMapper(new JsonLineMapper<>(SomeObj.class));
        try {
            ffir.setResource(new UrlResource("http://localhost:8080/cars/759"));
        } catch (MalformedURLException e) {
            // Fail fast instead of returning a reader that has no resource set
            throw new IllegalStateException("Invalid resource URL", e);
        }
        return ffir;
    }
The reader factory is:
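    import java.io.BufferedReader;
    import java.io.IOException;
    import java.io.InputStreamReader;
    import java.io.Reader;

    import com.fasterxml.jackson.core.JsonFactory;
    import com.fasterxml.jackson.core.JsonParser;
    import com.fasterxml.jackson.core.JsonToken;
    import com.fasterxml.jackson.databind.ObjectMapper;
    import com.fasterxml.jackson.databind.node.ObjectNode;
    import org.springframework.batch.item.file.BufferedReaderFactory;
    import org.springframework.core.io.Resource;

    public class JsonBufferedReaderFactory implements BufferedReaderFactory {

        @Override
        public BufferedReader create(Resource resource, String encoding) throws IOException {
            return new JsonBufferedReader(new InputStreamReader(resource.getInputStream(), encoding));
        }

        // Treats every element of a top-level JSON array as one "line"
        private final class JsonBufferedReader extends BufferedReader {

            private final ObjectMapper mapper = new ObjectMapper();
            private final JsonFactory factory = mapper.getFactory();
            private final JsonParser parser;
            private ObjectNode node;

            JsonBufferedReader(Reader in) throws IOException {
                super(in);
                // Stream tokens straight from the reader; the array is never loaded whole
                parser = factory.createParser(in);
                if (parser.nextToken() != JsonToken.START_ARRAY) {
                    throw new IllegalStateException("Expected an array");
                }
            }

            @Override
            public String readLine() throws IOException {
                JsonToken nextToken = parser.nextToken();
                if (nextToken == JsonToken.START_OBJECT) {
                    // Read exactly one object and return it serialized as a single line
                    node = mapper.readTree(parser);
                    return node.toString();
                }
                if (nextToken == JsonToken.END_ARRAY) {
                    // null signals end of input to FlatFileItemReader
                    return null;
                }
                throw new IllegalStateException("Expected start of object or end of array of objects");
            }

            @Override
            public void close() throws IOException {
                super.close();
                parser.close();
            }
        }
    }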
Is it possible to run this use case without all of these extra classes, which seem like overkill?
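To make the intended flow concrete, here is a minimal sketch of the "read in chunks of 1000, then bulk index" loop written without any Spring Batch job infrastructure. Everything in it is an assumption for illustration: `ChunkedRunner`, `run` and `flush` are hypothetical names, the `sink` stands in for whatever call performs the Elasticsearch bulk request, and retrying per chunk (rather than restarting the whole download) is just one possible reading of "retryable".

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.function.Consumer;

/** Hypothetical helper: drains an iterator in fixed-size chunks, retrying each chunk on failure. */
final class ChunkedRunner {

    private ChunkedRunner() {
    }

    /**
     * Pulls items one at a time, groups them into lists of at most chunkSize,
     * and hands each list to the sink. Only one chunk is buffered at a time.
     *
     * @return the total number of items successfully handed to the sink
     */
    static <T> long run(Iterator<T> items, int chunkSize, int maxAttempts, Consumer<List<T>> sink) {
        if (chunkSize < 1 || maxAttempts < 1) {
            throw new IllegalArgumentException("chunkSize and maxAttempts must be >= 1");
        }
        long total = 0;
        List<T> chunk = new ArrayList<>(chunkSize);
        while (items.hasNext()) {
            chunk.add(items.next());
            if (chunk.size() == chunkSize) {
                total += flush(chunk, maxAttempts, sink);
                // Start a fresh list so the sink may keep a reference to the old one
                chunk = new ArrayList<>(chunkSize);
            }
        }
        if (!chunk.isEmpty()) {
            // Final, possibly shorter, chunk
            total += flush(chunk, maxAttempts, sink);
        }
        return total;
    }

    /** Calls the sink up to maxAttempts times; rethrows the last failure if all attempts fail. */
    private static <T> int flush(List<T> chunk, int maxAttempts, Consumer<List<T>> sink) {
        RuntimeException last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                sink.accept(chunk);
                return chunk.size();
            } catch (RuntimeException e) {
                last = e; // remember the failure and retry the same chunk
            }
        }
        throw last;
    }
}
```

With the reader from the question, the iterator could presumably be a thin wrapper that calls `read()` until it returns null, after the reader has been opened with an `ExecutionContext`. The sketch retries immediately with no backoff, and it does not resume a partially consumed HTTP stream, so a failure on the read side would still mean starting the download again.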