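I am developing a process that checks and updates data in an Oracle database. My application uses Hibernate and Spring.
The application reads a CSV file, processes its content, and then persists the entities: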
public class Main {
    public static void main(String[] args) {
        Input input = ReadCSV(path);
        EntityList resultList = Process.process(input);
        WriteResult.write(resultList);
        ...
    }
}
// Process class that loops over the input
public class Process {
    public EntityList process(Input input) {
        EntityList results = ...;
        ...
        for (Line line : input.readLine()) {
            results.add(ProcessLine.process(line));
            ...
        }
        return results;
    }
}
// Retrieving and updating entities
class ProcessLine {
    @Autowired
    DomaineRepository domaineRepository;
    @Autowired
    CompanyDomaineService companydomaineService;

    @Transactional
    public MyEntity process(Line line) {
        // getCompanyByXX is a CrudRepository method with @Query that returns an entity object
        MyEntity companyToAttach = domaineRepository.getCompanyByCode(line.getCode());
        MyEntity companyToDetach = domaineRepository.getCompanyBySiret(line.getSiret());
        if (companyToDetach == null || companyToAttach == null) {
            throw new CustomException("Custom Exception");
        }
        // attachCompany retrieves some relationEntity, then removes companyToDetach and adds
        // companyToAttach. This updates the relationEntity.company attribute.
        companydomaineService.attachCompany(companyToAttach, companyToDetach);
        return companyToAttach;
    }
}
public class WriteResult {
    @Autowired
    DomaineRepository domaineRepository;

    @Transactional
    public void write(EntityList results) {
        // Saves every entity one by one inside a single transaction
        for (MyEntity result : results) {
            domaineRepository.save(result);
        }
    }
}
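The application works for files with a few lines, but when I try to process a large file (200,000 lines), performance drops drastically and I get an SQL timeout. I suspect a cache problem, but I am also wondering whether saving all entities at the end of processing is bad practice?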
Answer 0: (score: 0)
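The problem is that your for loop saves the results one at a time, and the individual inserts slow it down. Hibernate and Spring support batch inserts, which should be used whenever possible.
Something like domaineRepository.saveAll(results)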
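As far as I know, saveAll in Spring Data JPA still calls save once per entity, and the statements are only grouped into JDBC batches when the Hibernate setting hibernate.jdbc.batch_size is configured. With 200,000 entities it may also help to hand the results over in fixed-size chunks instead of one huge list. The following is a minimal sketch of that chunking in plain Java, not the poster's actual code: ChunkedWriter, writeInChunks and the saver callback are hypothetical names, and the callback stands in for a call such as domaineRepository.saveAll(chunk).

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

/** Splits a large result list into fixed-size chunks and hands each chunk to a saver. */
final class ChunkedWriter {

    /**
     * @param items     everything that has to be persisted
     * @param chunkSize number of items per call; assumed to match hibernate.jdbc.batch_size
     * @param saver     hypothetical stand-in for a call such as domaineRepository.saveAll(chunk)
     * @return the number of chunks that were handed to the saver
     */
    static <T> int writeInChunks(List<T> items, int chunkSize, Consumer<List<T>> saver) {
        if (chunkSize <= 0) {
            throw new IllegalArgumentException("chunkSize must be positive");
        }
        int chunks = 0;
        for (int start = 0; start < items.size(); start += chunkSize) {
            int end = Math.min(start + chunkSize, items.size());
            // Copy the sub-list view so each chunk is an independent list.
            saver.accept(new ArrayList<>(items.subList(start, end)));
            chunks++;
        }
        return chunks;
    }
}
```

In a real JPA setup the saver would probably also flush and clear the persistence context after each chunk so that the session does not keep all 200,000 entities in memory; that part is left out here because it depends on how the transaction is set up.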
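Since you are dealing with a large amount of data, it is better to process it in batches: instead of fetching one company to attach at a time, fetch the list of companies to attach, then fetch the list of companies to detach, and process those.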
// Collect all keys first, then resolve them with two bulk queries
public List<MyEntity> process(Input input) {
    List<MyEntity> results;
    List<Code> companiesToAdd = new ArrayList<>();
    List<Siret> companiesToRemove = new ArrayList<>();
    for (Line line : input.readLine()) {
        companiesToAdd.add(line.getCode());
        companiesToRemove.add(line.getSiret());
        ...
    }
    results = process(companiesToAdd, companiesToRemove);
    return results;
}

public List<MyEntity> process(List<Code> companiesToAdd, List<Siret> companiesToRemove) {
    // One query per list instead of two queries per line
    List<MyEntity> attachList = domaineRepository.getCompanyByCodeIn(companiesToAdd);
    List<MyEntity> detachList = domaineRepository.getCompanyBySiretIn(companiesToRemove);
    if (attachList.isEmpty() || detachList.isEmpty()) {
        throw new CustomException("Custom Exception");
    }
    companydomaineService.attachCompany(attachList, detachList);
    return attachList;
}
The code above is only pseudocode to point you in the right direction; you will need to work out the approach that suits you.
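One difference from the original per-line code is worth noting: the question throws as soon as a single company is missing, whereas an isEmpty() check only fires when a whole list comes back empty, and two flat lists no longer say which company to attach belongs to which company to detach. One possible way to keep that per-line information is to index each bulk result by its key. This is only a sketch under assumptions: CompanyIndex, indexBy and missingKeys are illustrative names, and the code is generic rather than tied to MyEntity.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Function;

/** Helpers for matching bulk query results back to individual input lines. */
final class CompanyIndex {

    /** Indexes the rows of one bulk query by a key such as the company code. */
    static <K, V> Map<K, V> indexBy(List<V> rows, Function<V, K> keyOf) {
        Map<K, V> index = new HashMap<>();
        for (V row : rows) {
            index.put(keyOf.apply(row), row);
        }
        return index;
    }

    /** Lists the requested keys for which the bulk query returned no row. */
    static <K> List<K> missingKeys(List<K> requested, Map<K, ?> index) {
        List<K> missing = new ArrayList<>();
        for (K key : requested) {
            if (!index.containsKey(key)) {
                missing.add(key);
            }
        }
        return missing;
    }
}
```

With two such indexes (one by code, one by SIRET), each line can look up its own pair of companies in memory, and a non-empty missingKeys result can trigger the same exception the per-line version threw.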
Answer 1: (score: 0)
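For every line you read, you perform 2 read operations here:
MyEntity companyToAttach = domaineRepository.getCompanyByCode(line.getCode());
MyEntity companyToDetach = domaineRepository.getCompanyBySiret(line.getSiret());
You could read several lines at once, use an IN query for them, and then process the resulting list of companies.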
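When the lookups are grouped into IN queries against Oracle, keep in mind that an IN list is limited to 1,000 expressions (ORA-01795), so the keys of a 200,000-line file cannot go into one call. A rough sketch of how the keys could be deduplicated and queried in chunks follows. ChunkedLookup and fetchAll are hypothetical names, and the query function is an assumed stand-in for a repository method that accepts a list of keys.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.function.Function;

/** Runs a bulk lookup in chunks small enough for an Oracle IN list. */
final class ChunkedLookup {

    // Oracle rejects IN lists with more than 1000 expressions (ORA-01795).
    static final int ORACLE_MAX_IN_LIST = 1000;

    /**
     * @param keys      lookup keys, possibly with duplicates (e.g. every code in the file)
     * @param chunkSize maximum number of keys per query, at most ORACLE_MAX_IN_LIST
     * @param query     hypothetical stand-in for a repository call that takes a list of keys
     * @return the rows of all chunk queries, concatenated in query order
     */
    static <K, V> List<V> fetchAll(List<K> keys, int chunkSize, Function<List<K>, List<V>> query) {
        if (chunkSize <= 0 || chunkSize > ORACLE_MAX_IN_LIST) {
            throw new IllegalArgumentException("chunkSize must be between 1 and " + ORACLE_MAX_IN_LIST);
        }
        // Drop duplicate keys but keep first-seen order so the queries are reproducible.
        List<K> distinct = new ArrayList<>(new LinkedHashSet<>(keys));
        List<V> found = new ArrayList<>();
        for (int start = 0; start < distinct.size(); start += chunkSize) {
            int end = Math.min(start + chunkSize, distinct.size());
            found.addAll(query.apply(new ArrayList<>(distinct.subList(start, end))));
        }
        return found;
    }
}
```

With a chunk size of 1,000, the 400,000 single-row reads of the per-line version (two per line) would shrink to at most 400 queries, fewer if codes or SIRETs repeat across lines.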