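I have 700 CSV files, each about 5-7 MB, and I am using Spring Boot. All I need to do is read these 700 files. Whenever a new file is added to the directory, the filesUpdatedOrAdded() method in FileWatcherJob.java is called. It performs a few checks and then essentially just tries to read the files.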
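For scale, the total volume can be estimated directly from the numbers above. This is only a back-of-the-envelope sketch: `HeapEstimate` and `totalMegabytes` are hypothetical names that do not appear in the project code, and the estimate counts raw bytes on disk only, so the in-memory footprint of the resulting `String` objects may well be larger.

```java
// Rough size estimate; the file count and the 5-7 MB range come from the question.
class HeapEstimate {
    static final int FILE_COUNT = 700;

    // Total raw size in MB for a given number of files and per-file size in MB.
    static long totalMegabytes(int fileCount, int megabytesPerFile) {
        return (long) fileCount * megabytesPerFile;
    }

    public static void main(String[] args) {
        long low = totalMegabytes(FILE_COUNT, 5);
        long high = totalMegabytes(FILE_COUNT, 7);
        System.out.println("Raw data on disk: " + low + " MB to " + high + " MB");

        // Maximum heap the JVM will attempt to use, for comparison.
        long maxHeapMb = Runtime.getRuntime().maxMemory() / (1024 * 1024);
        System.out.println("Max heap available to this JVM: " + maxHeapMb + " MB");
    }
}
```

Comparing the two printed figures gives a quick sense of whether holding everything in memory at once is even plausible with the current heap settings.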
FileWatcherJob.java
public class FileWatcherJob implements DirectoryScanListener {

    private static final Logger logger = LoggerFactory.getLogger(FileWatcherJob.class);
    public static final String LISTENER_NAME = "DirScanListenerName";
    private static boolean fileFound = true;
    private ReadFile readFile = new ReadFile();

    public void filesUpdatedOrAdded(File[] files) {
        if (fileFound) {
            System.out.println("------------- I am doing it again-------------");
            for (File file : files) {
                logger.info("File Found : {}", file.getName());
            }
            logger.info("ALL THE FILES ARE AVAILABLE NOW");
            if (!readFile.getFileAStored()) {
                readFile.readAllFiles("D:\\FileToRead\\fileA.csv");
            }
            if (!readFile.getFileBStored()) {
                readFile.readAllFiles("D:\\FileToRead\\fileB.csv");
            }
            // Read miscellaneous files, including file A and file B
            if (readFile.getFileAStored() && readFile.getFileBStored()) {
                readFile.readAllFiles("D:\\FileToRead\\");
            }
            fileFound = false;
            logger.info("-------------- I am done -----------------");
        }
    }
}
ReadFile.java
public class ReadFile {

    private static final Logger LOGGER = LoggerFactory.getLogger(ReadFile.class);
    private Map<Path, List<String>> fileA = new HashMap<>();
    private Map<Path, List<String>> fileB = new HashMap<>();
    private Boolean fileAStored = false;
    private Boolean fileBStored = false;
    private Map<Path, List<String>> miscallenousFiles = new HashMap<>();

    public Boolean getFileAStored() {
        return fileAStored;
    }

    public Boolean getFileBStored() {
        return fileBStored;
    }

    public void readAllFiles(String path) {
        try (Stream<Path> paths = Files.walk(Paths.get(path)).collect(toList()).parallelStream()) {
            paths.forEach(filePath -> {
                //LOGGER.info("CHECK IF FILE IS REGULAR");
                if (filePath.toFile().exists()) {
                    String fileName = filePath.getFileName().toString();
                    try {
                        LOGGER.info("START LOADING THE CONTENT OF FILE " + fileName);
                        List<String> loadedFile = readContent(filePath);
                        storeAandBFiles(fileName, filePath, loadedFile);
                    } catch (Exception e) {
                        LOGGER.info("ERROR WHILE READING THE CONTENT OF FILE");
                        LOGGER.error(e.getMessage());
                    }
                }
            });
        } catch (IOException e) {
            LOGGER.info("ERROR WHILE READING THE FILES IN PARALLEL");
            LOGGER.error(e.getMessage());
        }
    }

    private List<String> readContent(Path filePath) throws IOException {
        //LOGGER.info("START READING THE FILE, LINE BY LINE");
        return Files.readAllLines(filePath, StandardCharsets.ISO_8859_1);
    }

    private void storeAandBFiles(String fileName, Path filePath, List<String> loadedFile) {
        //LOGGER.info("START STORING THE FILE");
        if (fileName.contains("fileA") && !fileAStored) {
            fileA.put(filePath.getFileName(), loadedFile);
            fileAStored = true;
        }
        if (fileName.contains("fileB") && !fileBStored) {
            fileB.put(filePath.getFileName(), loadedFile);
            fileBStored = true;
        }
    }
}
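Note that `storeAandBFiles` is invoked from inside a parallel stream, so the plain `HashMap` fields and `Boolean` flags above can be touched by several threads at once. Below is a minimal sketch of a thread-safe alternative, assuming a `ConcurrentHashMap` keyed by a logical file name is acceptable. `ConcurrentFileStore`, `storeOnce`, and `isStored` are hypothetical names that are not part of the code above, and this is not claimed to be the cause of any particular error.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.stream.IntStream;

// Hypothetical thread-safe stand-in for the fileA/fileB maps and their Boolean flags.
class ConcurrentFileStore {
    private final Map<String, List<String>> stored = new ConcurrentHashMap<>();

    // Stores content under key only if nothing is stored yet.
    // Returns true if this call was the one that stored it.
    boolean storeOnce(String key, List<String> content) {
        return stored.putIfAbsent(key, content) == null;
    }

    // Replaces the separate fileAStored / fileBStored flags.
    boolean isStored(String key) {
        return stored.containsKey(key);
    }

    public static void main(String[] args) {
        ConcurrentFileStore store = new ConcurrentFileStore();
        // Many parallel tasks race to store "fileA"; putIfAbsent is atomic,
        // so only one of them gets a true result.
        long winners = IntStream.range(0, 1000).parallel()
                .filter(i -> store.storeOnce("fileA", Arrays.asList("line " + i)))
                .count();
        System.out.println("Successful stores: " + winners);
        System.out.println("fileA stored: " + store.isStored("fileA"));
    }
}
```

Folding the "already stored" check and the write into a single `putIfAbsent` call avoids the check-then-act gap that separate flag and map updates would leave open.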
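When this runs, I keep getting the following error:
Job group1.FileScanJobName threw an unhandled Exception:
java.lang.OutOfMemoryError: Java heap space
I don't understand what the problem is. Can anyone help? One strange thing, which I suspect is the cause: even when no new file has been added to the directory, the watcher still somehow reports that new files were found in the directory!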
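One direction that may be worth trying, sketched here under assumptions rather than as a confirmed fix: skip the directory entry that `Files.walk` returns alongside the files, and stream each file with `Files.lines` so that its lines are not all held in a list at once. `StreamingCsvReader`, `listCsvFiles`, and `countLines` are hypothetical names. Counting lines stands in for whatever per-line processing is actually needed, and the demo writes small sample files to a temporary directory instead of `D:\FileToRead\`.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

class StreamingCsvReader {

    // Lists regular .csv files directly under dir. Files.walk also returns
    // the directory itself, so non-regular entries are filtered out.
    static List<Path> listCsvFiles(Path dir) throws IOException {
        try (Stream<Path> paths = Files.walk(dir, 1)) {
            return paths.filter(Files::isRegularFile)
                    .filter(p -> p.getFileName().toString().endsWith(".csv"))
                    .sorted()
                    .collect(Collectors.toList());
        }
    }

    // Streams one file lazily and counts its lines without retaining them.
    static long countLines(Path file) throws IOException {
        try (Stream<String> lines = Files.lines(file, StandardCharsets.ISO_8859_1)) {
            return lines.count();
        }
    }

    public static void main(String[] args) throws IOException {
        // Sample data in a temporary directory (an assumption for the demo only).
        Path dir = Files.createTempDirectory("csv-demo");
        Files.write(dir.resolve("fileA.csv"),
                Arrays.asList("id,name", "1,foo", "2,bar"), StandardCharsets.ISO_8859_1);
        Files.write(dir.resolve("fileB.csv"),
                Arrays.asList("id,name", "3,baz"), StandardCharsets.ISO_8859_1);

        // Files are processed one at a time, so only one is open at any moment.
        for (Path file : listCsvFiles(dir)) {
            System.out.println(file.getFileName() + ": " + countLines(file) + " lines");
        }
    }
}
```

Processing the files sequentially keeps peak memory close to what a single file needs; whether that trade-off against throughput is acceptable depends on how the loaded data is used afterwards.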