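I am using the Java 8 Stream API to read a .csv file of roughly 500 MB. Almost all of the data has the same format, except for 2 instances I have found. Each object, which I store in an ArrayList, spans 52 rows; I then add the objects to a HashMap so I can access them by key. I pass the HashMap to a different class that creates an Excel file for each object, and right after the file is created I clear the list and move on to the next object. The problem is that when an object has fewer rows, the Excel-creation class tries to get numbers from indices that do not exist, which throws a NullPointerException. Is there a way to skip these rows when a NullPointerException is thrown? I know that when this happens I have to skip 52 rows.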
try
{
    final String regex = "\\d*\\.?\\d+";
    Stream<String> lines = Files.lines( file, StandardCharsets.UTF_8 );
    for( String line : (Iterable<String>) lines.skip(currentLine)::iterator ){
        final Pattern pattern = Pattern.compile(regex);
        final Matcher matcher = pattern.matcher(line.substring(0));
        while (matcher.find()) {
            testPop.add(Double.parseDouble(matcher.group(0)));
        }
        currentLine++;
        if(currentLine%52==0) {
            for(int i =0;i<52;i++) {
                int date=4+29*i;
                int a=13+29*i;
                int b=6+29*i;
                int c=15+29*i;
                int d=16+29*i;
                int e=8+29*i;
                int f=17+29*i;
                int g=14+29*i;
                int h=7+29*i;
                WeeklyCalculations.put(Integer.parseInt(String.valueOf((int)((testPop.get(date))/1))),new Calculations(testPop.get(a),3,1,testPop.get(b),testPop.get(c),testPop.get(d),testPop.get(e),testPop.get(f),testPop.get(g),testPop.get(h),testPop.get(date),WeeklyCalculations));
            }
            findZeroStockOuts();
            ExcelCreator x = new ExcelCreator(WeeklyCalculations,String.valueOf(((int)(testPop.get(1)/1))),String.valueOf(((int)(testPop.get(2)/1))), noStockouts, stockOuts);
            x.createExcel();
            testPop.clear();
            WeeklyCalculations.clear();
            counter++;
            System.out.println(counter + "/" + "67101 - "+TimeUnit.SECONDS.convert(System.nanoTime(), TimeUnit.NANOSECONDS));
        }
    }
} catch (IOException ioe){
    ioe.printStackTrace();
}
catch(NullPointerException x) {
    readToExcel(currentLine+52);
}
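I was able to skip them inside the loop, but that slows things down considerably: the file has about 3.5 million rows, and all of those rows have to be skipped again after each iteration. Is there an efficient way to do this?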
Answer 0 (score: 0)
It is slow because you repeatedly read the file from the beginning. Fill in your own code in the block below:
final String regex = "\\d*\\.?\\d+";
final Pattern pattern = Pattern.compile(regex); // compile once, not once per line
try (Stream<String> lines = Files.lines(file, StandardCharsets.UTF_8)) {
    final Iterator<String> iter = lines.iterator();
    for (int currentLine = 1; iter.hasNext(); currentLine++) {
        String line = iter.next();
        final Matcher matcher = pattern.matcher(line); // no need for line.substring(0)
        while (matcher.find()) {
            // testPop.add(Double.parseDouble(matcher.group(0)));
        }
        try {
            if (currentLine % 52 == 0) {
                for (int i = 0; i < 52; i++) {
                    // TODO
                }
            }
            // TODO: (the IOException catch below compiles only once this code can throw IOException)
        } catch (IOException ioe) {
            ioe.printStackTrace();
            // Skip forward to the end of the current 52-line block.
            while (currentLine % 52 != 0 && iter.hasNext()) {
                iter.next();
                currentLine++;
            }
        } catch (NullPointerException x) {
            // readToExcel(currentLine + 52);
            // Skip forward to the end of the current 52-line block.
            while (currentLine % 52 != 0 && iter.hasNext()) {
                iter.next();
                currentLine++;
            }
        }
    }
}
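Rather than reacting to an exception after the fact, one option is to check that a block holds enough numbers before indexing into it. The following is only a sketch under assumptions: the class `BlockValidator` and its members are hypothetical names, not part of the question's code, and the required count is derived from the largest index the question's loop reads (`17 + 29 * i` with `i` up to 51).

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not from the original post.
public class BlockValidator {
    // Same regex as in the question, compiled once.
    static final Pattern NUMBER = Pattern.compile("\\d*\\.?\\d+");

    static final int ROWS_PER_BLOCK = 52; // rows per object, from the question
    static final int STRIDE = 29;         // index stride in the question's loop
    static final int MAX_OFFSET = 17;     // largest offset used there (f = 17 + 29 * i)

    // Smallest list size for which every get(...) in the question's loop is in range.
    static final int REQUIRED_VALUES = MAX_OFFSET + STRIDE * (ROWS_PER_BLOCK - 1) + 1;

    // Collects every number found in the given lines, in order.
    static List<Double> extractNumbers(List<String> lines) {
        List<Double> values = new ArrayList<>();
        for (String line : lines) {
            Matcher m = NUMBER.matcher(line);
            while (m.find()) {
                values.add(Double.parseDouble(m.group()));
            }
        }
        return values;
    }

    // True if the block has enough values to be indexed safely.
    static boolean isComplete(List<Double> values) {
        return values.size() >= REQUIRED_VALUES;
    }

    public static void main(String[] args) {
        List<String> shortBlock = new ArrayList<>();
        shortBlock.add("1.5, 2");
        shortBlock.add("x .25");
        List<Double> values = extractNumbers(shortBlock);
        // A block with too few values is skipped instead of being indexed.
        System.out.println(values + " complete? " + isComplete(values));
    }
}
```

With a check like this, a short block can be dropped with a plain `if` and no exception handling, though it only helps if the malformed blocks really do contain fewer numbers than well-formed ones.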
Here is a way to simplify the code using a fork of StreamEx:
final String regex = "\\d*\\.?\\d+";
final Pattern pattern = Pattern.compile(regex);
try (StreamEx<String> stream = StreamEx.ofLines(file, StandardCharsets.UTF_8)) {
    // Group the lines into lists of 52 and drop any incomplete group.
    stream.splitToList(52).filter(l -> l.size() == 52).forEach(lines -> {
        lines.stream().forEach(line -> {
            final Matcher matcher = pattern.matcher(line); // no need for line.substring(0)
            while (matcher.find()) {
                // testPop.add(Double.parseDouble(matcher.group(0)));
            }
        });
        try {
            // TODO:
        } catch (IOException ioe) {
            ioe.printStackTrace();
        } catch (NullPointerException x) {
            // readToExcel(currentLine + 52);
        }
    });
} catch (IOException e) {
    e.printStackTrace();
}
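If adding a library is not an option, the same grouping into 52-line chunks can be approximated with the JDK alone. This is a minimal sketch, not the fork's implementation: `ChunkedReader` and `forEachChunk` are hypothetical names, and the helper takes any `Iterator<String>`, such as the one returned by `lines.iterator()` in the earlier snippet.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;
import java.util.function.Consumer;

// Hypothetical helper, not from the original post.
public class ChunkedReader {
    // Walks the iterator once and passes each full group of `size` lines to `handler`.
    // Returns the number of full groups handled.
    static int forEachChunk(Iterator<String> lines, int size, Consumer<List<String>> handler) {
        int handled = 0;
        List<String> chunk = new ArrayList<>(size);
        while (lines.hasNext()) {
            chunk.add(lines.next());
            if (chunk.size() == size) {
                handler.accept(chunk);
                handled++;
                chunk = new ArrayList<>(size); // fresh list, so the handler may keep the old one
            }
        }
        // A trailing group with fewer than `size` lines is dropped,
        // similar in effect to filter(l -> l.size() == 52) above.
        return handled;
    }

    public static void main(String[] args) {
        Iterator<String> demo = Arrays.asList("a", "b", "c", "d", "e").iterator();
        int n = forEachChunk(demo, 2, chunk -> System.out.println(chunk));
        System.out.println(n + " full chunks");
    }
}
```

For the file in the question, the iterator would come from `Files.lines(file, StandardCharsets.UTF_8)` opened in a try-with-resources block, with a chunk size of 52, so the file is still read only once.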