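Memory limit when writing to a file in Java?

Time: 2016-04-25 16:45:53

Tags: java memory io

I'm trying to write a program that filters data. The data contains 27,000 lines and is over 150 MB. No matter how I implement it, the program stops printing prematurely at around line 4,300. I tested the loop without printing the data (printing only the line numbers), and it reached all 27,000 lines. I think this might be a memory problem, but since I'm new to Java I'm not sure where the problem lies. My two main suspects right now are line.substring and the PrintStream class. Please help!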

public static void main(String[] args) {
  // tries to print output to output.csv in same directory
  try {
     PrintStream out = new PrintStream(new FileOutputStream("output.csv"));
     System.setOut(out);
  }
  catch(IOException e1) {
    System.out.println("Error during reading/writing");
  }

  // read input file
  File inputFile = new File("my-large-file.txt");

  if(!inputFile.canRead()) {
     System.out.println("Required input file not found; exiting.");
     System.exit(1);
  }

  // doesn't allow me to use scanner without try for some reason
  try {
     Scanner input = new Scanner(inputFile);

     while (input.hasNextLine()) {
        String line = input.nextLine();

        // scan through each line
        Scanner lineScan = new Scanner(line);

        // if we find the line that we want to look through
        if(lineScan.next().startsWith("1")) {

           // prints the specific data to output
           String a= line.substring(007, 666);         
           if (!(a== "the-number-that-I-don't-want")) {
              String current         = line.substring(1, 10);
              String another         = line.substring(10, 20).replaceAll("\\s+","");
              String third           = line.substring(20, 30).replaceAll("\\s +","");
              String fourth          = line.substring(40, 50);
              ...
              String nth             = line.substring(999, 1000);


              System.out.print(current + ", ");
              System.out.print(another + ", ");
              System.out.print(third + ", ");
              System.out.print(fourth + ", ");
              ...
              System.out.print(nth);
              System.out.println();

           }
        }
     }
   }
  catch(IOException e) {
     e.printStackTrace();
  } 

}

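2 answers:

Answer 0 (score: 0)

String.substring requires valid indices: the end index must not exceed the length of the line. Strings must also be compared with equals, not with ==.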

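  if (line.length() >= 666) { // or even 1000, the largest end index used in the question
      String a = line.substring(007, 666); // note: 007 is an octal literal, i.e. 7
      if (!a.equals("the-number-that-I-don't-want")) { // equals compares contents; == compares references
          ...
      }
  }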
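As a rough illustration of the two points above, the sketch below wraps the bounds check in a small helper and contrasts == with equals. The class name FieldExtractor and the method fieldOrNull are hypothetical names invented for this example; they are not part of the original code.

```java
final class FieldExtractor {

    // Hypothetical helper: returns line.substring(start, end), or null when
    // the requested range does not fit inside the line.
    static String fieldOrNull(String line, int start, int end) {
        if (line == null || start < 0 || start > end || end > line.length()) {
            return null;
        }
        return line.substring(start, end);
    }

    public static void main(String[] args) {
        String shortLine = "1 abc";
        String unwanted = "abc";

        // new String(...) forces a distinct object with the same contents.
        String copy = new String("abc");
        System.out.println(copy == unwanted);      // prints false (different objects)
        System.out.println(copy.equals(unwanted)); // prints true (same contents)

        // A range that fits returns the field; one that does not returns null
        // instead of throwing StringIndexOutOfBoundsException.
        System.out.println(fieldOrNull(shortLine, 2, 5));   // prints abc
        System.out.println(fieldOrNull(shortLine, 7, 666)); // prints null
    }
}
```

Returning null is only one possible design; skipping the line or logging its number would work just as well, depending on how malformed lines should be treated.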

After that, you should close everything you opened: lineScan, and especially input.
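One way to make sure everything gets closed is try-with-resources, which closes the declared resources in reverse order even when an exception is thrown. The following is a minimal sketch under assumptions: the class CloseResources and the method copyMatchingLines are hypothetical, and the filter (keep lines starting with "1") is only loosely modeled on the question.

```java
import java.io.File;
import java.io.IOException;
import java.io.PrintStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Arrays;
import java.util.Scanner;

final class CloseResources {

    // Hypothetical helper: copies every line that starts with "1" from in to
    // out and returns how many lines were written.
    static int copyMatchingLines(File in, File out) throws IOException {
        int written = 0;
        // Both resources are closed automatically when the block exits.
        try (Scanner input = new Scanner(in, "UTF-8");
             PrintStream output = new PrintStream(out, "UTF-8")) {
            while (input.hasNextLine()) {
                String line = input.nextLine();
                if (line.startsWith("1")) {
                    output.println(line);
                    written++;
                }
            }
            // Scanner swallows read errors; surface them explicitly.
            IOException readError = input.ioException();
            if (readError != null) {
                throw readError;
            }
        }
        return written;
    }

    public static void main(String[] args) throws IOException {
        // Temporary files stand in for my-large-file.txt and output.csv.
        Path in = Files.createTempFile("input", ".txt");
        Path out = Files.createTempFile("output", ".csv");
        Files.write(in, Arrays.asList("1 keep", "2 drop", "", "1 also"));
        System.out.println(copyMatchingLines(in.toFile(), out.toFile())); // prints 2
    }
}
```

Writing to a dedicated PrintStream instead of redirecting System.out also keeps error messages visible on the console.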

In this case a BufferedReader may be more intuitive than a Scanner, which splits its input into tokens. BufferedReader is simpler and probably faster.
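A BufferedReader version of the read loop might look like the sketch below. The class LineFilter and the method countLinesStartingWith are hypothetical names for this example. Note that line.startsWith copes with blank lines, whereas calling next() on a Scanner built from an empty line throws NoSuchElementException, which may be one reason the original loop stopped early.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Arrays;

final class LineFilter {

    // Hypothetical helper: counts the lines of a file that start with prefix,
    // holding only one line in memory at a time.
    static long countLinesStartingWith(Path file, String prefix) throws IOException {
        long count = 0;
        try (BufferedReader reader = Files.newBufferedReader(file, StandardCharsets.UTF_8)) {
            String line;
            // readLine returns null at end of file.
            while ((line = reader.readLine()) != null) {
                if (line.startsWith(prefix)) {
                    count++;
                }
            }
        }
        return count;
    }

    public static void main(String[] args) throws IOException {
        // A temporary file with a blank line stands in for the real input.
        Path file = Files.createTempFile("lines", ".txt");
        Files.write(file, Arrays.asList("1 a", "", "2 b", "1 c"));
        System.out.println(countLinesStartingWith(file, "1")); // prints 2
    }
}
```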

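Answer 1 (score: 0)

I was able to figure it out! Thank you all for pointing me in the right direction.

The problem with my program was that I was storing too much in memory. I was storing every line of my file, then creating another Scanner to scan the whole line, storing strings, concatenating strings, and so on.

Use StringBuffer rather than String, since it gives better performance when concatenating.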
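For reference, a buffer only helps with concatenation when the pieces are appended to it. The sketch below shows one way that could look, using StringBuilder, the unsynchronized counterpart of StringBuffer, which is usually sufficient in single-threaded code. The class CsvRow and the method join are hypothetical names for this example.

```java
final class CsvRow {

    // Hypothetical helper: joins the fields with ", " using one builder
    // instead of creating an intermediate String for every + operation.
    static String join(String... fields) {
        StringBuilder row = new StringBuilder();
        for (int i = 0; i < fields.length; i++) {
            if (i > 0) {
                row.append(", ");
            }
            row.append(fields[i]);
        }
        return row.toString();
    }

    public static void main(String[] args) {
        System.out.println(join("a", "b", "c")); // prints a, b, c
    }
}
```

The standard String.join(", ", fields) produces the same result for non-null fields and may be the simpler choice.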

Here is my revised solution, which now runs through the file and filters it as expected:

 public static void main(String[] args) throws IOException {
  FileInputStream inputStream = null;
  Scanner sc = null;
  try {
     PrintStream out = new PrintStream(new FileOutputStream("output.csv"));
     System.setOut(out);
  }
  catch(IOException e1) {
    System.out.println("Error during reading/writing");
  }
  try {
      inputStream = new FileInputStream("my-large-file.txt");
      sc = new Scanner(inputStream, "UTF-8");
      while (sc.hasNextLine()) {
        String line = sc.nextLine();

        // note: the specific indices of the substrings are random numbers and do not affect the program. They could be anything.
        if (!line.startsWith("the-number-that-I-don't-want")) {
           String filter2 = line.substring(55, 66);         
           if (!(filter2.equals("another-string-to-filter-out"))) {
              StringBuffer current     = new StringBuffer(line.substring(1, 10));
              StringBuffer another     = new StringBuffer(line.substring(10, 20).replaceAll("\\s+",""));
              StringBuffer third       = new StringBuffer(line.substring(22, 37).replaceAll("\\s +",""));
              StringBuffer fourth      = new StringBuffer(line.substring(37, 56));

              ...
              StringBuffer nth         = new StringBuffer(line.substring(999, 1000));

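              // note: the names below appear to come from the author's real code and do not match the placeholder names declared above
              System.out.println(currentS + ", " + firstName + ", " + lastName + ", " + birthday + ", " + distributedAmt + ", " +awardYear + ", " + transactionNum + ", " + disbursementDate + ", " + efc + ", " + percentEligUsed + ", " + grantType);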
           }
        }
     }

     if (sc.ioException() != null) {
        throw sc.ioException();
     }
  } finally {
     if (inputStream != null) {
        inputStream.close();
     }
     if (sc != null) {
        sc.close();
     }

  }                                                                              
}

This link helped me a lot: http://www.baeldung.com/java-read-lines-large-file