How to efficiently read a large text file in Java

Time: 2015-01-21 10:13:17

Tags: java file

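Here I am reading an 18 MB file and storing it in a two-dimensional array, but the program takes nearly 15 minutes to run. Is there any way to reduce its running time? The file contains only binary values. Thanks in advance...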

import java.io.File;
import java.io.FileNotFoundException;
import java.util.NoSuchElementException;
import java.util.Scanner;

public class test
{
    public static void main(String[] args) throws FileNotFoundException
    {
        int m = 2160;
        int n = 4320;
        int[][] lof = new int[n][m];
        String filename = "D:/New Folder/ETOPOCHAR";
        double range_km = 1.0;
        double alonn = -57.07; //180 to 180
        double alat = 38.53;

        // try-with-resources closes the Scanner when reading is done
        try (Scanner input = new Scanner(new File(filename))) {
            while (input.hasNextLine()) {
                for (int i = 0; i < m; i++) {
                    for (int j = 0; j < n; j++) {
                        try {
                            lof[j][i] = input.nextInt();
                            // print the input matrix
                            System.out.println("value[" + j + "][" + i + "] = " + lof[j][i]);
                        } catch (NoSuchElementException e) {
                            // e.printStackTrace();
                        }
                    }
                }
            }
        }
    }
}

I also tried using a byte array, but I could not store the data in a two-dimensional array...

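import java.io.File;
import java.io.FileInputStream;
import java.io.IOException;

public class FileToArrayOfBytes
{
    public static void main(String[] args)
    {
        File file = new File("name of file");
        byte[] bFile = new byte[(int) file.length()];

        // try-with-resources closes the stream even if reading fails
        try (FileInputStream fileInputStream = new FileInputStream(file)) {
            // convert file into array of bytes; a single read() may return
            // fewer bytes than requested, so loop until the array is full
            int offset = 0;
            while (offset < bFile.length) {
                int read = fileInputStream.read(bFile, offset, bFile.length - offset);
                if (read == -1) break;
                offset += read;
            }

            for (int i = 0; i < bFile.length; i++) {
                System.out.print((char) bFile[i]);
            }

            System.out.println("Done");
        } catch (IOException e) {
            e.printStackTrace();
        }
    }
}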

1 answer:

Answer 0 (score: 0)

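You can first read the file into a byte array and then deserialize those bytes. Start with a 2048-byte input buffer, then experiment by increasing or decreasing its size, keeping the sizes you try to powers of 2 (512, 1024, 2048, and so on).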

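As far as I know, a 2048-byte buffer is quite likely to give the best performance, but this depends on the operating system and should be verified by measurement.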
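One way to run that experiment is sketched below. It is a rough timing loop, not a rigorous benchmark: the class name `BufferSizeTiming`, the helper `readAll`, the range of sizes tried, and the temporary 1 MiB file used when no argument is given are all assumptions made for illustration. Results will also be affected by the operating system's file cache and JVM warm-up, so repeat the runs before drawing conclusions.

```java
import java.io.FileInputStream;
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

public class BufferSizeTiming {

    // Reads the whole file using a buffer of the given size
    // and returns the total number of bytes read.
    static long readAll(Path file, int bufferSize) throws IOException {
        byte[] buffer = new byte[bufferSize];
        long total = 0;
        try (FileInputStream in = new FileInputStream(file.toFile())) {
            int n;
            while ((n = in.read(buffer, 0, buffer.length)) != -1) {
                total += n;
            }
        }
        return total;
    }

    public static void main(String[] args) throws IOException {
        Path file;
        if (args.length > 0) {
            file = Paths.get(args[0]);
        } else {
            // No file given: create a 1 MiB temporary file (hypothetical test data).
            file = Files.createTempFile("buffer-timing", ".bin");
            file.toFile().deleteOnExit();
            Files.write(file, new byte[1 << 20]);
        }
        // Try powers of two from 512 bytes up to 64 KiB.
        for (int size = 512; size <= 65536; size *= 2) {
            long start = System.nanoTime();
            long bytes = readAll(file, size);
            long micros = (System.nanoTime() - start) / 1_000;
            System.out.printf("buffer=%6d bytes  read=%d bytes  time=%d us%n", size, bytes, micros);
        }
    }
}
```

Run it with the path of the real data file as the first argument to compare sizes on the machine where the program will actually run.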

Code example (you can try different values of the BUFFER_SIZE variable here; in my case, a 7.5 MB test file was read in less than a second):

import java.io.File;
import java.io.FileInputStream;
import java.io.IOException;
import java.nio.ByteBuffer;

public class ReadFileToBuffer {
  // Input buffer size; try other powers of two (512, 1024, 4096, ...)
  private static final int BUFFER_SIZE = 2048;

  public static void main(String... args) throws IOException {
    File f = new File(args[0]);
    byte[] buffer = new byte[BUFFER_SIZE];
    ByteBuffer result = ByteBuffer.allocateDirect((int) f.length());
    try (FileInputStream fis = new FileInputStream(f)) {
      int bytesRead;
      int totalBytesRead = 0;
      while ((bytesRead = fis.read(buffer, 0, BUFFER_SIZE)) != -1) {
        result.put(buffer, 0, bytesRead);
        totalBytesRead += bytesRead;
      }
      // debug info
      System.out.printf("Read %d bytes%n", totalBytesRead);

      // Here you can do whatever you want with the result, including creation of a 2D array...
      int pos = result.position();
      result.rewind();
      // getInt() interprets the content as consecutive 4-byte big-endian ints
      for (int i = 0; i < pos / 4; i++) {
        System.out.println(result.getInt());
      }
    }
  }
}
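The comment in the code above leaves the creation of the 2D array open. A minimal sketch of that step follows, under a loud assumption: the file holds whitespace-separated decimal integers as ASCII text, which is what the question's use of `Scanner.nextInt()` suggests. If the file instead holds raw 4-byte integers, `ByteBuffer.getInt()` as used above would apply. The class name `BytesToGrid` and the method `parse` are hypothetical, and the array is filled row by row as `grid[row][column]`, whereas the question stores values with the indices swapped.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

public class BytesToGrid {

    private static boolean isSpace(byte b) {
        return b == ' ' || b == '\n' || b == '\r' || b == '\t';
    }

    // Parses whitespace-separated decimal integers from ASCII bytes into a
    // rows x cols array, row by row. Assumes every value fits in an int
    // (overflow is not checked). Throws if the data runs out or is not numeric.
    static int[][] parse(byte[] data, int rows, int cols) {
        int[][] grid = new int[rows][cols];
        int pos = 0;
        for (int r = 0; r < rows; r++) {
            for (int c = 0; c < cols; c++) {
                // Skip whitespace before the next number.
                while (pos < data.length && isSpace(data[pos])) {
                    pos++;
                }
                if (pos >= data.length) {
                    throw new IllegalArgumentException(
                        "Ran out of data at row " + r + ", column " + c);
                }
                boolean negative = false;
                if (data[pos] == '-') {
                    negative = true;
                    pos++;
                }
                int start = pos;
                int value = 0;
                while (pos < data.length && data[pos] >= '0' && data[pos] <= '9') {
                    value = value * 10 + (data[pos] - '0');
                    pos++;
                }
                if (pos == start) {
                    throw new IllegalArgumentException("Expected a digit at byte offset " + pos);
                }
                grid[r][c] = negative ? -value : value;
            }
        }
        return grid;
    }

    public static void main(String[] args) {
        byte[] sample = "1 0 -3\n4 5 6\n".getBytes(StandardCharsets.US_ASCII);
        int[][] grid = parse(sample, 2, 3);
        System.out.println(Arrays.deepToString(grid)); // prints [[1, 0, -3], [4, 5, 6]]
    }
}
```

Parsing the digits directly from the byte array avoids the per-token regular-expression matching that `Scanner` performs, and no value is printed inside the loop, since console output for millions of elements can easily dominate the running time.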

Take some time to read the documentation for the java.io and java.nio packages and for the Scanner class to deepen your understanding.
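As one illustration of what java.io offers besides `Scanner`, the sketch below reads numbers through a `BufferedReader` and a `StreamTokenizer`, which streams the input instead of loading it all into memory. It is an alternative technique, not the approach from the answer above, and it assumes whitespace-separated numeric text; the class name `TokenizerGrid` and the method `read` are made up for the example.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StreamTokenizer;
import java.io.StringReader;
import java.util.Arrays;

public class TokenizerGrid {

    // Reads rows * cols whitespace-separated numbers from the reader
    // into a 2D array, row by row.
    static int[][] read(Reader reader, int rows, int cols) throws IOException {
        StreamTokenizer tokens = new StreamTokenizer(new BufferedReader(reader));
        int[][] grid = new int[rows][cols];
        for (int r = 0; r < rows; r++) {
            for (int c = 0; c < cols; c++) {
                if (tokens.nextToken() != StreamTokenizer.TT_NUMBER) {
                    throw new IOException("Expected a number at row " + r + ", column " + c);
                }
                // StreamTokenizer reports numbers as double in nval.
                grid[r][c] = (int) tokens.nval;
            }
        }
        return grid;
    }

    public static void main(String[] args) throws IOException {
        // For a real file, pass java.nio.file.Files.newBufferedReader(path) instead.
        int[][] grid = read(new StringReader("1 0 1\n0 1 1\n"), 2, 3);
        System.out.println(Arrays.deepToString(grid)); // prints [[1, 0, 1], [0, 1, 1]]
    }
}
```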