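I have a binary file larger than 2 GB that contains millions of variable-length chunks separated by the byte sequence 0xE8 0xF5. After asking around, researching, and experimenting, I arrived at the code below (I don't know whether it is efficient in terms of execution time). The goal is to process one chunk at a time. The code does this in 3 steps: it reads 1024 bytes into inputBytes, converts them to a hexadecimal string, and replaces every delimiter with \r so that each chunk can be processed as a line.
With these 3 steps I can process every chunk in the first 1024 bytes. To process the second 1024 bytes, however, I need to start just before the last delimiter of the first 1024 bytes, because the last chunk of the previous block may be incomplete. For that reason I store the position of the last delimiter, hexstr.lastIndexOf("E8F5"), but I don't know how to make the next 1024 bytes start at that position. I tried input.read(inputBytes, lastPos, 1024), but I got this error: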
Exception in thread "main" java.lang.IndexOutOfBoundsException
    at java.io.FileInputStream.readBytes(Native Method)
    at java.io.FileInputStream.read(FileInputStream.java:246)
    at ReadBinaryWithDelimiter.ReadBinaryWithDelimiter.main(ReadBinaryWithDelimiter.java:19)
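A note on the exception: in InputStream.read(byte[] b, int off, int len), off is an index into the array b, not a position in the file, and the call throws IndexOutOfBoundsException when off or len is negative or when off + len exceeds b.length. With a 1024-byte array and len = 1024, any lastPos other than 0 fails that check. A minimal sketch that reproduces this with an in-memory stream (the class name ReadOffsetDemo, the helper rejects, and the value 1000 standing in for lastPos are made up for illustration):

```java
import java.io.ByteArrayInputStream;

public class ReadOffsetDemo {

    // Returns true if read(buf, off, len) rejects the arguments for a buffer of bufSize bytes.
    static boolean rejects(int bufSize, int off, int len) {
        byte[] buf = new byte[bufSize];
        ByteArrayInputStream in = new ByteArrayInputStream(new byte[4096]);
        try {
            in.read(buf, off, len);   // off is an index into buf, not into the stream
            return false;
        } catch (IndexOutOfBoundsException e) {
            return true;              // off < 0, len < 0, or off + len > buf.length
        }
    }

    public static void main(String[] args) {
        System.out.println(rejects(1024, 1000, 1024)); // prints true: 1000 + 1024 > 1024
        System.out.println(rejects(1024, 1000, 24));   // prints false: 1000 + 24 == 1024
    }
}
```

So the second argument cannot be used to rewind the file. It only says where in the array the newly read bytes should be placed.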
How can I read 1024 bytes starting from the position of the last delimiter on each iteration?
This is my code so far:
package ReadBinaryWithDelimiter;

import java.io.File;
import java.io.FileInputStream;
import java.io.FileNotFoundException;
import java.io.IOException;
import java.io.InputStream;
import java.util.Scanner;
import javax.xml.bind.DatatypeConverter;

public class ReadBinaryWithDelimiter {
    public static void main(String[] args) throws FileNotFoundException {
        File inputFile = new File("./binary");
        int lastPos = 0; //To store position of last delimiter
        try (InputStream input = new FileInputStream(inputFile)) {
            for(int i=1; i<3; i++){ //Loop to read more than one chunk
                byte inputBytes[] = new byte[1024];
                int readBytes = input.read(inputBytes); // Storing 1024 bytes in "inputBytes"
                //Converting to a string the Hexadecimal content of the variable "inputBytes"
                String hexstr=DatatypeConverter.printHexBinary(inputBytes);
                lastPos=hexstr.lastIndexOf("E8F5")-2; //Storing position of last delimiter
                //Replacing all delimiters in "inputBytes" with \r in order to process each chunk
                String str = hexstr.replaceAll("E8F5", "\r");
                //Now process each "line" (chunk) since they are separated with \r
                try (Scanner scanner = new Scanner(str)) {
                    while (scanner.hasNextLine()) {
                        String line = scanner.nextLine();
                        // process the line
                        System.out.println(line);
                    }
                }
            }
        }
        catch (FileNotFoundException ex) {System.err.println("Couldn't read file: " + ex);}
        catch (IOException ex) {System.err.println("Error while reading file: " + ex);}
    }
}
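One possible way to avoid rewinding at all is to keep the unfinished chunk in memory and carry it over into the next read, scanning the raw bytes instead of a hex string. The following is only a sketch under assumptions, not a tested solution for the real file: the names DelimitedChunkReader, forEachChunk, chunksAsHex and toHex are hypothetical, the delimiter 0xE8 0xF5 is taken from the code above, empty chunks are skipped, and bufSize is assumed to be positive.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

public class DelimitedChunkReader {

    // Streams "in" and passes each non-empty chunk found between 0xE8 0xF5 delimiters to "handler".
    static void forEachChunk(InputStream in, int bufSize, Consumer<byte[]> handler) throws IOException {
        ByteArrayOutputStream chunk = new ByteArrayOutputStream(); // bytes of the unfinished chunk
        byte[] buf = new byte[bufSize];
        boolean heldE8 = false; // previous byte was 0xE8 and we don't know yet if it starts a delimiter
        int n;
        while ((n = in.read(buf)) != -1) {
            for (int i = 0; i < n; i++) {
                byte b = buf[i];
                if (heldE8 && b == (byte) 0xF5) { // delimiter complete, even across two reads
                    emit(chunk, handler);
                    heldE8 = false;
                    continue;
                }
                if (heldE8) chunk.write(0xE8);    // the held byte was ordinary data
                heldE8 = (b == (byte) 0xE8);
                if (!heldE8) chunk.write(b);
            }
        }
        if (heldE8) chunk.write(0xE8);            // trailing 0xE8 with no 0xF5 after it
        emit(chunk, handler);                     // last chunk has no closing delimiter
    }

    private static void emit(ByteArrayOutputStream chunk, Consumer<byte[]> handler) {
        if (chunk.size() > 0) handler.accept(chunk.toByteArray());
        chunk.reset();
    }

    // Convenience wrapper for small in-memory inputs: returns each chunk as upper-case hex.
    static List<String> chunksAsHex(byte[] data, int bufSize) {
        List<String> out = new ArrayList<>();
        try {
            forEachChunk(new ByteArrayInputStream(data), bufSize, c -> out.add(toHex(c)));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return out;
    }

    static String toHex(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        for (byte b : bytes) sb.append(String.format("%02X", b));
        return sb.toString();
    }

    public static void main(String[] args) {
        byte[] data = {0x01, 0x02, (byte) 0xE8, (byte) 0xF5, 0x0E, (byte) 0x8F, 0x50,
                       (byte) 0xE8, (byte) 0xF5, 0x03};
        // A 3-byte buffer forces the first delimiter to straddle two reads.
        System.out.println(chunksAsHex(data, 3)); // prints [0102, 0E8F50, 03]
    }
}
```

For the real file, the stream could be something like new BufferedInputStream(new FileInputStream("./binary")) with a larger buffer. Note the middle chunk in the example: the bytes 0E 8F 50 give the hex text "0E8F50", which contains "E8F5" starting in the middle of a byte, so a search on the hex string would split that chunk in the wrong place, while the byte-level scan does not.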
Thanks for your help.