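I receive a file from a server that contains my university timetable, and I am trying to extract data from it. In some files (for some departments) there is a blank line at the top, as the very first line of the file, so I get: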
[Fatal Error] lesson:2:6: The processing instruction target matching "[xX][mM][lL]" is not allowed.
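How can I check for the empty line and remove it from the same file in Java? I can't do this by working with strings and lines, because the XML files usually have no \n at the end of lines.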
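The error can be reproduced without the server files, since the XML declaration is only allowed at the very start of a document. A minimal sketch, assuming the JDK's default parser; the class name `BlankLineRepro` and the helper `failsToParse` are hypothetical and exist only for illustration:

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.xml.sax.SAXParseException;

public class BlankLineRepro {

    // Returns true if parsing the given text fails with a SAXParseException.
    static boolean failsToParse(String xml) throws Exception {
        DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
        try {
            builder.parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
            return false;
        } catch (SAXParseException e) {
            return true;
        }
    }

    public static void main(String[] args) throws Exception {
        String declaration = "<?xml version=\"1.0\" encoding=\"UTF-8\"?>";
        // Declaration at the very start of the input
        System.out.println(failsToParse(declaration + "<a/>"));
        // Same document with a blank line in front of the declaration
        System.out.println(failsToParse("\n" + declaration + "<a/>"));
    }
}
```

Only the second input, the one with the leading blank line, should be rejected.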
UPD
//it appeared on knt/151 file, so empty lines in the beginning of the file that caused fatal error
private void checkForEmptyLines(File f) {
    try {
        RandomAccessFile raf = new RandomAccessFile(f,"rw");
        while (raf.getFilePointer()!=raf.length()){
            //What should be here?
            Byte b = raf.readByte();
            if (b!=10)
                raf.write(b);
        }
    } catch (FileNotFoundException e) {
        e.printStackTrace();
    } catch (IOException e) {
        e.printStackTrace();
    }
}
UPD: XML file processing:
public String[][] parse(String path) {
    String[][] table = new String[8][6];
    File data = new File(path);
    // checkForEmptyLines(data);
    DocumentBuilderFactory dbFactory = DocumentBuilderFactory.newInstance();
    DocumentBuilder dBuilder = null;
    Document doc = null;
    try {
        dBuilder = dbFactory.newDocumentBuilder();
        doc = dBuilder.parse(data);
    } catch (SAXException e) {
        e.printStackTrace();
    } catch (IOException e) {
        e.printStackTrace();
    } catch (ParserConfigurationException e) {
        e.printStackTrace();
    }
    doc.getDocumentElement().normalize();
    NodeList nodeList = doc.getElementsByTagName("Data");
    int rowIndex = 0;
    int columnIndex = 0;
    for (int i = 0; i < nodeList.getLength(); ++i) {
        if (i > 7 && !((i - 14) % 7 == 0)) {
            Node node = nodeList.item(i);
            String line = node.getTextContent().replaceAll("\\t+", " ");
            line = line.replace("\n", " ");
            if (columnIndex >= 6) {
                columnIndex = 0;
                ++rowIndex;
            }
            table[rowIndex][columnIndex++] = line;
        }
    }
    return table;
}
Example XML file
Answer 0 (score: 0)
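There is no quick and easy answer to this, but suffice it to say that you should treat the input as a stream. I have updated the "check for empty lines" method so that it advances the stream until it reaches the first '<' character, then resets the stream to just before that character and hands processing over to the parser.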
// It appeared on the knt/151 file: empty lines at the beginning of the file caused the fatal error
private void checkForEmptyLines(BufferedInputStream fs) throws IOException {
    // Set a mark; up to 1024 bytes may be read before this mark becomes invalid
    fs.mark(1024);
    int ch;
    while (-1 != (ch = fs.read())) {
        if ('<' == ch) {
            // Step back to the last mark, which sits just before '<'
            fs.reset();
            break;
        } else {
            // Not the start of the document yet: move the mark past this byte
            fs.mark(1024);
        }
    }
}
public String[][] parse(String path) {
    String[][] table = new String[8][6];
    File data = new File(path);
    DocumentBuilderFactory dbFactory = DocumentBuilderFactory.newInstance();
    Document doc = null;
    // Open the stream inside try-with-resources so that it is closed automatically
    // and the IOExceptions from opening and skipping are handled here
    try (BufferedInputStream bufferedDataStream =
            new BufferedInputStream(new FileInputStream(data), 1024)) {
        checkForEmptyLines(bufferedDataStream);
        DocumentBuilder dBuilder = dbFactory.newDocumentBuilder();
        doc = dBuilder.parse(bufferedDataStream);
    } catch (SAXException e) {
        e.printStackTrace();
    } catch (IOException e) {
        e.printStackTrace();
    } catch (ParserConfigurationException e) {
        e.printStackTrace();
    }
    if (doc == null) {
        // Parsing failed; return the empty table instead of throwing a NullPointerException
        return table;
    }
    doc.getDocumentElement().normalize();
    NodeList nodeList = doc.getElementsByTagName("Data");
    int rowIndex = 0;
    int columnIndex = 0;
    for (int i = 0; i < nodeList.getLength(); ++i) {
        if (i > 7 && !((i - 14) % 7 == 0)) {
            Node node = nodeList.item(i);
            String line = node.getTextContent().replaceAll("\\t+", " ");
            line = line.replace("\n", " ");
            if (columnIndex >= 6) {
                columnIndex = 0;
                ++rowIndex;
            }
            table[rowIndex][columnIndex++] = line;
        }
    }
    return table;
}
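The stream approach above can be tried without a file on disk. What follows is a minimal sketch, assuming the unwanted prefix contains no '<' and the file uses an ASCII-compatible encoding such as UTF-8; the class name `LeadingBlankSkipper` and both method names are hypothetical, not part of the original code:

```java
import java.io.BufferedInputStream;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.xml.sax.SAXException;

public class LeadingBlankSkipper {

    // Advances the stream so that the next byte read is the first '<'.
    static void skipToFirstTag(BufferedInputStream in) throws IOException {
        int ch;
        do {
            in.mark(1);      // remember the position before the next byte
            ch = in.read();
        } while (ch != -1 && ch != '<');
        if (ch == '<') {
            in.reset();      // step back so the parser still sees '<'
        }
    }

    // Skips everything before the first '<' and parses the rest as XML.
    static Document parseSkippingBlanks(InputStream raw)
            throws IOException, SAXException, ParserConfigurationException {
        try (BufferedInputStream in = new BufferedInputStream(raw)) {
            skipToFirstTag(in);
            return DocumentBuilderFactory.newInstance().newDocumentBuilder().parse(in);
        }
    }

    public static void main(String[] args) throws Exception {
        // Two blank lines in front of the XML declaration (made-up sample data)
        String xml = "\n\n<?xml version=\"1.0\" encoding=\"UTF-8\"?><Row><Data>Math</Data></Row>";
        Document doc = parseSkippingBlanks(
                new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
        System.out.println(doc.getDocumentElement().getNodeName());
    }
}
```

Nothing is written back to disk here: the blank lines are skipped only in the stream handed to the parser, so the original file stays untouched.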
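Answer 1 (score: 0)
My colleague added this code, and it seems to work. It not only checks for an empty string at the beginning, but also removes it and writes the correct data to a new file.
This solution seems slow, so please let me know if any improvements can be made.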
private static File skipFirstLine(File inputFile) {
    File outputFile = new File("skipped_" + inputFile.getName());
    try (BufferedReader reader = new BufferedReader(new FileReader(inputFile));
         BufferedWriter writer = new BufferedWriter(new FileWriter(outputFile))) {
        String line;
        int count = 0;
        while ((line = reader.readLine()) != null) {
            if (count == 0 && line.equals("")) {
                ++count;
                continue;
            }
            writer.write(line);
            writer.write("\n");
            ++count;
        }
    } catch (FileNotFoundException e) {
        e.printStackTrace();
    } catch (IOException e) {
        e.printStackTrace();
    }
    return outputFile;
}
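On the question of speed, one possible improvement is to copy raw bytes instead of reading and rewriting every line. This should also leave the file's encoding and line endings untouched, since nothing is decoded into strings. A minimal sketch, assuming only leading whitespace has to be dropped; `LeadingBlankStripper` and `copyWithoutLeadingBlanks` are hypothetical names, and whether it is actually faster on these files has not been measured:

```java
import java.io.BufferedInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

public class LeadingBlankStripper {

    // Copies input to output, dropping the whitespace bytes that precede
    // the first non-whitespace byte. Everything after that is copied as-is.
    static void copyWithoutLeadingBlanks(Path input, Path output) throws IOException {
        try (InputStream in = new BufferedInputStream(Files.newInputStream(input))) {
            int ch;
            do {
                in.mark(1);      // remember the position before the next byte
                ch = in.read();
            } while (ch == ' ' || ch == '\t' || ch == '\r' || ch == '\n');
            if (ch != -1) {
                in.reset();      // put back the first non-whitespace byte
            }
            Files.copy(in, output, StandardCopyOption.REPLACE_EXISTING);
        }
    }

    public static void main(String[] args) throws IOException {
        // Temporary files stand in for the real timetable file (made-up sample data)
        Path input = Files.createTempFile("lesson", ".xml");
        Path output = Files.createTempFile("skipped_lesson", ".xml");
        Files.write(input, "\n<?xml version=\"1.0\"?><a/>".getBytes(StandardCharsets.UTF_8));
        copyWithoutLeadingBlanks(input, output);
        System.out.println(new String(Files.readAllBytes(output), StandardCharsets.UTF_8));
    }
}
```

If the cleaned copy is only needed for parsing, it may be simpler to skip the blanks in the stream and pass that stream to the parser directly, which avoids the second file altogether.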