Question

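I receive a file from a server that contains my university timetable, and I am trying to extract data from it. In some files (for some departments) there is a blank line at the top, as the very first line of the file, so I get: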

[Fatal Error] lesson:2:6: The processing instruction target matching "[xX][mM][lL]" is not allowed.

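How can I check for the empty line and remove it from the same file in Java? I can't do anything line by line with strings, because XML files often don't have \n at the end of lines.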

UPD

// Seen in the knt/151 file: empty lines at the beginning of the file cause the fatal error
private void checkForEmptyLines(File f) {
    try {
        RandomAccessFile raf = new RandomAccessFile(f,"rw");
        while (raf.getFilePointer()!=raf.length()){
           //What should be here?
           Byte b = raf.readByte();
           if (b!=10)
               raf.write(b);
        }
    } catch (FileNotFoundException e) {
        e.printStackTrace();
    } catch (IOException e) {
        e.printStackTrace();
    }


}

UPD: XML file processing:

public String[][] parse(String path)  {
    String[][] table = new String[8][6];

    File data = new File(path);
   // checkForEmptyLines(data);

    DocumentBuilderFactory dbFactory = DocumentBuilderFactory.newInstance();
    DocumentBuilder dBuilder  = null;
    Document doc = null;

    try {
        dBuilder = dbFactory.newDocumentBuilder();
        doc = dBuilder.parse(data);
    } catch (SAXException e) {
        e.printStackTrace();
    } catch (IOException e) {
        e.printStackTrace();
    } catch (ParserConfigurationException e) {
        e.printStackTrace();
    }

    doc.getDocumentElement().normalize();
    NodeList nodeList = doc.getElementsByTagName("Data");

    int rowIndex = 0;
    int columnIndex = 0;

    for (int i = 0; i < nodeList.getLength(); ++i) {
        if (i > 7 && !((i - 14) % 7 == 0)) { 
            Node node = nodeList.item(i);
            String line = node.getTextContent().replaceAll("\\t+", " "); 
            line = line.replace("\n", " ");

            if (columnIndex >= 6) {
                columnIndex = 0;
                ++rowIndex;
            }

            table[rowIndex][columnIndex++] = line;
        }
    }

    return table;
}

Example XML file

Answer 1

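There is no quick and easy answer to this, but suffice it to say that you should treat the input as a stream. I have updated the checkForEmptyLines method so that it advances the stream until it reaches the first '<' character, then resets the stream to just before that character and hands it over to the parser.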

// Skips everything before the first '<' (e.g. the empty lines seen in the knt/151 file)
private void checkForEmptyLines(BufferedInputStream fs) throws IOException {
    // Set a mark; it stays valid while at most 1024 bytes are read past it
    fs.mark(1024);
    int ch;
    while (-1 != (ch = fs.read())) {
        if ('<' == ch) {
            // Return to the last mark, which sits just before this '<'
            fs.reset();
            break;
        } else {
            // Not '<' yet: move the mark past the byte just consumed
            fs.mark(1024);
        }
    }
}

public String[][] parse(String path) throws IOException {
    String[][] table = new String[8][6];

    File data = new File(path);
    FileInputStream dataStream = new FileInputStream(data);
    BufferedInputStream bufferedDataStream = new BufferedInputStream(dataStream, 1024);
    // Skip everything before the first '<' so the parser sees the XML declaration first
    checkForEmptyLines(bufferedDataStream);

    DocumentBuilderFactory dbFactory = DocumentBuilderFactory.newInstance();
    DocumentBuilder dBuilder  = null;
    Document doc = null;

    try {
        dBuilder = dbFactory.newDocumentBuilder();
        doc = dBuilder.parse(bufferedDataStream);
    } catch (SAXException e) {
        e.printStackTrace();
    } catch (IOException e) {
        e.printStackTrace();
    } catch (ParserConfigurationException e) {
        e.printStackTrace();
    }

    doc.getDocumentElement().normalize();
    NodeList nodeList = doc.getElementsByTagName("Data");

    int rowIndex = 0;
    int columnIndex = 0;

    for (int i = 0; i < nodeList.getLength(); ++i) {
        if (i > 7 && !((i - 14) % 7 == 0)) { 
            Node node = nodeList.item(i);
            String line = node.getTextContent().replaceAll("\\t+", " "); 
            line = line.replace("\n", " ");

            if (columnIndex >= 6) {
                columnIndex = 0;
                ++rowIndex;
            }

            table[rowIndex][columnIndex++] = line;
        }
    }

    return table;
}
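
To try the mark/reset idea in isolation, here is a minimal self-contained sketch that parses XML from an in-memory byte array instead of a file. The class name SkipToFirstTagDemo, the helper names, and the sample <lesson> document are made up for illustration and are not taken from the question; it also assumes that nothing meaningful precedes the first '<' in the input.

```java
import java.io.BufferedInputStream;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;

class SkipToFirstTagDemo {

    // Advances the stream so that the next byte read is the first '<'.
    static void skipToFirstTag(BufferedInputStream in) throws IOException {
        in.mark(1);
        int ch;
        while ((ch = in.read()) != -1) {
            if (ch == '<') {
                in.reset();   // step back so the parser still sees '<'
                return;
            }
            in.mark(1);       // remember the position after the skipped byte
        }
    }

    // Hypothetical helper: wraps any stream, skips the leading junk, then parses.
    static Document parseSkippingLeadingJunk(InputStream raw) throws Exception {
        BufferedInputStream in = new BufferedInputStream(raw);
        skipToFirstTag(in);
        return DocumentBuilderFactory.newInstance().newDocumentBuilder().parse(in);
    }

    public static void main(String[] args) throws Exception {
        // Sample input (assumption): a blank line in front of the XML declaration
        String xml = "\n<?xml version=\"1.0\" encoding=\"UTF-8\"?>\n"
                   + "<lesson><Data>Math</Data></lesson>";
        Document doc = parseSkippingLeadingJunk(
                new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
        System.out.println(doc.getDocumentElement().getTagName()); // prints "lesson"
    }
}
```

Because only the stream position changes, the file on disk is left untouched; the blank line is simply never shown to the parser.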

Answer 2

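My colleague added this code and it seems to work. It not only checks for an empty string at the beginning, but also removes it and writes the correct data to a new file.

This solution seems slow; if any improvements can be made, please let me know.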

private static File skipFirstLine(File inputFile) {
    File outputFile = new File("skipped_" + inputFile.getName());

    try (BufferedReader reader = new BufferedReader(new FileReader(inputFile));
         BufferedWriter writer = new BufferedWriter(new FileWriter(outputFile))) {

        String line;
        int count = 0;
        while ((line = reader.readLine()) != null) {
            if (count == 0 && line.equals("")) {
                ++count;
                continue;
            }

            writer.write(line);
            writer.write("\n");
            ++count;
        }
    } catch (FileNotFoundException e) {
        e.printStackTrace();
    } catch (IOException e) {
        e.printStackTrace();
    }

    return outputFile;
}
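
One possible improvement, sketched under a few assumptions: if the timetable files are small enough to fit in memory, the leading whitespace can be stripped at the byte level and the same file rewritten only when something actually needs to be removed. This avoids processing the file line by line and does not change the line endings of the remaining content. The class and method names below are hypothetical, and the sketch assumes the file does not start with a byte order mark.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Arrays;

class LeadingBlankStripper {

    // Returns the index of the first byte that is not ASCII whitespace.
    static int firstNonWhitespace(byte[] bytes) {
        int i = 0;
        while (i < bytes.length
                && (bytes[i] == ' ' || bytes[i] == '\t'
                    || bytes[i] == '\r' || bytes[i] == '\n')) {
            i++;
        }
        return i;
    }

    // Rewrites the file in place without its leading whitespace.
    // Returns true if the file was modified, false if it was already clean.
    static boolean stripLeadingWhitespace(Path file) throws IOException {
        byte[] bytes = Files.readAllBytes(file);   // assumption: file fits in memory
        int start = firstNonWhitespace(bytes);
        if (start == 0) {
            return false;                          // nothing to remove, skip the write
        }
        Files.write(file, Arrays.copyOfRange(bytes, start, bytes.length));
        return true;
    }
}
```

A call such as LeadingBlankStripper.stripLeadingWhitespace(inputFile.toPath()) before parsing would then replace skipFirstLine, and the parser could keep reading the original File.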

Remove the first line of an XML file in Java if it is empty

2 answers: