Question

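Hi,

I'm having trouble with the logic of reading a file line by line. I know you can do this with a BufferedReader, but sometimes a "value" is written across several lines, and all of those lines matter.

Example of the file being read: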

   <#FIELD NAME = DESC> Some text that goes

        over multiple lines

        which is needed</#FIELD>

    <#FIELD NAME = TEMP> some values are just a single line</#FIELD>

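I need to parse the field name, such as TEMP or DESC above, and then extract the value between the tags: <#FIELD NAME = DESC>important values</#FIELD>. But I'm not sure how to "recognize" whether an entry has a multi-line value or a single-line value, and then save it to a variable while using a BufferedReader.

I'd really appreciate any hints or examples that point me in the right direction!

Reading it line by line hasn't gotten me any further... I won't post the whole code, because I think there is a simpler way to read this, and you can tell what I did from this small snippet.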

if (line.contains("<#FIELD NAME = AUTOR>"))
{
    String autor = line.substring(line.indexOf(">") + 1, line.indexOf("</#"));
    metaData.setAutor(autor.trim());
}
else if (line.contains("<#FIELD NAME = DOKUMENTNR>"))
{
    String dokumentnr = line.substring(line.indexOf(">") + 1, line.indexOf("</#"));
    metaData.setDoukumentnr(dokumentnr.trim());
    ...

Answer 1

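while ((line = reader.readLine()) != null) {
    // isDescOrTemp stands for your own check of the opening tag
    if (isDescOrTemp(line)) {
        if (line.endsWith("</#FIELD>")) {
            // one-line field
        } else {
            // multi-line field: keep reading until the closing tag or end of file
            while (line != null && !line.endsWith("</#FIELD>")) {
                line = reader.readLine();
                // store line somewhere
            }
        }
    }
}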

Answer 2

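As I understand it, if you don't have hierarchical data (like a tree), then what you have is a list, so you're looking for a way to split it. Normally you should write a clean parser, but if that isn't an option, you can try to hack your way through.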

String s = "<#FIELD NAME = DESC> Some text that goes\nover multiple lines\nwhich is needed</#FIELD>\n<#FIELD NAME = TEMP> some values are just a single line</#FIELD>";
String[] fs = s.split("<#FIELD NAME = ");
for (String f : fs) {
    System.out.println(f);
}

which produces

DESC> Some text that goes
over multiple lines
which is needed</#FIELD>

TEMP> some values are just a single line</#FIELD>

After that, you need to clean up the resulting strings by removing the </#FIELD> at the end and reading the key at the beginning.
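The cleanup step could look roughly like the sketch below. It assumes the whole file content is already in one string and that every field is written exactly as `<#FIELD NAME = X> value</#FIELD>`; the class name `SplitFieldParser` and its `parse` method are made up for this illustration.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Pattern;

// Hypothetical helper, not part of the original answer.
class SplitFieldParser {
    private static final String OPEN = "<#FIELD NAME = ";
    private static final String CLOSE = "</#FIELD>";

    /** Splits on the opening marker and returns name -> value pairs in file order. */
    static Map<String, String> parse(String content) {
        Map<String, String> fields = new LinkedHashMap<>();
        for (String chunk : content.split(Pattern.quote(OPEN))) {
            int nameEnd = chunk.indexOf('>');     // end of the key, e.g. "DESC"
            int valueEnd = chunk.indexOf(CLOSE);  // start of the closing tag
            if (nameEnd < 0 || valueEnd < nameEnd) {
                continue; // the empty chunk before the first field, or a malformed one
            }
            String name = chunk.substring(0, nameEnd).trim();
            String value = chunk.substring(nameEnd + 1, valueEnd).trim();
            fields.put(name, value);
        }
        return fields;
    }
}
```

With the string `s` from above, `SplitFieldParser.parse(s).get("TEMP")` would give the single-line value, and `get("DESC")` the three-line value with its line breaks intact.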

Answer 3

Try something like:

public String readField(BufferedReader reader) throws IOException
{
    String line = reader.readLine();
    while (line != null && line.indexOf("</#FIELD>") == -1)
    {
        String next = reader.readLine();
        if (next == null) break; // end of file before the closing tag
        line += next; // This does not preserve line breaks
    }

    return line; // null once the end of the file is reached
}

and in your original code, something like:

String line = readField(myReader); // This reads up to the next field

if(line.contains("<#FIELD NAME = AUTOR>")){
   String autor = line.substring(line.indexOf(">")+1,line.indexOf("</#"));
   metaData.setAutor(autor.trim());
} else if(line.contains("<#FIELD NAME = DOKUMENTNR>")) {
   String dokumentnr = line.substring(line.indexOf(">")+1,line.indexOf("</#"));
   metaData.setDoukumentnr(dokumentnr.trim());
}

Answer 4

You can follow the pseudocode below:

While(!EOF)
{
    string line = readLine();
    while( !EOF && ! line.contains("</#FIELD>"))
    {
        line += readLine();
    }
    // Here you get a line with matching `begin` and `end`

    // ... do operations as needed

    // reset line
    line = "";
}
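A possible Java rendering of that pseudocode is sketched here. It is only one way to do it: the class name `FieldAccumulator` and the method `readFields` are invented for the example. Unlike the pseudocode, it keeps the line breaks inside a multi-line value and skips blank lines between fields.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of the original answer.
class FieldAccumulator {
    /** Returns one string per complete <#FIELD ...>...</#FIELD> element. */
    static List<String> readFields(BufferedReader reader) throws IOException {
        List<String> fields = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        String line;
        while ((line = reader.readLine()) != null) {
            if (current.length() == 0 && line.trim().isEmpty()) {
                continue; // blank line between two fields
            }
            if (current.length() > 0) {
                current.append('\n'); // keep the line break inside the value
            }
            current.append(line);
            if (line.contains("</#FIELD>")) {
                fields.add(current.toString()); // begin and end now match
                current.setLength(0);           // reset for the next field
            }
        }
        return fields; // an unterminated trailing field is dropped
    }
}
```

Each returned string can then go through the same `contains` / `substring` checks as in the question, since it always holds a matching opening and closing tag.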

Answer 5

My version:

    // reluctant ".+?" so each match stops at the first closing tag instead of the last one
    Pattern p = Pattern.compile("<#FIELD.+?</#FIELD>", Pattern.DOTALL);
    Scanner s = new Scanner(new File("test.txt"));
    for(;;) {
        String field = s.findWithinHorizon(p, 0);
        if (field == null) {
            break;
        }
        // here you got a full #FIELD element, parse it
        System.out.println(field);
    }
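To parse each element, capture groups can pull out the name and the value in one step. The sketch below assumes field names consist only of word characters (letters, digits, underscore) and that the content is available as a string; `RegexFieldParser` is a made-up name for this illustration.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of the original answer.
class RegexFieldParser {
    // group(1) = field name, group(2) = value; DOTALL lets the value span lines
    private static final Pattern FIELD = Pattern.compile(
            "<#FIELD\\s+NAME\\s*=\\s*(\\w+)\\s*>(.*?)</#FIELD>", Pattern.DOTALL);

    /** Returns name -> trimmed value pairs in the order they appear. */
    static Map<String, String> parse(String content) {
        Map<String, String> fields = new LinkedHashMap<>();
        Matcher m = FIELD.matcher(content);
        while (m.find()) {
            fields.put(m.group(1), m.group(2).trim());
        }
        return fields;
    }
}
```

The resulting map could then feed the setters from the question, for example `metaData.setAutor(fields.get("AUTOR"))`.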

Reading a file in Java across multiple lines
