Question

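I wrote a program that uses a BufferedReader to read a file and store its contents in a String variable. How can I modify it so that it skips single-line and multi-line comments?

Here is my code: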

import java.util.*;
import java.io.*;

public class IfCounter 
{
    public static void main(String[] args) throws IOException
    {
        // parameter the TA will pass in
        String fileName = args[0];

        // variable to keep track of number of if's
        int ifCount = 0;

        // create a new BufferedReader
        BufferedReader reader = new BufferedReader( new FileReader (fileName));
        String line  = null;
        StringBuilder stringBuilder = new StringBuilder();
        String ls = System.getProperty("line.separator");

        // read from the text file
        while (( line = reader.readLine()) != null) 
        {
            stringBuilder.append(line);
            stringBuilder.append(ls);
        }

        // release the file handle once everything has been read
        reader.close();

        // create a new string with stringBuilder data
        String tempString = stringBuilder.toString();

        // create one last string to look for our valid if(s) in
        // with ALL whitespace removed
        String compareString = tempString.replaceAll("\\s","");

        // check for valid if(s): an "if(" immediately after ';', '}' or '{'
        for (int i = 0; i < compareString.length(); i++)
        {
            char c = compareString.charAt(i);

            // added opening "{" for nested ifs :)
            if (c == ';' || c == '}' || c == '{')
            {
                // startsWith(prefix, offset) returns false instead of throwing
                // when the offset runs past the end of the string
                if (compareString.startsWith("if(", i + 1))
                    ifCount++;
            } // end if

        } // end for

        // print the number of valid "if(s) with a new line after"
        System.out.println(ifCount + "\n");

    } // end main
} // end class

Answer 1

Change this:

    while (( line = reader.readLine()) != null) {
      stringBuilder.append(line);
      stringBuilder.append(ls);
    }

to this:

    boolean multiLineComment = false;
    while (( line = reader.readLine()) != null) {
      // a line that opens a block comment switches comment mode on
      if (isLineAMultiLineCommentStart(line)) {
        multiLineComment = true;
      }

      if (multiLineComment) {
        // the closing line switches comment mode off again
        if (isLineAMultiLineCommentEnd(line)) {
          multiLineComment = false;
        }
        // every line of the block comment, including the closing one, is skipped
        continue;
      }

      if (!isLineAComment(line)) {
        stringBuilder.append(line);
        stringBuilder.append(ls);
      }
    }

You will need to write three boolean methods, isLineAComment(String line), isLineAMultiLineCommentStart and isLineAMultiLineCommentEnd, but that should be easy.
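As a rough sketch only, the three helpers could look like the following. The method bodies, the LineCommentFilter class name and the stripCommentLines wrapper are my own assumptions, not part of the original program. The approach works line by line, so it only recognises comments that begin at the start of a line (after optional whitespace); it will not catch a comment that follows code on the same line, and it can be fooled by comment markers inside string literals.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;

// Hypothetical class name; only the three helper names come from the answer above.
class LineCommentFilter {

    // Assumption: a "comment line" is one whose first non-blank characters are "//".
    static boolean isLineAComment(String line) {
        return line.trim().startsWith("//");
    }

    // Assumption: a block comment starts on a line whose first non-blank characters are "/*".
    static boolean isLineAMultiLineCommentStart(String line) {
        return line.trim().startsWith("/*");
    }

    // Assumption: a block comment ends on any line that contains "*/".
    static boolean isLineAMultiLineCommentEnd(String line) {
        return line.contains("*/");
    }

    // Reads the text line by line and keeps only the lines that are not comment lines.
    static String stripCommentLines(String source) {
        StringBuilder out = new StringBuilder();
        boolean multiLineComment = false;
        try (BufferedReader reader = new BufferedReader(new StringReader(source))) {
            String line;
            while ((line = reader.readLine()) != null) {
                if (!multiLineComment && isLineAMultiLineCommentStart(line)) {
                    multiLineComment = true;
                }
                if (multiLineComment) {
                    if (isLineAMultiLineCommentEnd(line)) {
                        multiLineComment = false;
                    }
                    continue; // skip every line that belongs to the block comment
                }
                if (!isLineAComment(line)) {
                    out.append(line).append('\n');
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return out.toString();
    }

    public static void main(String[] args) {
        String src = "int a;\n// if (x)\n/* if (y)\n   if (z) */\nif (a > 0) {}\n";
        System.out.print(stripCommentLines(src));
    }
}
```

With a real file you would wrap a FileReader instead of the StringReader; the loop itself stays the same.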

Answer 2

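Your question doesn't say what the input language is, and without that a complete answer isn't possible. (For example, if the input language were Fortran IV, you would only need to look for a 'C' in column 1. Would that answer satisfy you?)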

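The general answer is that accurate comment stripping usually requires you to implement (at least) a partial lexical analyser for the input language. For example, accurate comment stripping in Java needs to deal with:

// comments that start in the middle of a line
/* ... */ comments that span multiple lines
comment / or * characters expressed as Unicode escapes
// or /* or */ embedded in string literals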

There is a lot to handle there ...
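To make that concrete, here is a minimal sketch of such a partial lexer, assuming the input is Java source held in a String. The CommentStripper class and its strip method are hypothetical names of mine. It handles // comments anywhere on a line, /* ... */ comments across lines, and comment markers inside string and character literals. It deliberately does not handle Unicode escapes or Java text blocks, so treat it as an illustration rather than a complete solution.

```java
// Hypothetical class: a character-by-character scanner that drops Java comments.
class CommentStripper {

    static String strip(String src) {
        StringBuilder out = new StringBuilder();
        int i = 0;
        int n = src.length();
        while (i < n) {
            char c = src.charAt(i);
            if (c == '"' || c == '\'') {
                // String or char literal: copy it verbatim so that any
                // "//" or "/*" inside it is not mistaken for a comment.
                char quote = c;
                out.append(c);
                i++;
                while (i < n) {
                    char d = src.charAt(i);
                    out.append(d);
                    i++;
                    if (d == '\\' && i < n) {
                        // copy the escaped character without interpreting it
                        out.append(src.charAt(i));
                        i++;
                    } else if (d == quote || d == '\n') {
                        break; // literal closed (or unterminated at end of line)
                    }
                }
            } else if (c == '/' && i + 1 < n && src.charAt(i + 1) == '/') {
                // Line comment: skip up to, but not including, the newline.
                while (i < n && src.charAt(i) != '\n') {
                    i++;
                }
            } else if (c == '/' && i + 1 < n && src.charAt(i + 1) == '*') {
                // Block comment: skip to the closing marker, or to the end of
                // the input if it is never closed.
                int end = src.indexOf("*/", i + 2);
                i = (end < 0) ? n : end + 2;
                out.append(' '); // keep the tokens on either side separated
            } else {
                out.append(c);
                i++;
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        String src = "String s = \"// not a comment\"; // real comment\n"
                   + "int x = 1; /* gone */ int y;\n";
        System.out.print(strip(src));
    }
}
```

The output of strip could then be fed to the existing whitespace-removal and counting code in place of the raw file contents.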

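If you are actually trying to analyse Java source code, a better idea would be to use an existing Java parser / AST analysis framework. For example, PMD has a nice framework for doing this kind of thing ... and I'm sure there are other alternatives.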
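As one illustration of the parser route, the sketch below counts if statements using the compiler tree API that ships with the JDK (the com.sun.source packages in the jdk.compiler module) rather than PMD, simply because it needs no extra dependency. It assumes a full JDK is available at run time. The AstIfCounter class, the countIfs method and the string:///Sample.java URI are hypothetical names of mine. Because the parser discards comments and treats string literals as single tokens, no comment stripping is needed at all.

```java
import com.sun.source.tree.CompilationUnitTree;
import com.sun.source.tree.IfTree;
import com.sun.source.util.JavacTask;
import com.sun.source.util.TreeScanner;

import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.URI;
import java.util.List;
import javax.tools.JavaCompiler;
import javax.tools.JavaFileObject;
import javax.tools.SimpleJavaFileObject;
import javax.tools.ToolProvider;

// Hypothetical class: counts if statements by walking the parsed syntax tree.
class AstIfCounter {

    static int countIfs(String source) {
        // Wrap the source text in an in-memory file object; the URI is arbitrary.
        JavaFileObject file = new SimpleJavaFileObject(
                URI.create("string:///Sample.java"), JavaFileObject.Kind.SOURCE) {
            @Override
            public CharSequence getCharContent(boolean ignoreEncodingErrors) {
                return source;
            }
        };

        // Returns null on a runtime without the compiler, hence the JDK assumption.
        JavaCompiler compiler = ToolProvider.getSystemJavaCompiler();
        JavacTask task = (JavacTask) compiler.getTask(
                null, null, null, null, null, List.of(file));

        int[] count = {0};
        try {
            // parse() only builds syntax trees; nothing is type-checked or compiled.
            for (CompilationUnitTree unit : task.parse()) {
                new TreeScanner<Void, Void>() {
                    @Override
                    public Void visitIf(IfTree node, Void unused) {
                        count[0]++;
                        // keep descending so nested and else-if statements are counted too
                        return super.visitIf(node, unused);
                    }
                }.scan(unit, null);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return count[0];
    }

    public static void main(String[] args) {
        String src = "class Sample { void f(int a) {\n"
                   + "  // if (a) inside a comment\n"
                   + "  String s = \"if (c)\";\n"
                   + "  if (a > 0) { if (a > 1) {} } else if (a < 0) {}\n"
                   + "} }\n";
        System.out.println(countIfs(src));
    }
}
```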

How do I read a file with a BufferedReader but skip comments, using Java?
