Regular expression - identify the indefinite article "a" or "an" using Java

Asked: 2012-02-22 19:47:28

Tags: java regex article indefinite

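My assignment is to design a regular expression that recognizes the English indefinite article, that is, the word "a" or the word "an". I have to test the expression by writing a test driver that reads a file containing roughly ten lines of text. The program should count the occurrences of the words "a" and "an". I must not match the characters "a" or "an" when they appear inside another word.

Here is my code so far: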

import java.io.IOException;
import java.util.Arrays;
import java.util.regex.Pattern;
import java.util.regex.Matcher;

public class RegexeFindText {
   public static void main(String[] args) throws IOException {

      // Input for matching the regexe pattern
      String file_name = "Testing.txt";

      ReadFile file = new ReadFile(file_name);
      String[] aryLines = file.OpenFile();
      String asString = Arrays.toString(aryLines);

      // Regexe to be matched
      String regexe = ""; //<<--this is where the problem lies

      int i;
      for (i = 0; i < aryLines.length; i++) {
         System.out.println(aryLines[i]);
      }

      // Step 1: Allocate a Pattern object to compile a regexe
      Pattern pattern = Pattern.compile(regexe);
      //Pattern pattern = Pattern.compile(regexe, Pattern.CASE_INSENSITIVE); // case-insensitive matching

      // Step 2: Allocate a Matcher object from the compiled regexe pattern,
      //         and provide the input to the Matcher
      Matcher matcher = pattern.matcher(asString);

      // Step 3: Perform the matching and process the matching result

      // Use method find()
      while (matcher.find()) {     // find the next match
         System.out.println("find() found the pattern \"" + matcher.group()
               + "\" starting at index " + matcher.start()
               + " and ending at index " + matcher.end());
      }

      // Use method matches()
      if (matcher.matches()) {
         System.out.println("matches() found the pattern \"" + matcher.group()
               + "\" starting at index " + matcher.start()
               + " and ending at index " + matcher.end());
      } else {
         System.out.println("matches() found nothing");
      }

      // Use method lookingAt()
      if (matcher.lookingAt()) {
         System.out.println("lookingAt() found the pattern \"" + matcher.group()
               + "\" starting at index " + matcher.start()
               + " and ending at index " + matcher.end());
      } else {
         System.out.println("lookingAt() found nothing");
      }
   }
}

My question is: what do I need in order to find these words in the text? Any help is much appreciated, thanks!

1 answer:

Answer 0: (score: 3)

Here is a regular expression that matches "a" or "an":

String regex = "\\ban?\\b";

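Let's break the regex down:

  • \b is a word boundary (a single backslash is written as "\\" in a Java string literal)
  • a is just the literal "a"
  • n? means zero or one literal "n"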
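
To tie this back to the counting requirement in the question, here is a minimal sketch of how the pattern could be used with find(). The class name ArticleCounter, the method countArticles, and the sample sentence are made up for illustration, and the CASE_INSENSITIVE flag is an assumption so that a sentence-initial "An" is counted too.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper class, not part of the asker's code.
public class ArticleCounter {
    // Word boundary, literal "a", optional "n", word boundary.
    // CASE_INSENSITIVE is an assumption so that "A" and "An" also match.
    private static final Pattern ARTICLE =
            Pattern.compile("\\ban?\\b", Pattern.CASE_INSENSITIVE);

    // Counts how often "a" or "an" occurs as a whole word in the text.
    public static int countArticles(String text) {
        Matcher matcher = ARTICLE.matcher(text);
        int count = 0;
        while (matcher.find()) { // each successful find() is one occurrence
            count++;
        }
        return count;
    }

    public static void main(String[] args) {
        // Made-up sample; "apple", "away", "than", "banana" and "can" must not match.
        String sample = "An apple a day keeps a doctor away, more than a banana can.";
        System.out.println(countArticles(sample)); // prints 4
    }
}
```

In the asker's program the same loop could run over asString, or over each element of aryLines with the counts summed. Note that \b also treats hyphens and apostrophes as boundaries, so the "a" in "a-frame" would be counted as well.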