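My task is to design a regular expression that recognizes the English indefinite article, that is, the word "a" or the word "an". I must test the expression by writing a test driver that reads a file containing about ten lines of text. The program should count the occurrences of the words "a" and "an". It must not match the characters "a" or "an" when they appear inside another word.
Here is my code so far: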
import java.io.IOException;
import java.util.Arrays;
import java.util.regex.Pattern;
import java.util.regex.Matcher;

public class RegexeFindText {
    public static void main(String[] args) throws IOException {
        // Input for matching the regexe pattern
        String file_name = "Testing.txt";
        ReadFile file = new ReadFile(file_name);
        String[] aryLines = file.OpenFile();
        String asString = Arrays.toString(aryLines);

        // Regexe to be matched
        String regexe = ""; //<<--this is where the problem lies

        int i;
        for ( i=0; i < aryLines.length; i++ ) {
            System.out.println( aryLines[ i ] ) ;
        }

        // Step 1: Allocate a Pattern object to compile a regexe
        Pattern pattern = Pattern.compile(regexe);
        //Pattern pattern = Pattern.compile(regexe, Pattern.CASE_INSENSITIVE);
        // case-insensitive matching

        // Step 2: Allocate a Matcher object from the compiled regexe pattern,
        // and provide the input to the Matcher
        Matcher matcher = pattern.matcher(asString);

        // Step 3: Perform the matching and process the matching result
        // Use method find()
        while (matcher.find()) { // find the next match
            System.out.println("find() found the pattern \"" + matcher.group()
                    + "\" starting at index " + matcher.start()
                    + " and ending at index " + matcher.end());
        }

        // Use method matches()
        if (matcher.matches()) {
            System.out.println("matches() found the pattern \"" + matcher.group()
                    + "\" starting at index " + matcher.start()
                    + " and ending at index " + matcher.end());
        } else {
            System.out.println("matches() found nothing");
        }

        // Use method lookingAt()
        if (matcher.lookingAt()) {
            System.out.println("lookingAt() found the pattern \"" + matcher.group()
                    + "\" starting at index " + matcher.start()
                    + " and ending at index " + matcher.end());
        } else {
            System.out.println("lookingAt() found nothing");
        }
    }
}
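My question is: what pattern do I need in order to find these words in the text? Any help is greatly appreciated, thanks!
Answer 0 (score: 3)
Here is a regular expression that matches "a" or "an":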
String regex = "\\ban?\\b";
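Let's break down the regex:
\b matches a word boundary (in a Java string literal, a single backslash is written as "\\")
a is just the literal "a"
n? matches zero or one literal "n"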
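To show how the pattern could be used for the counting part of the task, here is a minimal sketch. The class name ArticleCounter, the method countArticles, and the sample sentence are made up for illustration and are not part of the question's code; reading the file is left out, and the Pattern.CASE_INSENSITIVE flag is an assumption so that "A" and "An" at the start of a sentence are counted too.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class ArticleCounter {
    // Word boundary, literal "a", optional "n", word boundary.
    // CASE_INSENSITIVE is an assumption: it also counts "A" and "An".
    private static final Pattern ARTICLE =
            Pattern.compile("\\ban?\\b", Pattern.CASE_INSENSITIVE);

    // Counts how many times the whole word "a" or "an" occurs in the text.
    public static int countArticles(String text) {
        Matcher matcher = ARTICLE.matcher(text);
        int count = 0;
        while (matcher.find()) { // advance to the next non-overlapping match
            count++;
        }
        return count;
    }

    public static void main(String[] args) {
        // Hypothetical sample line standing in for the contents of the file.
        String sample = "An apple and a banana are in a can, not an anvil.";
        System.out.println(countArticles(sample)); // prints 4
    }
}
```

Words such as "apple", "and", "banana", "can" and "anvil" are skipped because the letters on either side of the "a" or "an" prevent a word boundary there. One caveat: \b treats hyphens and apostrophes as non-word characters, so the "a" in something like "a-b" would still be counted.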