I'm new to regex and I'm struggling to find a solution to my problem. I have a file containing multiple entries. Here is an example:
1) Hello my is blah blah blah. Blah blah Building 5677 - Door 98 blah blah blah.
2) Hi, my dog's name is blah. Building 36767 & Door 898900 blah blah.
3) Hey now, blah blah Building 345 DR 898. Blah Blah blag Building 333 - Door 89797 blah.
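I need to extract every instance of a building number and door number from each line. The only patterns that stay constant across entries are:
1) The word "Building" is always present.
2) "Building" is always followed by a set of integers ... the letter "D" or "d" ... and a second set of integers (which is followed by a non-integer).
All I want is to extract the building number and door number and print them to the console, but I can't work out how to turn this into a regex pattern. I am using Java.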
Answer 0 (score: 2)
I think this should work:
Building.+?(\d+).+?[Dd].+?(\d+)
The building number will be in group 1 and the door number in group 2.
Building //start by matching "Building"
.+? //then skip over the least number of characters that allows the match
(\d+) //then read as many digits as possible and put them in group one
.+? //then skip over the least number of characters that allows the match
[Dd] //then match an upper- or lower-case 'D'
.+? //then skip over the least number of characters that allows the match
(\d+) //then read as many digits as possible and put them in group two
So in Java:
import java.util.regex.Matcher;
import java.util.regex.Pattern;

Pattern pat = Pattern.compile("Building.+?(\\d+).+?[Dd].+?(\\d+)");
Matcher matcher =
    pat.matcher("Hello my is blah blah blah. Blah blah Building 5677 - Door 98 blah blah blah. ");
if (matcher.find()) {
    System.out.println(matcher.group(1)); // building number
    System.out.println(matcher.group(2)); // door number
}
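To extract multiple sets of numbers from a single input, as in the third example, you can use
while (matcher.find()) {
instead of the if, which only finds the first match.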
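The while-loop variant can be sketched as a small self-contained program. This is only a minimal sketch: the class name BuildingDoorExtractor and the helper extract are made up for illustration, and the input lines are the three examples from the question.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical class and method names, chosen only for this illustration.
class BuildingDoorExtractor {
    // Same pattern as above: group 1 is the building number, group 2 the door number.
    private static final Pattern PAT =
            Pattern.compile("Building.+?(\\d+).+?[Dd].+?(\\d+)");

    // Returns one "building/door" string for every match found in the line.
    static List<String> extract(String line) {
        List<String> pairs = new ArrayList<>();
        Matcher matcher = PAT.matcher(line);
        // find() resumes after the previous match, so the loop visits every pair.
        while (matcher.find()) {
            pairs.add(matcher.group(1) + "/" + matcher.group(2));
        }
        return pairs;
    }

    public static void main(String[] args) {
        String[] lines = {
            "Hello my is blah blah blah. Blah blah Building 5677 - Door 98 blah blah blah.",
            "Hi, my dog's name is blah. Building 36767 & Door 898900 blah blah.",
            "Hey now, blah blah Building 345 DR 898. Blah Blah blag Building 333 - Door 89797 blah."
        };
        for (String line : lines) {
            for (String pair : extract(line)) {
                System.out.println(pair);
            }
        }
    }
}
```

On the third line this finds two pairs, 345/898 and 333/89797. Note that the pattern needs at least one character between "Building" and the first digit, so an input such as "Building5677" would not match.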
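Answer 1 (score: 0)
Regex to find the building number (escaped as it would appear inside a Java string literal, hence the doubled backslash):
(?<=Building\\s)[0-9]+
The same for the door number:
(?<=Door\\s)[0-9]+
Putting it together: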
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class Main { // class name is arbitrary
    public static void main(String[] args) {
        String inputStr = "Hello my is blah blah blah. Blah blah Building 5677 - Door 98 blah blah blah";
        // The lookbehind keeps "Building " / "Door " out of the match, so group() is just the digits.
        Pattern patternBuilding = Pattern.compile("(?<=Building\\s)[0-9]+");
        Pattern patternDoor = Pattern.compile("(?<=Door\\s)[0-9]+");
        Matcher matcherBuilding = patternBuilding.matcher(inputStr);
        Matcher matcherDoor = patternDoor.matcher(inputStr);
        if (matcherBuilding.find())
            System.out.println("Building number is " + matcherBuilding.group());
        if (matcherDoor.find())
            System.out.println("Door number is " + matcherDoor.group());
    }
}
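To see how these two lookbehind patterns behave when a line contains more than one entry, here is a rough sketch that collects every match instead of only the first. The class name LookbehindExtractor and the helper findAll are invented for this illustration; the input is the third example line from the question.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical class and method names, chosen only for this illustration.
class LookbehindExtractor {
    // The two patterns from the answer above.
    static final Pattern BUILDING = Pattern.compile("(?<=Building\\s)[0-9]+");
    static final Pattern DOOR = Pattern.compile("(?<=Door\\s)[0-9]+");

    // Collects every match of the given pattern in the input, in order.
    static List<String> findAll(Pattern pattern, String input) {
        List<String> matches = new ArrayList<>();
        Matcher matcher = pattern.matcher(input);
        while (matcher.find()) {
            matches.add(matcher.group());
        }
        return matches;
    }

    public static void main(String[] args) {
        String line = "Hey now, blah blah Building 345 DR 898. Blah Blah blag Building 333 - Door 89797 blah.";
        System.out.println(findAll(BUILDING, line)); // prints [345, 333]
        System.out.println(findAll(DOOR, line));     // prints [89797]
    }
}
```

The door list is shorter than the building list here because the lookbehind only accepts the literal word "Door" followed by whitespace, so the "DR 898" form in the third example is not picked up. If such abbreviations have to be handled, the pattern from the first answer, or a wider lookbehind, would be needed.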