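I want to use a regular expression in my Java program to recognize some features of my strings. I have strings of this type:
`-Author- has wrote (-hh-:-mm-)`
So, for example, I have the string:
Cecco has wrote (15:12)
and I need to extract the author, hh and mm fields. Obviously I have some restrictions to consider:
- hh and mm must be numbers
- the author has no restrictions
- I have to account for the space between "has wrote" and (
I don't know how to use regular expressions. Can you help me?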
Edit: I'm attaching my code:
String mRegex = "(\\s)+ has wrote \\((\\d\\d):(\\d\\d)\\)";
Pattern mPattern = Pattern.compile(mRegex);
String[] str = {
    "Cecco CQ has wrote (14:55)",             // OK (matched)
    "yesterday you has wrote that I'm crazy", // NO (different text)
    "Simon has wrote (yesterday)",            // NO (yesterday isn't a number)
    "John has wrote (22:32)",                 // OK
    "James has wrote(22:11)",                 // NO (missing space between "has wrote" and "(")
    "Tommy has wrote (xx:ss)"                 // NO (xx and ss aren't numbers)
};
for (String s : str) {
    Matcher mMatcher = mPattern.matcher(s);
    while (mMatcher.find()) {
        System.out.println(mMatcher.group());
    }
}
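Answer 0 (score: 2)
Homework?
Something like:
(.+) has wrote \((\d\d):(\d\d)\)
should do the trick.
- `()` marks a group to capture (there are three in the above)
- `.+` matches any characters (you said there are no restrictions)
- `\d` matches any digit
- `\(` and `\)` escape the parens so they are literals rather than a capturing group
Usage: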
Pattern p = Pattern.compile("(.+) has wrote \\((\\d\\d):(\\d\\d)\\)");
Matcher m = p.matcher("Gareth has wrote (12:00)");
if (m.matches()) {
    System.out.println(m.group(1));
    System.out.println(m.group(2));
    System.out.println(m.group(3));
}
To handle an optional (HH:mm) at the end, you need to start using some dark regex voodoo:
Pattern p = Pattern.compile("(.+) has wrote\\s?(?:\\((\\d\\d):(\\d\\d)\\))?");
Matcher m = p.matcher("Gareth has wrote (12:00)");
if (m.matches()) {
    System.out.println(m.group(1));
    System.out.println(m.group(2));
    System.out.println(m.group(3));
}
m = p.matcher("Gareth has wrote");
if (m.matches()) {
    System.out.println(m.group(1));
    // m.group(2) == null since it didn't match anything
}
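The new pattern, unescaped:
(.+) has wrote\s?(?:\((\d\d):(\d\d)\))?
- `\s?` optionally matches a whitespace character (if there is no (HH:mm) group, there may be no space at the end)
- `(?: ... )` is a non-capturing group, which lets you put `?` after it to make the whole group optional
I think @codinghorror has something to say about regex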
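As a quick check, the first pattern from this answer can be run against the six sample strings from the question. This is only a sketch: the class name `HasWroteDemo` and the helper `parse` are made up for illustration, and it uses `matches()` (whole-string match) as in the usage snippet above.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class HasWroteDemo {
    // Pattern from the answer: author, the literal " has wrote ", then (hh:mm).
    private static final Pattern P =
            Pattern.compile("(.+) has wrote \\((\\d\\d):(\\d\\d)\\)");

    /** Hypothetical helper: returns {author, hh, mm}, or null if the line does not match. */
    static String[] parse(String line) {
        Matcher m = P.matcher(line);
        if (!m.matches()) {
            return null;
        }
        return new String[] { m.group(1), m.group(2), m.group(3) };
    }

    public static void main(String[] args) {
        // Sample strings taken from the question.
        String[] samples = {
            "Cecco CQ has wrote (14:55)",
            "yesterday you has wrote that I'm crazy",
            "Simon has wrote (yesterday)",
            "John has wrote (22:32)",
            "James has wrote(22:11)",
            "Tommy has wrote (xx:ss)"
        };
        for (String s : samples) {
            String[] r = parse(s);
            System.out.println(s + " -> " + (r == null ? "no match" : String.join(", ", r)));
        }
    }
}
```

Only the first and fourth samples should match. Because `.+` accepts spaces, a two-word author such as "Cecco CQ" is captured whole.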
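Answer 1 (score: 1)
The easiest way to work out a regex is to try it in a testing tool before coding. I use the Eclipse plugin from http://www.brosinski.com/regex/
Using it, I came up with the following result: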
([a-zA-Z]*) has wrote \((\d\d):(\d\d)\)
Cecco has wrote (15:12)
Found 1 match(es):
start=0, end=23
Group(0) = Cecco has wrote (15:12)
Group(1) = Cecco
Group(2) = 15
Group(3) = 12
It also helps to find a good regex syntax reference.
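If no such tool is at hand, similar output can be produced with plain `java.util.regex`. The sketch below is an assumption about how one might do it, not how the plugin works; `RegexProbe` and `describe` are hypothetical names. It also shows one limitation of `[a-zA-Z]*`: it cannot cross a space, so a two-word author is only partly captured.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class RegexProbe {
    /** Hypothetical helper: lists start, end and every group for each match found. */
    static String describe(String regex, String input) {
        Matcher m = Pattern.compile(regex).matcher(input);
        StringBuilder out = new StringBuilder();
        while (m.find()) {
            out.append("start=").append(m.start())
               .append(", end=").append(m.end()).append('\n');
            // Group 0 is the whole match; groups 1..groupCount() are the captures.
            for (int i = 0; i <= m.groupCount(); i++) {
                out.append("Group(").append(i).append(") = ")
                   .append(m.group(i)).append('\n');
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        String regex = "([a-zA-Z]*) has wrote \\((\\d\\d):(\\d\\d)\\)";
        System.out.print(describe(regex, "Cecco has wrote (15:12)"));
        // Two-word author: [a-zA-Z]* stops at the space, so Group(1) is only "CQ".
        System.out.print(describe(regex, "Cecco CQ has wrote (14:55)"));
    }
}
```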
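Answer 2 (score: 0)
Well, just in case you didn't know, Matcher has a nice function, Matcher.group(int), that pulls out a specific group, i.e. the part of the pattern enclosed in (). For instance, if I wanted to match the number between two colons, like:
:22:
I could use the regex ":(\\d+):" to match one or more digits between two colons, and then fetch just the digits with:
Matcher.group(1)
Then it's just a matter of parsing the String into an int. Note that group numbering starts at 1. group(0) is the whole match, so Matcher.group(0) for the previous example would return :22: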
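The colon example can be sketched end to end as follows. The class `ColonNumber` and the helper `extract` are hypothetical names, and returning -1 when nothing matches is just one possible convention.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class ColonNumber {
    private static final Pattern BETWEEN_COLONS = Pattern.compile(":(\\d+):");

    /** Hypothetical helper: the number between the first pair of colons, or -1 if there is none. */
    static int extract(String text) {
        Matcher m = BETWEEN_COLONS.matcher(text);
        if (m.find()) {
            // group(1) is just the digits; group(0) would include the colons.
            // Note: parseInt throws NumberFormatException if the digits overflow an int.
            return Integer.parseInt(m.group(1));
        }
        return -1;
    }

    public static void main(String[] args) {
        System.out.println(extract(":22:"));      // 22
        System.out.println(extract("no colons")); // -1
    }
}
```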
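For your case, I think the regex bits you need to consider are:
- "[A-Za-z]" for alphabetic characters (you could probably safely use "\\w" instead, which matches alphabetic characters as well as digits and _).
- "\\d" for digits (1, 2, 3...)
- "+" to indicate that you want one or more of the previous character or group.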