Question

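I'm trying to use "capturing parentheses" to pull URLs out of some HTML that I'm parsing (on the iPhone), so that I can group just the bits I'm interested in.

This is what I have so far: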

NSString *imageHtml;  //a string with some HTML in it

NSRegularExpression* innerRegex = [[NSRegularExpression alloc] initWithPattern:@"href=\"(.*?)\"" options:NSRegularExpressionCaseInsensitive|NSRegularExpressionDotMatchesLineSeparators error:nil];
NSTextCheckingResult* firstMatch = [innerRegex firstMatchInString:imageHtml options:0 range:NSMakeRange(0, [imageHtml length])];
[innerRegex release];

if(firstMatch != nil)
{
    // newImage.detailsURL = ...;  // unfinished: this should receive the captured URL
    NSLog(@"found url: %@", [imageHtml substringWithRange:firstMatch.range]);
}

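The only thing it logs is the full match (so: href="http://tralalala.com" rather than http://tralalala.com).

How can I force it to return only the match for my first capturing parentheses?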

Answer 1

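Regular expression groups work by capturing the entire match in group 0; the groups defined in the pattern then start at index 1. NSTextCheckingResult stores these groups as ranges, which you read with rangeAtIndex:. Because your pattern contains a capture group, the following will work.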

NSString *imageHtml = @"href=\"http://tralalala.com\"";  //a string with some HTML in it

NSRegularExpression* innerRegex = [[NSRegularExpression alloc] initWithPattern:@"href=\"(.*?)\"" options:NSRegularExpressionCaseInsensitive|NSRegularExpressionDotMatchesLineSeparators error:nil];
NSTextCheckingResult* firstMatch = [innerRegex firstMatchInString:imageHtml options:0 range:NSMakeRange(0, [imageHtml length])];
[innerRegex release];

if(firstMatch != nil)
{
    //The ranges of firstMatch will provide groups, 
    //rangeAtIndex 1 = first grouping
    NSLog(@"found url: %@", [imageHtml substringWithRange:[firstMatch rangeAtIndex:1]]);
}
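
If more than one link is needed, the same group indexing applies to every match. The following is only a sketch, not part of the original answer: the helper name HrefValuesInHTML is hypothetical, and it uses the autoreleased class factory so it behaves the same with or without ARC. The NSNotFound check is not strictly needed for this pattern (group 1 always participates), but it is the usual guard when a pattern has optional groups.

```objective-c
#import <Foundation/Foundation.h>

// Hypothetical helper (an assumption, not from the original post):
// collects the contents of capture group 1 for every href="..." in html.
static NSArray *HrefValuesInHTML(NSString *html)
{
    NSError *error = nil;
    NSRegularExpression *regex =
        [NSRegularExpression regularExpressionWithPattern:@"href=\"(.*?)\""
                                                  options:NSRegularExpressionCaseInsensitive
                                                    error:&error];
    if (regex == nil) {
        NSLog(@"invalid pattern: %@", error);
        return [NSArray array];
    }

    NSMutableArray *urls = [NSMutableArray array];
    NSArray *matches = [regex matchesInString:html
                                      options:0
                                        range:NSMakeRange(0, [html length])];
    for (NSTextCheckingResult *match in matches) {
        // Index 0 is the whole match; index 1 is the first capture group.
        NSRange urlRange = [match rangeAtIndex:1];
        // A group that did not take part in the match has location NSNotFound.
        if (urlRange.location != NSNotFound) {
            [urls addObject:[html substringWithRange:urlRange]];
        }
    }
    return urls;
}
```

Calling HrefValuesInHTML(imageHtml) should give an array of the bare URLs, without the surrounding href="...".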

Answer 2

You need a pattern like this:

(?<=href=\")(.*?)(?=\")
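
With this pattern the href=" prefix and the closing quote are matched by a lookbehind and a lookahead, so they are not part of the match and the overall match range (group 0) is already the URL. Here is a rough sketch of how it might be used; the helper name FirstHrefValueInHTML is hypothetical, and the options are left at 0, so it assumes a lowercase href.

```objective-c
#import <Foundation/Foundation.h>

// Hypothetical helper (an assumption, not from the original post):
// returns the first href value in html, or nil if there is none.
static NSString *FirstHrefValueInHTML(NSString *html)
{
    // The quotes are escaped for the Objective-C string literal;
    // the regex engine sees (?<=href=")(.*?)(?=")
    NSRegularExpression *regex =
        [NSRegularExpression regularExpressionWithPattern:@"(?<=href=\")(.*?)(?=\")"
                                                  options:0
                                                    error:NULL];
    NSTextCheckingResult *match =
        [regex firstMatchInString:html
                          options:0
                            range:NSMakeRange(0, [html length])];
    if (match == nil) {
        return nil;
    }
    // The lookarounds consume nothing, so match.range covers only the URL.
    return [html substringWithRange:match.range];
}
```

Compared with Answer 1, this keeps the calling code unchanged (it still reads firstMatch.range) at the cost of a slightly less readable pattern.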
