How do I parse HTML child tags in an iPhone application?

Asked: 2012-06-12 12:43:18

Tags: iphone html ios tags html-parser

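I have an HTML web page that contains a lot of images and live content. I need to parse the data from the web page (HTML) and display it in an iPhone app. I am using the following code to parse the HTML content, but I don't know how to parse a child tag nested inside another tag.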

{    
    NSURL *url = [NSURL URLWithString:@"http://www.samplewebpage.com/vd/t/1/830.html"]; 

    NSData *data = [[NSData alloc] initWithContentsOfURL:url];
    NSString *responseString = [[NSString alloc] initWithData:data encoding:NSUTF8StringEncoding];
    NSLog(@"Response : %@", responseString);

    NSMutableArray *imageURLArray = [[NSMutableArray alloc] init];
    NSMutableArray *divClassArray = [[NSMutableArray alloc] init];
    //NSString *regexStr = @"<A HREF=\"([^>]*)\">";

    // For image : 1. img src=\"([^>]*)\"  2. <img src=\"([^>]*)\">
    // For getting Class div class
    // [^\"]* stops at the attribute's closing quote; [^>]* would also swallow later attributes
    NSString *regularExpressString = @"div class=\"([^\"]*)\"";
    NSError *error = nil;

    // Compile the regex once, outside the loop
    NSRegularExpression *testRegularExpress = [NSRegularExpression regularExpressionWithPattern:regularExpressString options:NSRegularExpressionCaseInsensitive error:&error];
    if (testRegularExpress == nil)
    {
        NSLog(@"Error making regex: %@", error);
    }

    NSUInteger i = 0;
    while (i < [responseString length])
    {
        NSTextCheckingResult *textCheckingResult = [testRegularExpress firstMatchInString:responseString options:0 range:NSMakeRange(i, [responseString length] - i)];
        if (textCheckingResult == nil)
        {
            // No more matches (or the regex failed to compile)
            break;
        }

        // Capture group 1 holds the value of the class attribute
        NSRange range = [textCheckingResult rangeAtIndex:1];
        NSString *classNameString = [responseString substringWithRange:range];
        NSLog(@"Div Class Name : %@", classNameString);

        [divClassArray addObject:classNameString];

        // Resume the search just past the end of the whole match
        i = NSMaxRange([textCheckingResult range]);
    }

    NSLog(@"divClass Array : %@, Count : %lu", divClassArray, (unsigned long)[divClassArray count]);
}

Response:

<div class="phoneModelItems" style="width:30%;margin-right:4px;"><a href="javascript:noxLatestChart.navToLatestChart('latest');" style="font-weight:italic;">Nokia Model</a></div>

I want to get the "Nokia Model" text from the div with class phoneModelItems. Can you tell me how to retrieve the "Nokia Model" text? Thanks in advance.

1 answer:

Answer 0: (score: 0)

Here is a regular expression for your problem. Capture group 1 contains the text of the link:

<div\sclass=\"phoneModelItems\".*?><a\shref.*?>(.*?)<\/a><\/div>

You can test it on Rubular.
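As a rough sketch of how the expression above could be applied with NSRegularExpression, the helper below collects capture group 1 for every match. The function name PhoneModelNamesInHTML is hypothetical (it is not from the original post), and the pattern is the one from the answer, only escaped for an Objective-C string literal. It assumes the div and its link sit on a single line, as in the sample response, because `.` does not match newlines by default.

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper (not from the original post): returns the text of each
// <a> that directly follows the opening tag of a <div class="phoneModelItems">.
static NSArray *PhoneModelNamesInHTML(NSString *html)
{
    NSMutableArray *names = [NSMutableArray array];
    if (html == nil) {
        return names;
    }

    // The answer's pattern, with backslashes and quotes escaped for the literal.
    NSString *pattern = @"<div\\sclass=\"phoneModelItems\".*?><a\\shref.*?>(.*?)</a></div>";
    NSError *error = nil;
    NSRegularExpression *regex =
        [NSRegularExpression regularExpressionWithPattern:pattern
                                                  options:NSRegularExpressionCaseInsensitive
                                                    error:&error];
    if (regex == nil) {
        NSLog(@"Error making regex: %@", error);
        return names;
    }

    // Walk every match and keep capture group 1 (the link text).
    NSArray *matches = [regex matchesInString:html
                                      options:0
                                        range:NSMakeRange(0, [html length])];
    for (NSTextCheckingResult *match in matches) {
        [names addObject:[html substringWithRange:[match rangeAtIndex:1]]];
    }
    return names;
}
```

Calling PhoneModelNamesInHTML(responseString) with the response shown in the question should give an array whose only element is "Nokia Model". Regular expressions tend to break when the markup changes, so treat this as a quick fix rather than a general HTML parser.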