Xcode: converting a string containing HTML character codes into a Unicode string

Date: 2011-06-23 08:22:47

Tags: html xcode unicode encoding

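Here is the problem I ran into. An iPhone app has to sync with a server, and the server also syncs with the web version of the service. The web service sends special characters in strings to the server as HTML codes (&amp;, &quot;, &#8470;, etc.). I need to display this data, so I need to convert these symbols into something Xcode can decode and draw. As far as I can tell, the codes are the same in HTML and in Unicode, and the only difference is the format (e.g. &#8470; in HTML is \u8470 in Unicode). I tried changing this format in the string and encoding it as UTF8. As a result, I now have this function: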

+(NSString *) replaceHTMLCodes:(NSString *)text{
NSLog(@"replacing HTML codes");
if (text){
    NSLog(@"%@", text);
    NSString *tmpString=[NSString stringWithString:text];
    tmpString = [text copy];
    NSString *tmpText = @"";
    int locAmp = [tmpString rangeOfString:@"&#"].location;
    NSString * Code = @"";
    int locComa;
    while (locAmp!=NSNotFound) {
        tmpText = [tmpText stringByAppendingString:[tmpString substringToIndex:locAmp]];
        tmpString = [tmpString stringByReplacingCharactersInRange:NSMakeRange(0, locAmp) withString:@""];
        locComa = [tmpString rangeOfString:@";"].location;
        Code = [NSString stringWithString:[tmpString substringWithRange:NSMakeRange(0, locComa)]];
        Code = [Code stringByReplacingOccurrencesOfString:@"&#" withString:@"\\u"];
        NSLog(@"%@", Code);
        tmpString = [tmpString stringByReplacingCharactersInRange:NSMakeRange(0, locComa+1) withString:@""];
        tmpText = [tmpText stringByAppendingFormat:@"%C", Code];
        locAmp = [tmpString rangeOfString:@"&#"].location;
    }
    tmpText = [tmpText stringByAppendingString:tmpString];
    NSLog(@"%@", tmpText);
    return tmpText;
}
else
    return text;
}

But it doesn't work correctly: it shows random Unicode symbols instead of the ones I want. I also tried using NSUTF8StringEncoding, but that didn't help either.

Any ideas on how to fix this? Am I going about the conversion the right way?

3 Answers:

Answer 0: (score: 2)

Thank you, Dave. Your answer was useful. In the end, here is my routine. Hopefully it will be useful to someone.

+(NSString *) replaceHTMLCodes:(NSString *)text{
if (text){
    NSString *tmpString=[NSString stringWithString:text];
    NSString *tmpText = @"";
    // rangeOfString: reports an NSUInteger location, and NSNotFound does not fit in an int.
    NSUInteger locAmp = [tmpString rangeOfString:@"&"].location;
    NSString * Code = @"";
    NSUInteger locComa;
    while (locAmp!=NSNotFound) {
        tmpText = [tmpText stringByAppendingString:[tmpString substringToIndex:locAmp]];
        tmpString = [tmpString stringByReplacingCharactersInRange:NSMakeRange(0, locAmp) withString:@""];
        locComa = [tmpString rangeOfString:@";"].location;
        // A '&' with no terminating ';' is not an entity: stop and keep the rest unchanged.
        if (locComa==NSNotFound) break;
        Code = [NSString stringWithString:[tmpString substringWithRange:NSMakeRange(0, locComa)]];
        Code = [Code stringByReplacingOccurrencesOfString:@"&" withString:@""];
        // hasPrefix: is safe even when Code is empty (e.g. for "&;").
        if ([Code hasPrefix:@"#"]) {
            Code = [Code stringByReplacingOccurrencesOfString:@"#" withString:@""];
            // %C expects a unichar; intValue parses the decimal code point.
            tmpText = [tmpText stringByAppendingFormat:@"%C", (unichar)[Code intValue]];
        } else {
            if ([Code compare:@"amp"]==NSOrderedSame) {
                tmpText = [tmpText stringByAppendingString:@"&"];
            } else if ([Code compare:@"quot"]==NSOrderedSame) {
                tmpText = [tmpText stringByAppendingString:@"\""];   
            } else if ([Code compare:@"gt"]==NSOrderedSame) {
                tmpText = [tmpText stringByAppendingString:@">"];
            } else if ([Code compare:@"lt"]==NSOrderedSame) {
                tmpText = [tmpText stringByAppendingString:@"<"];
            } else if ([Code compare:@"laquo"]==NSOrderedSame) {
                tmpText = [tmpText stringByAppendingString:@"«"];
            } else if ([Code compare:@"raquo"]==NSOrderedSame) {
                tmpText = [tmpText stringByAppendingString:@"»"];
            }
        }
        tmpString = [tmpString stringByReplacingCharactersInRange:NSMakeRange(0, locComa+1) withString:@""];
        locAmp = [tmpString rangeOfString:@"&"].location;
    }
    tmpText = [tmpText  stringByAppendingString:tmpString];
    return tmpText;
}
else
    return text;
}

Maybe it's not ideal, but it works for me.

Answer 1: (score: 0)

There are one or two small bugs in your routine. This version seems to work...

NSString * replaceHTMLCodes(NSString *text)
{

    if (text){
        NSString *tmpString=[NSString stringWithString:text];
        NSString *tmpText = @"";
        // rangeOfString: reports an NSUInteger location, and NSNotFound does not fit in an int.
        NSUInteger locAmp = [tmpString rangeOfString:@"&#"].location;
        NSString * Code = @"";
        NSUInteger locComa;
        while (locAmp!=NSNotFound) {
            tmpText = [tmpText stringByAppendingString:[tmpString substringToIndex:locAmp]];
            tmpString = [tmpString stringByReplacingCharactersInRange:NSMakeRange(0, locAmp) withString:@""];
            locComa = [tmpString rangeOfString:@";"].location;
            // Stop at an unterminated "&#" and keep the remaining text unchanged.
            if (locComa==NSNotFound) break;
            Code = [NSString stringWithString:[tmpString substringWithRange:NSMakeRange(0, locComa)]];
            Code = [Code stringByReplacingOccurrencesOfString:@"&#" withString:@""];

            tmpString = [tmpString stringByReplacingCharactersInRange:NSMakeRange(0, locComa+1) withString:@""];
            // %C expects a unichar, so pass the decimal code as a number rather than as a string.
            tmpText = [tmpText stringByAppendingFormat:@"%C", (unichar)[Code intValue]];

            locAmp = [tmpString rangeOfString:@"&#"].location;
        }
        tmpText = [tmpText stringByAppendingString:tmpString];

        return tmpText;
    }
    else
        return text;
}
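
The key change from the question's code is that `%C` is given a number instead of an `NSString` holding the text `\u8470`; the `\u` escape is only interpreted by the compiler inside string literals, and it takes hexadecimal digits, whereas `&#8470;` is decimal (8470 is U+2116). The routines above read decimal references only. If the server might also send hexadecimal references such as `&#x2116;`, or code points above U+FFFF, something along these lines could replace the `intValue` step. This is only a sketch: the helper name `StringFromNumericReference` is hypothetical, and it assumes the caller has already stripped the leading `&#` and the trailing `;`.

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper, not part of any answer above.
// body is the text between "&#" and ";", for example "8470" or "x2116".
// Returns nil when body is not a usable numeric reference.
static NSString *StringFromNumericReference(NSString *body)
{
    unsigned int codePoint = 0;
    if ([body hasPrefix:@"x"] || [body hasPrefix:@"X"]) {
        // Hexadecimal form: parse the digits after the 'x'.
        NSScanner *scanner = [NSScanner scannerWithString:[body substringFromIndex:1]];
        if (![scanner scanHexInt:&codePoint]) return nil;
    } else {
        // Decimal form: intValue returns 0 when there are no leading digits.
        int value = [body intValue];
        if (value <= 0) return nil;
        codePoint = (unsigned int)value;
    }
    // Reject zero, values outside Unicode, and lone surrogates.
    if (codePoint == 0 || codePoint > 0x10FFFF) return nil;
    if (codePoint >= 0xD800 && codePoint <= 0xDFFF) return nil;

    if (codePoint <= 0xFFFF) {
        // Fits in a single UTF-16 code unit.
        unichar c = (unichar)codePoint;
        return [NSString stringWithCharacters:&c length:1];
    }
    // Above U+FFFF: encode as a UTF-16 surrogate pair.
    codePoint -= 0x10000;
    unichar pair[2] = { (unichar)(0xD800 + (codePoint >> 10)),
                        (unichar)(0xDC00 + (codePoint & 0x3FF)) };
    return [NSString stringWithCharacters:pair length:2];
}
```

Returning nil for anything unrecognised lets the caller decide whether to drop the reference or copy it through unchanged.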

Answer 2: (score: 0)

Here is my version. The list of available codes can be found here. I created a category on NSString:

@interface NSString (HTMLDecode)

- (NSString *)htmlfDecodedString;

@end

@implementation NSString (HTMLDecode)

- (NSString *)htmlfDecodedString{

    NSDictionary *codesToSymbols = @{@"&quot;" : @"\"",
                                     @"&amp;"  : @"&",
                                     @"&lt;"   : @"<",
                                     @"&gt;"   : @">",
                                     @"&euro;" : @"€",
                                     @"&laquo;" : @"«",
                                     @"&raquo;" : @"»"};

    NSMutableString *str = [self mutableCopy];

    [codesToSymbols enumerateKeysAndObjectsUsingBlock:^(NSString  *key, NSString  *value, BOOL *stop) {
        [str replaceOccurrencesOfString:key withString:value options:NSCaseInsensitiveSearch range:NSMakeRange(0, str.length)];
    }];

    return str;
}

@end

How do you use it?

It's as simple as this:

NSString *html =
@"<table>\
    <tbody>\
        <tr>\
            <td>Testing html symbols. Ampersand:&amp;. &laquo;Hello Double&raquo;. &lsquo;Hello single!&lsquo;\
            </td>\
        </tr>\
    </tbody>\
</table>";


NSString *result = [html htmlfDecodedString];

NSLog(@"converted html:\n%@",result);

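It produces HTML like this (note that &lsquo; is not in the dictionary shown above, so as written it is left undecoded; an entry for it has to be added to get the single quotes shown here):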

<table><tbody><tr><td>Testing html symbols. Ampersand:&. «Hello Double». 'Hello single!'</td></tr></tbody></table>
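
One thing to watch with the dictionary approach is that `NSDictionary` does not enumerate its keys in any guaranteed order. If `&amp;` happens to be replaced before `&lt;`, input such as `&amp;lt;` is decoded twice and ends up as `<` instead of the literal text `&lt;`. A possible variation, sketched here under the assumption that a fixed replacement order is acceptable, keeps the pairs in an array and handles `&amp;` last. The function name `DecodeNamedEntities` is hypothetical, and the entity list is just the one from the category above plus `&lsquo;` and `&rsquo;`.

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper: same idea as the category above, but with a fixed order.
static NSString *DecodeNamedEntities(NSString *html)
{
    // Flat list of entity, replacement, entity, replacement, ...
    // "&amp;" is deliberately last so that "&amp;lt;" becomes the text "&lt;", not "<".
    NSArray *pairs = [NSArray arrayWithObjects:
                      @"&quot;",  @"\"",
                      @"&lt;",    @"<",
                      @"&gt;",    @">",
                      @"&laquo;", @"«",
                      @"&raquo;", @"»",
                      @"&lsquo;", @"‘",
                      @"&rsquo;", @"’",
                      @"&amp;",   @"&",
                      nil];

    NSMutableString *result = [NSMutableString stringWithString:html];
    for (NSUInteger i = 0; i + 1 < [pairs count]; i += 2) {
        // Entity names are case-sensitive in HTML, so a literal search is used here.
        [result replaceOccurrencesOfString:[pairs objectAtIndex:i]
                                withString:[pairs objectAtIndex:i + 1]
                                   options:NSLiteralSearch
                                     range:NSMakeRange(0, [result length])];
    }
    return result;
}
```

With this ordering, `DecodeNamedEntities(@"&amp;lt;")` leaves the escaped text `&lt;` in place rather than turning it into a tag delimiter.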