NSString substringWithRange returns incorrect substring

Asked: 2015-07-31 19:53:32

Tags: objective-c cocoa

I'm working on an OS X app, running Xcode 6.4 and Yosemite. Distilling the problem down to a couple of lines of code: I'm using substringWithRange to extract a substring and getting a string that's 18 characters long, but I was expecting a string with 26 characters. What am I doing wrong?

//              12345678901234567890123456789
NSString *s = @"ClientÅåÄäÖöÅåÆæØø_Example #2";
NSRange range = NSMakeRange(0, 26);
NSString *result = [s substringWithRange:range];
//              12345678901234567890123456
//              ClientÅåÄäÖöÅåÆæØø

EDIT: I added an NSLog to show only the first 18 characters are output and took a screenshot, but SO says I need 10 reputation points to attach an image. Let's try this: http://imgur.com/s85Zh2F. I'm not making this up, the output of NSLog shows 18 characters (as does the window with Locals showing the contents of result).

EDIT: It gets even better. I copied the string constant from the question above and pasted it back into my code in a second block. http://imgur.com/glDSlq5. It seems that even though the two strings s and s2 look identical, they are somehow not the same. How can I figure out what's wrong with the first string constant? The app needs to handle whatever Unicode strings are thrown at it.

EDIT: I added some code to check for equality, check lengths, and print each character as follows:

//              12345678901234567890123456789
NSString *s = @"ClientÅåÄäÖöÅåÆæØø_Example #2";    
NSString *s2 = @"ClientÅåÄäÖöÅåÆæØø_Example #2";

NSLog(@"isEqualToString is %d", [s isEqualToString:s2]);

NSLog(@"lengths are %lu\t%lu\n", [s length], [s2 length]);
for(unsigned long n = 0; n < [s length]; n++)
    NSLog(@"%@\t%@\n",
          n < [s length] ? [NSString stringWithFormat:@"%u", [s characterAtIndex:n]] : @"",
          n < [s2 length] ? [NSString stringWithFormat:@"%u", [s2 characterAtIndex:n]] : @"");

Which gives:

isEqualToString is 0
lengths are 37  29
67  67
108 108
105 105
101 101
110 110
116 116
65  197
778 229
97  196
778 228
65  214
776 246
97  197
776 229
79  198
776 230
111 216
776 248
65  95
778 69
97  120
778 97
198 109
230 112
216 108
248 101
95  32
69  35
120 50
97  
109 
112 
108 
101 
32  
35  
50  

2 Answers:

Answer 0: (score: 1)

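What you see is not always what you get. Chances are you have managed to get some of the more "interesting" Unicode characters into your string, for example a zero-width non-breaking space, which is completely invisible.

I would print the length of the string and characterAtIndex:i for every character in the string, and check what is really in there.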
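
The dump in the question already points at the cause: wherever s2 has a single value such as 197, s has a pair such as 65 778, which looks like a base letter followed by a combining mark (778 is U+030A COMBINING RING ABOVE, 776 is U+0308 COMBINING DIAERESIS). A minimal sketch of how that could be confirmed follows. The helper name CodeUnitDump and the test strings are made up for illustration; the only Foundation calls used are characterAtIndex:, isEqualToString: and precomposedStringWithCanonicalMapping.

```objective-c
#import <Foundation/Foundation.h>

// Hypothetical helper: lists every UTF-16 code unit of a string as U+XXXX.
static NSString *CodeUnitDump(NSString *string) {
    NSMutableArray *units = [NSMutableArray array];
    for (NSUInteger i = 0; i < [string length]; i++) {
        unichar unit = [string characterAtIndex:i];
        [units addObject:[NSString stringWithFormat:@"U+%04X", (unsigned int)unit]];
    }
    return [units componentsJoinedByString:@" "];
}

int main(void) {
    @autoreleasepool {
        // Escapes are used so the difference is visible in the source file.
        NSString *decomposed  = @"A\u030A";  // "A" followed by COMBINING RING ABOVE
        NSString *precomposed = @"\u00C5";   // the same letter as a single code unit

        NSLog(@"%@", CodeUnitDump(decomposed));   // U+0041 U+030A
        NSLog(@"%@", CodeUnitDump(precomposed));  // U+00C5

        // isEqualToString: compares the stored code units, so these differ.
        NSLog(@"equal as stored: %d", [decomposed isEqualToString:precomposed]);

        // Normalizing both sides to the precomposed form (NFC) makes them match.
        NSString *a = [decomposed precomposedStringWithCanonicalMapping];
        NSString *b = [precomposed precomposedStringWithCanonicalMapping];
        NSLog(@"equal after normalizing: %d", [a isEqualToString:b]);
    }
    return 0;
}
```

If the strings in the question behave the same way, normalizing s before taking the substring would be one option. Not every combining sequence has a precomposed form, though, so normalization alone is not a general fix.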

Answer 1: (score: 1)

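Actually, this is really quite sad. NSString ranges are measured in UTF-16 code units, not in user-perceived characters. In this case, each of the accented characters counts as two units.

This answer explains how to do it correctly: Berry Blue's answer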
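
The linked answer is not reproduced here, so the following is only a sketch of one possible approach and not necessarily what that answer does. It walks the string with rangeOfComposedCharacterSequenceAtIndex:, so that a base letter and its combining marks are counted as one character. The helper name PrefixOfComposedCharacters is hypothetical.

```objective-c
#import <Foundation/Foundation.h>

// Hypothetical helper: returns the first `count` user-perceived characters
// (composed character sequences) of `string`, or the whole string if it is shorter.
static NSString *PrefixOfComposedCharacters(NSString *string, NSUInteger count) {
    NSUInteger index = 0;  // position in UTF-16 code units
    NSUInteger taken = 0;  // composed character sequences consumed so far
    while (index < [string length] && taken < count) {
        // Range of the whole sequence (base character plus combining marks)
        // that contains the code unit at `index`.
        NSRange sequence = [string rangeOfComposedCharacterSequenceAtIndex:index];
        index = NSMaxRange(sequence);
        taken++;
    }
    return [string substringToIndex:index];
}

int main(void) {
    @autoreleasepool {
        NSString *s = @"A\u030Abc";  // decomposed "Å", then "b" and "c"
        NSString *prefix = PrefixOfComposedCharacters(s, 2);
        // Two visible characters, stored as three code units.
        NSLog(@"%@ (%lu code units)", prefix, (unsigned long)[prefix length]);
    }
    return 0;
}
```

Applied to the string in the question, PrefixOfComposedCharacters(s, 26) would be expected to return 26 visible characters, regardless of how many code units they occupy. When the goal is only to avoid cutting a sequence in half, rangeOfComposedCharacterSequencesForRange: can instead widen an existing NSRange to whole sequences.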