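Getting the unique characters in an NSString

Date: 2013-11-13 10:11:12

Tags: ios iphone objective-c cocoa-touch nsstring

How can I get the unique characters in an NSString?

What I want to do is get all the illegal characters in an NSString so that I can tell the user which illegal characters they entered and therefore need to remove. I start by defining an NSCharacterSet of legal characters, split the string on every legal character, and then join what is left (only the illegal characters) into a new NSString. I am now planning to get the unique characters of that new NSString (as an array, hopefully), but I can't find a reference for this anywhere.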

NSCharacterSet *legalCharacterSet = [NSCharacterSet
    characterSetWithCharactersInString:@"abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789-()&+:;,'.# "];

NSString *illegalCharactersInTitle = [[self.titleTextField.text.noWhitespace
    componentsSeparatedByCharactersInSet:legalCharacterSet]
    componentsJoinedByString:@""];

3 answers:

Answer 0: (score: 2)

This should help you. I couldn't find any ready-to-use function for it.

NSMutableSet *uniqueCharacters = [NSMutableSet set];
NSMutableString *uniqueString = [NSMutableString string];
[illegalCharactersInTitle enumerateSubstringsInRange:NSMakeRange(0, illegalCharactersInTitle.length) options:NSStringEnumerationByComposedCharacterSequences usingBlock:^(NSString *substring, NSRange substringRange, NSRange enclosingRange, BOOL *stop) {
    if (![uniqueCharacters containsObject:substring]) {
        [uniqueCharacters addObject:substring];
        [uniqueString appendString:substring];
    }
}];

Answer 1: (score: 2)

Try adapting the following code:

// legal set
NSCharacterSet *legalCharacterSet = [NSCharacterSet
                                         characterSetWithCharactersInString:@"abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789-()&+:;,'.# "];

// test strings
NSString *myString = @"LegalStrin()";
//NSString *myString = @"francesco@gmail.com"; // illegal string

// mutable set of the characters that occur in the string
// (must be created via NSMutableCharacterSet so it can be modified below)
NSMutableCharacterSet *stringSet = [NSMutableCharacterSet characterSetWithCharactersInString:myString];
// inverts the legal set to get the illegal set
NSCharacterSet *illegalCharacterSet = [legalCharacterSet invertedSet];

// intersects the string set with the illegal set, modifying the mutable string set in place
[stringSet formIntersectionWithCharacterSet:illegalCharacterSet];

// prints out the illegal characters with the convenience method
NSLog(@"IllegalStringSet: %@", [self stringForCharacterSet:stringSet]);

I adapted the printing method from another Stack Overflow question:

- (NSString*)stringForCharacterSet:(NSCharacterSet*)characterSet
{
    NSMutableString *toReturn = [@"" mutableCopy];
    unichar unicharBuffer[20];
    int index = 0;

    for (unichar uc = 0; uc < (0xFFFF); uc ++)
    {
        if ([characterSet characterIsMember:uc])
        {
            unicharBuffer[index] = uc;

            index ++;

            if (index == 20)
            {
                NSString * characters = [NSString stringWithCharacters:unicharBuffer length:index];
                [toReturn appendString:characters];

                index = 0;
            }
        }
    }

    if (index != 0)
    {
        NSString * characters = [NSString stringWithCharacters:unicharBuffer length:index];
        [toReturn appendString:characters];
    }
    return toReturn;
}

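Answer 2: (score: 0)

First, you have to be careful about what you consider a character. NSString's API uses the word "character" for what Unicode calls a UTF-16 code unit, but handling code units in isolation does not give you what users think of as characters. For example, a combining character combines with the preceding character to produce a different glyph. There are also surrogate pairs, which only make sense when paired.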
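
To make the difference concrete, here is a minimal sketch that counts user-perceived characters instead of UTF-16 code units. The helper name ComposedCharacterCount is hypothetical (it is not from this thread); it relies only on NSString's rangeOfComposedCharacterSequenceAtIndex:.

```objective-c
#import <Foundation/Foundation.h>

// Hypothetical helper (not from the original thread): counts composed
// character sequences, i.e. what a user would perceive as characters,
// rather than the UTF-16 code units reported by -length.
static NSUInteger ComposedCharacterCount(NSString *text)
{
    NSUInteger count = 0;
    NSUInteger index = 0;
    while (index < text.length) {
        // Range of the whole sequence that the code unit at `index` belongs to.
        NSRange range = [text rangeOfComposedCharacterSequenceAtIndex:index];
        index = NSMaxRange(range);
        count++;
    }
    return count;
}
```

For a decomposed "é" written as @"e\u0301", -length reports 2 code units while this helper counts 1 character.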

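So what you actually need to collect are the substrings that make up what the user thinks of as characters.

I was going to write code very similar to Grzegorz Krukowski's answer. He beat me to it, so I won't, but I will add that, for the reasons cited above, your code for filtering out the legal characters is itself flawed. For example, if the text contains "é" and it is decomposed into "e" plus a combining acute accent, your code will strip the "e" and leave a dangling combining acute accent. I believe your intent is to treat the whole "é" as illegal.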
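
One possible way to put this together is sketched below, assuming the goal is an array of distinct illegal characters in order of first appearance. The function name IllegalComposedCharacters is hypothetical, and using NSMutableOrderedSet for de-duplication is my own choice rather than something proposed in the answers above. The idea is to walk the original text by composed character sequence and treat a sequence as illegal if any part of it falls outside the legal set.

```objective-c
#import <Foundation/Foundation.h>

// Hypothetical helper (not from the original thread): returns each distinct
// user-perceived character of `text` that is not made up entirely of
// characters from `legalSet`, in order of first appearance.
static NSArray *IllegalComposedCharacters(NSString *text, NSCharacterSet *legalSet)
{
    NSCharacterSet *illegalSet = [legalSet invertedSet];
    // An ordered set ignores duplicates but keeps insertion order.
    NSMutableOrderedSet *found = [NSMutableOrderedSet orderedSet];

    [text enumerateSubstringsInRange:NSMakeRange(0, text.length)
                             options:NSStringEnumerationByComposedCharacterSequences
                          usingBlock:^(NSString *substring, NSRange substringRange,
                                       NSRange enclosingRange, BOOL *stop) {
        // The whole sequence is illegal if any part of it is outside the
        // legal set, so "e" + U+0301 is reported as one illegal character
        // even when a plain "e" is legal.
        if ([substring rangeOfCharacterFromSet:illegalSet].location != NSNotFound) {
            [found addObject:substring];
        }
    }];
    return found.array;
}
```

With a legal set built from @"abcde", the input @"ae\u0301b@@" would yield two entries: the decomposed "é" and "@". Because the check runs on the original text rather than on the result of componentsSeparatedByCharactersInSet:, no dangling combining accent is left behind.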