I have created a string tokenizer like this:
stringTokenizer = CFStringTokenizerCreate(
    NULL
    , (CFStringRef)str
    , CFRangeMake(0, [str length])
    , kCFStringTokenizerUnitSentence
    , userLocale);
But how do I get the sentences out of the tokenizer? The CF Strings Programming Guide does not mention CFStringTokenizer
or tokens (I did a full-text search of the PDF).
Answer 0 (score: 16)
Here is an example of CFStringTokenizer usage:
CFStringRef string = CFSTR("First sentence. Second sentence."); // Example input; get your string from somewhere
CFLocaleRef locale = CFLocaleCopyCurrent();
CFStringTokenizerRef tokenizer =
    CFStringTokenizerCreate(
        kCFAllocatorDefault
        , string
        , CFRangeMake(0, CFStringGetLength(string))
        , kCFStringTokenizerUnitSentence
        , locale);
CFStringTokenizerTokenType tokenType = kCFStringTokenizerTokenNone;
unsigned tokensFound = 0;
// Advance until the tokenizer reports that no tokens remain
while(kCFStringTokenizerTokenNone !=
      (tokenType = CFStringTokenizerAdvanceToNextToken(tokenizer))) {
    // The tokenizer reports only a range; copy the substring out of the original string
    CFRange tokenRange = CFStringTokenizerGetCurrentTokenRange(tokenizer);
    CFStringRef tokenValue =
        CFStringCreateWithSubstring(
            kCFAllocatorDefault
            , string
            , tokenRange);
    // Do something with the token
    CFShow(tokenValue);
    CFRelease(tokenValue);
    ++tokensFound;
}
// Clean up
CFRelease(tokenizer);
CFRelease(locale);
Answer 1 (score: 0)
You can also use:
[mutstri enumerateSubstringsInRange:NSMakeRange(0, [mutstri length])
                            options:NSStringEnumerationBySentences
                         usingBlock:^(NSString *substring, NSRange substringRange, NSRange enclosingRange, BOOL *stop) {
    NSLog(@"%@", substring);
}];