Find out if a character in a string is an emoji?

Asked: 2015-06-10 13:03:02

Tags: ios string swift character emoji

I need to find out whether a character in a string is an emoji.

For example, I have this character:

let string = ""
let character = Array(string)[0]

I need to know if that character is an emoji.

14 answers:

Answer 0 (score: 138)

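What I stumbled upon is the difference between characters, Unicode scalars and glyphs.

For example, a single glyph can consist of 7 Unicode scalars:

  • Four emoji characters
  • In between each emoji, a special character that works like character glue; see the specs for more info

Another example: a glyph that consists of 2 Unicode scalars:

  • The regular emoji
  • A skin tone modifier

So when rendering the characters, the resulting glyphs are what really matter.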
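To make the distinction concrete, here is a small sketch that builds two such glyphs from Unicode escapes and counts their scalars. The specific code points (a family of four joined by U+200D, and a thumbs-up followed by the U+1F3FD skin tone modifier) are illustrative choices, not necessarily the exact emoji meant above.

```swift
// Four person emoji joined by three zero-width joiners (U+200D).
let family = "\u{1F468}\u{200D}\u{1F468}\u{200D}\u{1F467}\u{200D}\u{1F467}"
// A base emoji followed by a skin tone modifier (U+1F3FD).
let thumbsUp = "\u{1F44D}\u{1F3FD}"

let familyScalars = family.unicodeScalars.count       // 4 emoji + 3 joiners
let thumbsUpScalars = thumbsUp.unicodeScalars.count   // base + modifier
let familyUTF16 = family.utf16.count                  // 4 surrogate pairs + 3 joiners
let joiners = family.unicodeScalars.filter { $0.value == 0x200D }.count

print(familyScalars, thumbsUpScalars, familyUTF16, joiners) // 7 2 11 3
```

Neither the scalar count nor the UTF-16 length says anything about how many glyphs end up on screen, which is why the approach below counts glyphs instead.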

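What I was looking for was a way to detect whether a string is exactly and only one emoji, so that I could render it larger than normal text (like Messages on iOS 10 and WhatsApp do nowadays). As described above, the character count is of no use for this. (The "glue character" is also not considered an emoji.)

What you can do is use CoreText to help you break the string down into glyphs and count them. Furthermore, I would move part of the extension proposed by Arnold and Sebastian Lopez to a separate extension of UnicodeScalar.

It leaves you with the following result: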

import Foundation
import CoreText // needed for CTLineCreateWithAttributedString and CTLineGetGlyphCount

extension UnicodeScalar {
    var isEmoji: Bool {
        switch value {
        case 0x1F600...0x1F64F, // Emoticons
             0x1F300...0x1F5FF, // Misc Symbols and Pictographs
             0x1F680...0x1F6FF, // Transport and Map
             0x1F1E6...0x1F1FF, // Regional country flags
             0x2600...0x26FF, // Misc symbols
             0x2700...0x27BF, // Dingbats
             0xE0020...0xE007F, // Tags
             0xFE00...0xFE0F, // Variation Selectors
             0x1F900...0x1F9FF, // Supplemental Symbols and Pictographs
             0x1F018...0x1F270, // Various asian characters
             0x238C...0x2454, // Misc items
             0x20D0...0x20FF: // Combining Diacritical Marks for Symbols
            return true

        default: return false
        }
    }

    var isZeroWidthJoiner: Bool {
        return value == 8205
    }
}

extension String {
    // Not needed anymore in swift 4.2 and later, using `.count` will give you the correct result
    var glyphCount: Int {
        let richText = NSAttributedString(string: self)
        let line = CTLineCreateWithAttributedString(richText)
        return CTLineGetGlyphCount(line)
    }

    var isSingleEmoji: Bool {
        return glyphCount == 1 && containsEmoji
    }

    var containsEmoji: Bool {
        return unicodeScalars.contains { $0.isEmoji }
    }

    var containsOnlyEmoji: Bool {
        return !isEmpty
            && !unicodeScalars.contains(where: {
                !$0.isEmoji && !$0.isZeroWidthJoiner
            })
    }

    // The next tricks are mostly to demonstrate how tricky it can be to determine emoji's
    // If anyone has suggestions how to improve this, please let me know
    var emojiString: String {
        return emojiScalars.map { String($0) }.reduce("", +)
    }

    var emojis: [String] {
        var scalars: [[UnicodeScalar]] = []
        var currentScalarSet: [UnicodeScalar] = []
        var previousScalar: UnicodeScalar?

        for scalar in emojiScalars {
            if let prev = previousScalar, !prev.isZeroWidthJoiner, !scalar.isZeroWidthJoiner {
                scalars.append(currentScalarSet)
                currentScalarSet = []
            }
            currentScalarSet.append(scalar)

            previousScalar = scalar
        }

        scalars.append(currentScalarSet)

        return scalars.map { $0.map { String($0) }.reduce("", +) }
    }

    fileprivate var emojiScalars: [UnicodeScalar] {
        var chars: [UnicodeScalar] = []
        var previous: UnicodeScalar?
        for cur in unicodeScalars {
            if let previous = previous, previous.isZeroWidthJoiner, cur.isEmoji {
                chars.append(previous)
                chars.append(cur)

            } else if cur.isEmoji {
                chars.append(cur)
            }

            previous = cur
        }

        return chars
    }
}

This gives you the following results:

"".isSingleEmoji // true
"‍♂️".isSingleEmoji // true
"‍‍‍".isSingleEmoji // true
"‍‍‍".containsOnlyEmoji // true
"Hello ‍‍‍".containsOnlyEmoji // false
"Hello ‍‍‍".containsEmoji // true
" Héllo ‍‍‍".emojiString // "‍‍‍"
"‍‍‍".glyphCount // 1
"‍‍‍".characters.count // 4, Will return '1' in Swift 4.2 so previous method not needed anymore
"‍‍‍".count // 4, Will return '1' in Swift 4.2 so previous method not needed anymore

" Héllœ ‍‍‍".emojiScalars // [128107, 128104, 8205, 128105, 8205, 128103, 8205, 128103]
" Héllœ ‍‍‍".emojis // ["", "‍‍‍"]

"‍‍‍‍‍".isSingleEmoji // false
"‍‍‍‍‍".containsOnlyEmoji // true
"‍‍‍‍‍".glyphCount // 3
"‍‍‍‍‍".characters.count // 8, Will return '3' in Swift 4.2 so previous method not needed anymore

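Answer 1 (score: 41)

The simplest and cleanest way to do this is to check the Unicode code point of each character in the string against the known emoji and dingbats ranges, like so: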

extension String {

    var containsEmoji: Bool {
        for scalar in unicodeScalars {
            switch scalar.value {
            case 0x1F600...0x1F64F, // Emoticons
                 0x1F300...0x1F5FF, // Misc Symbols and Pictographs
                 0x1F680...0x1F6FF, // Transport and Map
                 0x2600...0x26FF,   // Misc symbols
                 0x2700...0x27BF,   // Dingbats
                 0xFE00...0xFE0F,   // Variation Selectors
                 0x1F900...0x1F9FF, // Supplemental Symbols and Pictographs
                 0x1F1E6...0x1F1FF: // Flags
                return true
            default:
                continue
            }
        }
        return false
    }

}
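As a variation on the same idea, the ranges can be kept in a table so they are easier to extend when new blocks are added. This is only a sketch: `emojiRanges` and `containsEmojiInRanges` are hypothetical names, and the table simply repeats the ranges listed above.

```swift
// Same ranges as in the extension above, kept in one place.
let emojiRanges: [ClosedRange<UInt32>] = [
    0x1F600...0x1F64F, // Emoticons
    0x1F300...0x1F5FF, // Misc Symbols and Pictographs
    0x1F680...0x1F6FF, // Transport and Map
    0x2600...0x26FF,   // Misc symbols
    0x2700...0x27BF,   // Dingbats
    0xFE00...0xFE0F,   // Variation Selectors
    0x1F900...0x1F9FF, // Supplemental Symbols and Pictographs
    0x1F1E6...0x1F1FF  // Flags
]

// True if any scalar of `text` falls inside one of the ranges.
func containsEmojiInRanges(_ text: String) -> Bool {
    return text.unicodeScalars.contains { scalar in
        emojiRanges.contains { $0.contains(scalar.value) }
    }
}

print(containsEmojiInRanges("Hello \u{1F600}")) // true
print(containsEmojiInRanges("Hello"))           // false
print(containsEmojiInRanges("\u{2B50}"))        // false: U+2B50 is outside every listed range
```

The last line shows the maintenance cost of any hard-coded list: a code point that is not in the table is not detected.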

Answer 2 (score: 8)

extension String {
    func containsEmoji() -> Bool {
        for scalar in unicodeScalars {
            switch scalar.value {
            case 0x3030, 0x00AE, 0x00A9,// Special Characters
            0x1D000...0x1F77F,          // Emoticons
            0x2100...0x27BF,            // Misc symbols and Dingbats
            0xFE00...0xFE0F,            // Variation Selectors
            0x1F900...0x1F9FF:          // Supplemental Symbols and Pictographs
                return true
            default:
                continue
            }
        }
        return false
    }
}

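This is my fix, with updated ranges.

Answer 3 (score: 4)

Swift 3 note:

It appears the cnui_containsEmojiCharacters method has either been removed or moved to a different dynamic library. _containsEmoji should still work though.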

let str: NSString = "hello"

@objc protocol NSStringPrivate {
    func _containsEmoji() -> ObjCBool
}

let strPrivate = unsafeBitCast(str, to: NSStringPrivate.self)
strPrivate._containsEmoji() // true
str.value(forKey: "_containsEmoji") // 1


let swiftStr = "hello"
(swiftStr as AnyObject).value(forKey: "_containsEmoji") // 1

Swift 2.x:

I recently discovered a private API on NSString which exposes functionality for detecting whether a string contains an emoji character:

let str: NSString = "hello"

With an objc protocol and unsafeBitCast:

@objc protocol NSStringPrivate {
    func cnui_containsEmojiCharacters() -> ObjCBool
    func _containsEmoji() -> ObjCBool
}

let strPrivate = unsafeBitCast(str, NSStringPrivate.self)
strPrivate.cnui_containsEmojiCharacters() // true
strPrivate._containsEmoji() // true

With valueForKey:

str.valueForKey("cnui_containsEmojiCharacters") // 1
str.valueForKey("_containsEmoji") // 1

With a pure Swift string, you must cast the string to AnyObject before using valueForKey:

let str = "hello"

(str as AnyObject).valueForKey("cnui_containsEmojiCharacters") // 1
(str as AnyObject).valueForKey("_containsEmoji") // 1

The methods were found in the NSString header file.

Answer 4 (score: 4)

Swift 5.0

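… introduced a new way to check exactly this!

You have to break your String down into its Scalars. Each Scalar has a Property value, which supports the isEmoji value!

Actually, you can even check whether the Scalar is an Emoji modifier, and more. Check out Apple's documentation: https://developer.apple.com/documentation/swift/unicode/scalar/properties

You may want to consider checking isEmojiPresentation instead of isEmoji, because Apple states the following for isEmoji: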

  

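This property is true for scalars that are rendered as emoji by default and also for scalars that have a non-default emoji rendering when followed by U+FE0F VARIATION SELECTOR-16. This includes some scalars that are not typically considered to be emoji.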
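To see the difference between the two properties, here is a minimal sketch that prints both flags for a few scalars. `emojiFlags` is a hypothetical helper, and the four sample scalars are my own picks for illustration.

```swift
// Returns the two Unicode properties side by side for one scalar.
func emojiFlags(_ scalar: Unicode.Scalar) -> (isEmoji: Bool, isEmojiPresentation: Bool) {
    let properties = scalar.properties
    return (properties.isEmoji, properties.isEmojiPresentation)
}

let digit: Unicode.Scalar = "3"          // only an emoji as part of a keycap sequence
let heart: Unicode.Scalar = "\u{2764}"   // text by default, emoji when followed by U+FE0F
let face: Unicode.Scalar = "\u{1F600}"   // emoji by default
let letter: Unicode.Scalar = "a"         // not an emoji at all

for scalar in [digit, heart, face, letter] {
    let flags = emojiFlags(scalar)
    print(String(scalar.value, radix: 16), flags.isEmoji, flags.isEmojiPresentation)
}
```

The digit and the heart report isEmoji as true but isEmojiPresentation as false, so a check based on isEmoji alone would also flag plain digits.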


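This approach actually splits an emoji up into all of its modifiers, but it is much simpler to handle. And since Swift now counts an emoji with modifiers as 1, you can do all kinds of things.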

var string = "? test"

for scalar in string.unicodeScalars {
    let isEmoji = scalar.properties.isEmoji

    print("\(scalar.description) \(isEmoji)")
}

// ? true
//   false
// t false
// e false
// s false
// t false

NSHipster points out an interesting way to get all emoji:

import Foundation

var emoji = CharacterSet()

for codePoint in 0x0000...0x1F0000 {
    guard let scalarValue = Unicode.Scalar(codePoint) else {
        continue
    }

    // Implemented in Swift 5 (SE-0221)
    // https://github.com/apple/swift-evolution/blob/master/proposals/0221-character-properties.md
    if scalarValue.properties.isEmoji {
        emoji.insert(scalarValue)
    }
}

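Answer 5 (score: 3)

With Swift 5 you can now inspect the Unicode properties of each character in your string. This gives us the convenient isEmoji variable on each letter. The problem is that isEmoji returns true for any character that can be converted into a 2-byte emoji, such as 0-9.

We can look at the isEmoji variable and also check for the presence of an emoji modifier to determine whether an ambiguous character will be displayed as an emoji.

This solution should be much more future-proof than the regex solutions offered here.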

extension String {
    func containsOnlyEmojis() -> Bool {
        if count == 0 {
            return false
        }
        for character in self {
            if !character.isEmoji {
                return false
            }
        }
        return true
    }

    func containsEmoji() -> Bool {
        for character in self {
            if character.isEmoji {
                return true
            }
        }
        return false
    }
}

extension Character {
    // An emoji can either be a 2 byte unicode character or a normal UTF8 character with an emoji modifier
    // appended as is the case with 3️⃣. 0x238C is the first instance of UTF16 emoji that requires no modifier.
    // `isEmoji` will evaluate to true for any character that can be turned into an emoji by adding a modifier
    // such as the digit "3". To avoid this we confirm that any character below 0x238C has an emoji modifier attached
    var isEmoji: Bool {
        guard let scalar = unicodeScalars.first else { return false }
        return scalar.properties.isEmoji && (scalar.value > 0x238C || unicodeScalars.count > 1)
    }
}

Giving us

"hey".containsEmoji() //false

"Hello World ?".containsEmoji() //true
"Hello World ?".containsOnlyEmojis() //false

"?".containsEmoji() //true
"?".containsOnlyEmojis() //true

Answer 6 (score: 2)

You can use this code example or this pod.

To use it in Swift, import the category into your YourProject_Bridging_Header:

#import "NSString+EMOEmoji.h"

Then you can check the range of every emoji in your string:

let example: NSString = "string‍‍‍withemojis✊" //string with emojis

let containsEmoji: Bool = example.emo_containsEmoji()

print(containsEmoji)

// Output: ["true"]

I created a small example project with the code above.

Answer 7 (score: 2)

For Swift 3.0.2, the following answer is the simplest one:

class func stringContainsEmoji (string : NSString) -> Bool
{
    var returnValue: Bool = false

    string.enumerateSubstrings(in: NSMakeRange(0, (string as NSString).length), options: NSString.EnumerationOptions.byComposedCharacterSequences) { (substring, substringRange, enclosingRange, stop) -> () in

        let objCString:NSString = NSString(string:substring!)
        let hs: unichar = objCString.character(at: 0)
        if 0xd800 <= hs && hs <= 0xdbff
        {
            if objCString.length > 1
            {
                let ls: unichar = objCString.character(at: 1)
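                // Widen to Int before the arithmetic: done in unichar (UInt16), the
                // multiplication overflows and traps for high surrogates >= 0xD840
                let step1: Int = (Int(hs) - 0xd800) * 0x400
                let step2: Int = Int(ls) - 0xdc00
                let uc: Int = step1 + step2 + 0x10000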

                if 0x1d000 <= uc && uc <= 0x1f77f
                {
                    returnValue = true
                }
            }
        }
        else if objCString.length > 1
        {
            let ls: unichar = objCString.character(at: 1)
            if ls == 0x20e3
            {
                returnValue = true
            }
        }
        else
        {
            if 0x2100 <= hs && hs <= 0x27ff
            {
                returnValue = true
            }
            else if 0x2b05 <= hs && hs <= 0x2b07
            {
                returnValue = true
            }
            else if 0x2934 <= hs && hs <= 0x2935
            {
                returnValue = true
            }
            else if 0x3297 <= hs && hs <= 0x3299
            {
                returnValue = true
            }
            else if hs == 0xa9 || hs == 0xae || hs == 0x303d || hs == 0x3030 || hs == 0x2b55 || hs == 0x2b1c || hs == 0x2b1b || hs == 0x2b50
            {
                returnValue = true
            }
        }
    }

    return returnValue;
}
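The arithmetic on hs and ls above is the standard UTF-16 surrogate pair decoding. Here is a minimal standalone sketch of just that step; `decodeSurrogatePair` is a hypothetical helper name, and it widens to UInt32 before doing the arithmetic so the intermediate values cannot overflow a 16-bit integer.

```swift
// Combines a UTF-16 high/low surrogate pair into the code point it encodes.
// Returns nil when the inputs are not a valid pair.
func decodeSurrogatePair(high: UInt16, low: UInt16) -> UInt32? {
    guard (0xD800...0xDBFF).contains(high), (0xDC00...0xDFFF).contains(low) else {
        return nil
    }
    return (UInt32(high) - 0xD800) * 0x400 + (UInt32(low) - 0xDC00) + 0x10000
}

// U+1F600 is stored in UTF-16 as the pair D83D DE00.
let units = Array("\u{1F600}".utf16)
if let codePoint = decodeSurrogatePair(high: units[0], low: units[1]) {
    print(String(codePoint, radix: 16)) // 1f600
}
```

The decoded value can then be compared against a code point range such as 0x1d000...0x1f77f, as the method above does.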

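Answer 8 (score: 2)

Future proof: manually check the character's pixels; the other solutions will break (and have broken) as new emoji are added.

Note: this is Objective-C (it can be converted to Swift).

Over the years these emoji detection solutions have kept breaking as Apple adds new emoji built in new ways (such as skin-toned emoji made by combining a character with an additional character), and so on.

I finally broke down and just wrote the following method, which works for all current emoji and should work for all future ones.

The solution creates a UILabel with the character and a black background. CoreGraphics then takes a snapshot of the label, and I scan all pixels in the snapshot for any pixel that is not solid black. The reason I add the black background is to avoid false coloring caused by subpixel rendering.

The solution runs very fast on my device; I can check hundreds of characters per second. Still, it should be noted that this is a CoreGraphics solution and should not be used as heavily as a regular text method. Graphics processing is data heavy, so checking thousands of characters at once could cause noticeable lag.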

-(BOOL)isEmoji:(NSString *)character {

    UILabel *characterRender = [[UILabel alloc] initWithFrame:CGRectMake(0, 0, 1, 1)];
    characterRender.text = character;
    characterRender.font = [UIFont fontWithName:@"AppleColorEmoji" size:12.0f];//Note: Size 12 font is likely not crucial for this and the detector will probably still work at an even smaller font size, so if you needed to speed this checker up for serious performance you may test lowering this to a font size like 6.0
    characterRender.backgroundColor = [UIColor blackColor];//needed to remove subpixel rendering colors
    [characterRender sizeToFit];

    CGRect rect = [characterRender bounds];
    UIGraphicsBeginImageContextWithOptions(rect.size,YES,0.0f);
    CGContextRef contextSnap = UIGraphicsGetCurrentContext();
    [characterRender.layer renderInContext:contextSnap];
    UIImage *capturedImage = UIGraphicsGetImageFromCurrentImageContext();
    UIGraphicsEndImageContext();

    CGImageRef imageRef = [capturedImage CGImage];
    NSUInteger width = CGImageGetWidth(imageRef);
    NSUInteger height = CGImageGetHeight(imageRef);
    CGColorSpaceRef colorSpace = CGColorSpaceCreateDeviceRGB();
    unsigned char *rawData = (unsigned char*) calloc(height * width * 4, sizeof(unsigned char));
    NSUInteger bytesPerPixel = 4;//Note: Alpha Channel not really needed, if you need to speed this up for serious performance you can refactor this pixel scanner to just RGB
    NSUInteger bytesPerRow = bytesPerPixel * width;
    NSUInteger bitsPerComponent = 8;
    CGContextRef context = CGBitmapContextCreate(rawData, width, height,
                                                 bitsPerComponent, bytesPerRow, colorSpace,
                                                 kCGImageAlphaPremultipliedLast | kCGBitmapByteOrder32Big);
    CGColorSpaceRelease(colorSpace);

    CGContextDrawImage(context, CGRectMake(0, 0, width, height), imageRef);
    CGContextRelease(context);

    BOOL colorPixelFound = NO;

    int x = 0;
    int y = 0;
    while (y < height && !colorPixelFound) {
        while (x < width && !colorPixelFound) {

            NSUInteger byteIndex = (bytesPerRow * y) + x * bytesPerPixel;

            CGFloat red = (CGFloat)rawData[byteIndex];
            CGFloat green = (CGFloat)rawData[byteIndex+1];
            CGFloat blue = (CGFloat)rawData[byteIndex+2];

            CGFloat h, s, b, a;
            UIColor *c = [UIColor colorWithRed:red green:green blue:blue alpha:1.0f];
            [c getHue:&h saturation:&s brightness:&b alpha:&a];//Note: I wrote this method years ago, can't remember why I check HSB instead of just checking r,g,b==0; Upon further review this step might not be needed, but I haven't tested to confirm yet. 

            b /= 255.0f;

            if (b > 0) {
                colorPixelFound = YES;
            }

            x++;
        }
        x=0;
        y++;
    }

    free(rawData); // release the pixel buffer allocated with calloc above

    return colorPixelFound;

}
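For anyone converting this to Swift, the pixel scan itself can be sketched in isolation. This assumes the label has already been rendered into a tightly packed RGBA buffer (4 bytes per pixel, no row padding); `containsNonBlackPixel` is a hypothetical name, and it compares the RGB bytes directly instead of going through HSB as the method above does.

```swift
// Scans a tightly packed RGBA bitmap and reports whether any pixel has a
// non-zero red, green or blue component. The alpha byte is ignored.
func containsNonBlackPixel(rgba: [UInt8], width: Int, height: Int) -> Bool {
    let bytesPerPixel = 4
    precondition(rgba.count >= width * height * bytesPerPixel, "buffer too small")

    for y in 0..<height {
        for x in 0..<width {
            let index = (y * width + x) * bytesPerPixel
            if rgba[index] != 0 || rgba[index + 1] != 0 || rgba[index + 2] != 0 {
                return true
            }
        }
    }
    return false
}

// A 2x2 all-black bitmap, then the same bitmap with one red pixel.
var pixels = [UInt8](repeating: 0, count: 2 * 2 * 4)
print(containsNonBlackPixel(rgba: pixels, width: 2, height: 2)) // false
pixels[12] = 255 // red component of the last pixel
print(containsNonBlackPixel(rgba: pixels, width: 2, height: 2)) // true
```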

Answer 9 (score: 2)

An answer very similar to the ones written before mine, but with an updated set of emoji scalars.

extension String {
    func isContainEmoji() -> Bool {
        let isContain = unicodeScalars.first(where: { $0.isEmoji }) != nil
        return isContain
    }
}


extension UnicodeScalar {

    var isEmoji: Bool {
        switch value {
        case 0x1F600...0x1F64F,
             0x1F300...0x1F5FF,
             0x1F680...0x1F6FF,
             0x1F1E6...0x1F1FF,
             0x2600...0x26FF,
             0x2700...0x27BF,
             0xFE00...0xFE0F,
             0x1F900...0x1F9FF,
             65024...65039,
             8400...8447,
             9100...9300,
             127000...127600:
            return true
        default:
            return false
        }
    }

}

Answer 10 (score: 1)

You can use NSString-RemoveEmoji like this:

if string.isIncludingEmoji {

}

Answer 11 (score: 0)

Use the Unicode scalar property isEmoji

extension String {
    var containsEmoji: Bool {
        for scalar in unicodeScalars {
            if scalar.properties.isEmoji {
                return true
            }
        }
        return false
    }
}

How to use

let str = "?"
print(str.containsEmoji) // true

Answer 12 (score: 0)

A native one-liner

"❤️".unicodeScalars.contains { $0.properties.isEmoji } // true

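Works from Swift 5.0

Answer 13 (score: -1)

I ran into the same problem and ended up making String and Character extensions.

The code is too long to post, as it actually lists all emoji (from the official Unicode list v5.0) in a CharacterSet. You can find it here:

https://github.com/piterwilson/StringEmoji

Constants

let emojiCharacterSet: CharacterSet

A character set containing all known emoji (as described in the official Unicode list 5.0, http://unicode.org/emoji/charts-5.0/emoji-list.html)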

String

var isEmoji: Bool { get }

Whether or not the String instance represents a known single emoji

print("".isEmoji) // false
print("".isEmoji) // true
print("".isEmoji) // false (String is not a single Emoji)

var containsEmoji: Bool { get }

Whether or not the String instance contains a known emoji

print("".containsEmoji) // false
print("".containsEmoji) // true
print("".containsEmoji) // true

var unicodeName: String { get }

Applies a kCFStringTransformToUnicodeName CFStringTransform on a copy of the string

print("á".unicodeName) // \N{LATIN SMALL LETTER A WITH ACUTE}
print("".unicodeName) // "\N{FACE WITH STUCK-OUT TONGUE AND WINKING EYE}"

var niceUnicodeName: String { get }

Returns the result of a kCFStringTransformToUnicodeName CFStringTransform with the \N{ prefix and } suffix removed

print("á".niceUnicodeName) // LATIN SMALL LETTER A WITH ACUTE
print("".niceUnicodeName) // FACE WITH STUCK-OUT TONGUE AND WINKING EYE
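
For comparison, on Swift 5 the standard library can give the same kind of name per scalar without CFStringTransform. This is a sketch of that alternative, not how the library above is implemented; `scalarNames` is a hypothetical property name.

```swift
extension String {
    // Unicode name of every scalar in the string, using the Swift 5
    // standard library instead of CFStringTransform.
    var scalarNames: [String] {
        return unicodeScalars.map { $0.properties.name ?? "<unnamed>" }
    }
}

print("\u{E1}".scalarNames)    // ["LATIN SMALL LETTER A WITH ACUTE"]
print("\u{1F61C}".scalarNames) // ["FACE WITH STUCK-OUT TONGUE AND WINKING EYE"]
```

Because the result has one entry per scalar, an emoji built from several scalars yields several names rather than a single one.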

Character

var isEmoji: Bool { get }

Whether or not the Character instance represents a known emoji

print("".isEmoji) // false
print("".isEmoji) // true