我有一个用UTF-8编码的字符串。例如:
Thats a nice joke
我必须提取句子中的所有表情符号。表情符号可以是任何
当使用命令less text.txt
在终端中查看此句子时,它被视为:
Thats a nice joke <U+1F606><U+1F606><U+1F606> <U+1F61B>
这是表情符号的相应UTF代码。表情符号的所有代码都可以在emojitracker找到。
为了找到所有出现的情况,我使用了正则表达式模式(<U\+\w+?>)
,但它对UTF-8编码的字符串没有用。
以下是我的代码:
String s="Thats a nice joke ";
Pattern pattern = Pattern.compile("(<U\\+\\w+?>)");
Matcher matcher = pattern.matcher(s);
List<String> matchList = new ArrayList<String>();
while (matcher.find()) {
matchList.add(matcher.group());
}
for(int i=0;i<matchList.size();i++){
System.out.println(matchList.get(i));
}
此pdf表示Range: 1F300–1F5FF for Miscellaneous Symbols and Pictographs
。所以我想捕捉这个范围内的任何角色。
答案 0 :(得分:43)
使用emoji-java我写了一个删除所有表情符号的简单方法,包括fitzpatrick modifiers。需要一个外部库,但比那些怪物正则表达式更容易维护。
使用:
String input = "A string with a \uD83D\uDC66\uD83C\uDFFFfew emojis!";
String result = EmojiParser.removeAllEmojis(input);
emoji-java maven安装:
<dependency>
<groupId>com.vdurmont</groupId>
<artifactId>emoji-java</artifactId>
<version>3.1.3</version>
</dependency>
的gradle:
compile 'com.vdurmont:emoji-java:3.1.3'
编辑:之前提交的答案被拉入了emoji-java源代码。
答案 1 :(得分:27)
the pdf that you just mentioned表示范围:1F300-1F5FF用于其他符号和象形文字。所以我想说要捕捉这个范围内的任何角色。现在该怎么办?
好的,但我会注意到你问题中的表情符号超出了这个范围! : - )
这些高于0xFFFF
的事实使事情变得复杂,因为Java字符串存储了UTF-16。所以我们不能只使用一个简单的字符类。我们将有代理对。 (更多:http://www.unicode.org/faq/utf_bom.html)
UTF-16中的U + 1F300最终成为\uD83C\uDF00
对; U + 1F5FF最终为\uD83D\uDDFF
。请注意,第一个字符上升,我们至少跨越一个边界。因此,我们必须知道我们正在寻找的代理对的范围。
我没有沉浸在UTF-16内部运作的知识中,我写了一个程序来查明(最后的来源 - 如果我是你,我会仔细检查它,而不是相信我)。它告诉我,我们正在寻找\uD83C
后跟\uDF00-\uDFFF
(包括)范围内的任何内容,或\uD83D
后跟\uDC00-\uDDFF
(包括)范围内的任何内容。< / p>
因此,掌握了这些知识,理论上我们现在可以编写一种模式:
// This is wrong, keep reading
Pattern p = Pattern.compile("(?:\uD83C[\uDF00-\uDFFF])|(?:\uD83D[\uDC00-\uDDFF])");
这是两个非捕获组的交替,第一组以\uD83C
开头,另一组以\uD83D
开头。
但失败(找不到任何内容)。我很确定这是因为我们试图在不同的地方指定代理对的 half :
Pattern p = Pattern.compile("(?:\uD83C[\uDF00-\uDFFF])|(?:\uD83D[\uDC00-\uDDFF])");
// Half of a pair --------------^------^------^-----------^------^------^
我们不能像这样拆分代理对,因为某种原因,它们被称为代理对。 : - )
因此,我认为我们根本不能使用正则表达式(或者实际上,任何基于字符串的方法)。我想我们必须搜索char
数组。
char
数组包含UTF-16值,因此如果我们以艰难的方式查找数据,可以找到数据中的那些半对:
String s = new StringBuilder()
.append("Thats a nice joke ")
.appendCodePoint(0x1F606)
.appendCodePoint(0x1F606)
.appendCodePoint(0x1F606)
.append(" ")
.appendCodePoint(0x1F61B)
.toString();
char[] chars = s.toCharArray();
int index;
char ch1;
char ch2;
index = 0;
while (index < chars.length - 1) { // -1 because we're looking for two-char-long things
ch1 = chars[index];
if ((int)ch1 == 0xD83C) {
ch2 = chars[index+1];
if ((int)ch2 >= 0xDF00 && (int)ch2 <= 0xDFFF) {
System.out.println("Found emoji at index " + index);
index += 2;
continue;
}
}
else if ((int)ch1 == 0xD83D) {
ch2 = chars[index+1];
if ((int)ch2 >= 0xDC00 && (int)ch2 <= 0xDDFF) {
System.out.println("Found emoji at index " + index);
index += 2;
continue;
}
}
++index;
}
显然这只是调试级别的代码,但它确实起到了作用。 (在你的给定字符串中,使用它的表情符号,当然它不会找到任何东西,因为它们超出范围。但是如果你将第二对的上限更改为0xDEFF
而不是0xDDFF
但是,不知道是否也包括非表情符号。)
我的计划的来源,以找出代理范围是什么:
public class FindRanges {
public static void main(String[] args) {
char last0 = '\0';
char last1 = '\0';
for (int x = 0x1F300; x <= 0x1F5FF; ++x) {
char[] chars = new StringBuilder().appendCodePoint(x).toString().toCharArray();
if (chars[0] != last0) {
if (last0 != '\0') {
System.out.println("-\\u" + Integer.toHexString((int)last1).toUpperCase());
}
System.out.print("\\u" + Integer.toHexString((int)chars[0]).toUpperCase() + " \\u" + Integer.toHexString((int)chars[1]).toUpperCase());
last0 = chars[0];
}
last1 = chars[1];
}
if (last0 != '\0') {
System.out.println("-\\u" + Integer.toHexString((int)last1).toUpperCase());
}
}
}
输出:
\uD83C \uDF00-\uDFFF \uD83D \uDC00-\uDDFF
答案 2 :(得分:16)
有类似的问题。以下为我提供了很好的服务,并与代理人配对
public class SplitByUnicode {
public static void main(String[] argv) throws Exception {
String string = "Thats a nice joke ";
System.out.println("Original String:"+string);
String regexPattern = "[\uD83C-\uDBFF\uDC00-\uDFFF]+";
byte[] utf8 = string.getBytes("UTF-8");
String string1 = new String(utf8, "UTF-8");
Pattern pattern = Pattern.compile(regexPattern);
Matcher matcher = pattern.matcher(string1);
List<String> matchList = new ArrayList<String>();
while (matcher.find()) {
matchList.add(matcher.group());
}
for(int i=0;i<matchList.size();i++){
System.out.println(i+":"+matchList.get(i));
}
}
}
输出是:
Original String:Thats a nice joke
0:
1:
找到正则表达式
答案 3 :(得分:10)
这在java 8中适用于我:
public static String mysqlSafe(String input) {
if (input == null) return null;
StringBuilder sb = new StringBuilder();
for (int i = 0; i < input.length(); i++) {
if (i < (input.length() - 1)) { // Emojis are two characters long in java, e.g. a rocket emoji is "\uD83D\uDE80";
if (Character.isSurrogatePair(input.charAt(i), input.charAt(i + 1))) {
i += 1; //also skip the second character of the emoji
continue;
}
}
sb.append(input.charAt(i));
}
return sb.toString();
}
答案 4 :(得分:6)
你可以这样做
String s="Thats a nice joke ";
Pattern pattern = Pattern.compile("[\ud83c\udc00-\ud83c\udfff]|[\ud83d\udc00-\ud83d\udfff]|[\u2600-\u27ff]",
Pattern.UNICODE_CASE | Pattern.CASE_INSENSITIVE);
Matcher matcher = pattern.matcher(s);
List<String> matchList = new ArrayList<String>();
while (matcher.find()) {
matchList.add(matcher.group());
}
for(int i=0;i<matchList.size();i++){
System.out.println(matchList.get(i));
}
答案 5 :(得分:6)
提取ALL表情符号的最佳正则表达是:
(?:[\u2700-\u27bf]|(?:\ud83c[\udde6-\uddff]){2}|[\ud800-\udbff][\udc00-\udfff]|[\u0023-\u0039]\ufe0f?\u20e3|\u3299|\u3297|\u303d|\u3030|\u24c2|\ud83c[\udd70-\udd71]|\ud83c[\udd7e-\udd7f]|\ud83c\udd8e|\ud83c[\udd91-\udd9a]|\ud83c[\udde6-\uddff]|[\ud83c[\ude01-\ude02]|\ud83c\ude1a|\ud83c\ude2f|[\ud83c[\ude32-\ude3a]|[\ud83c[\ude50-\ude51]|\u203c|\u2049|[\u25aa-\u25ab]|\u25b6|\u25c0|[\u25fb-\u25fe]|\u00a9|\u00ae|\u2122|\u2139|\ud83c\udc04|[\u2600-\u26FF]|\u2b05|\u2b06|\u2b07|\u2b1b|\u2b1c|\u2b50|\u2b55|\u231a|\u231b|\u2328|\u23cf|[\u23e9-\u23f3]|[\u23f8-\u23fa]|\ud83c\udccf|\u2934|\u2935|[\u2190-\u21ff])
它标识了许多其他答案未考虑的单一表情符号表情符号。有关此正则表达式如何工作的更多信息,请查看此文章。 https://medium.com/@thekevinscott/emojis-in-javascript-f693d0eb79fb#.enomgcu63
答案 6 :(得分:5)
假设您要求标准的Unicode表情符号范围(按供应商提供不同的块),您可以考虑这三个范围:
除了T.J.Crowder与我们分享的所有深思熟虑的解释之外,需要说的是从Java 7开始可以轻松匹配UTF-16编码的代理对。
看看文档:
http://docs.oracle.com/javase/7/docs/api/java/util/regex/Pattern.html
Unicode字符也可以通过使用其十六进制表示法(十六进制代码点值)直接表示为正则表达式,如construct \ x {...}中所述,例如可以指定补充字符U + 2011F as \ x {2011F},而不是代理对\ uD840 \ uDD1F的两个连续Unicode转义序列。
然而,如果你不能切换到Java 7,你可以扩展Guava提供的有价值的UnicodeEscaper。
这是一个例子的实现:
public class SimpleEscaper extends UnicodeEscaper
{
@Override
protected char[] escape(int codePoint)
{
if (0x1f000 >= codePoint && codePoint <= 0x1ffff)
{
return Integer.toHexString(codePoint).toCharArray();
}
return Character.toChars(codePoint);
}
}
答案 7 :(得分:3)
您也可以使用emoji4j库。
String emojiText = "A , and a became friends. For 's birthday party, they all had s, s, s and .";
EmojiUtils.removeAllEmojis(emojiText);//returns "A , and a became friends. For 's birthday party, they all had s, s, s and .
答案 8 :(得分:3)
有两种方法可以解决这个棘手问题。
第一个是使用第三方库,如emoji-java和emoji4j。这些都在上面提到。您可以轻松使用方法containsEmoji
或removesEmoji
等。在您自己的应用中,您需要使用这些库进行更新。
至于我,我想找到一个简单的解决方案来解决这个问题。
经过一整天的搜索,我发现了一个神奇的正则表达式:
"(?:[\uD83C\uDF00-\uD83D\uDDFF]|[\uD83E\uDD00-\uD83E\uDDFF]|[\uD83D\uDE00-\uD83D\uDE4F]|[\uD83D\uDE80-\uD83D\uDEFF]|[\u2600-\u26FF]\uFE0F?|[\u2700-\u27BF]\uFE0F?|\u24C2\uFE0F?|[\uD83C\uDDE6-\uD83C\uDDFF]{1,2}|[\uD83C\uDD70\uD83C\uDD71\uD83C\uDD7E\uD83C\uDD7F\uD83C\uDD8E\uD83C\uDD91-\uD83C\uDD9A]\uFE0F?|[\u0023\u002A\u0030-\u0039]\uFE0F?\u20E3|[\u2194-\u2199\u21A9-\u21AA]\uFE0F?|[\u2B05-\u2B07\u2B1B\u2B1C\u2B50\u2B55]\uFE0F?|[\u2934\u2935]\uFE0F?|[\u3030\u303D]\uFE0F?|[\u3297\u3299]\uFE0F?|[\uD83C\uDE01\uD83C\uDE02\uD83C\uDE1A\uD83C\uDE2F\uD83C\uDE32-\uD83C\uDE3A\uD83C\uDE50\uD83C\uDE51]\uFE0F?|[\u203C\u2049]\uFE0F?|[\u25AA\u25AB\u25B6\u25C0\u25FB-\u25FE]\uFE0F?|[\u00A9\u00AE]\uFE0F?|[\u2122\u2139]\uFE0F?|\uD83C\uDC04\uFE0F?|\uD83C\uDCCF\uFE0F?|[\u231A\u231B\u2328\u23CF\u23E9-\u23F3\u23F8-\u23FA]\uFE0F?)"
我在Java中测试过。它完美地解决了我的问题。
您可以在Github页面上查看:
https://github.com/zly394/EmojiRegex
注意:
@Eric Nakagawa提供的答案包含一些错误,无法正常操作。
答案 9 :(得分:3)
只是使用正则表达式来解决它:
s = s.replaceAll("\\p{So}+", "");
您可以在
中找到它http://www.regular-expressions.info/unicode.html
https://docs.oracle.com/javase/7/docs/api/java/lang/Character.html#OTHER_SYMBOL
答案 10 :(得分:3)
表情符号正则表达式
def test_eval():
W = tf.constant(10)
with tf.Session():
print(W.eval()) # 10
def test_eval_Variable():
W = tf.Variable(10)
with tf.Session() as sess:
print(sess.run(W.initializer)) # None <--- 1.
print(W.eval()) # 10
def test_eval_Variable_all():
W = tf.Variable(10)
with tf.Session():
print(W.initializer.eval()) # error: object has no attribute 'eval'
print(W.eval())
一些表情符号(1627)
public static final String sEmojiRegex = "(?:[\\u2700-\\u27bf]|" +
"(?:[\\ud83c\\udde6-\\ud83c\\uddff]){2}|" +
"[\\ud800\\udc00-\\uDBFF\\uDFFF]|[\\u2600-\\u26FF])[\\ufe0e\\ufe0f]?(?:[\\u0300-\\u036f\\ufe20-\\ufe23\\u20d0-\\u20f0]|[\\ud83c\\udffb-\\ud83c\\udfff])?" +
"(?:\\u200d(?:[^\\ud800-\\udfff]|" +
"(?:[\\ud83c\\udde6-\\ud83c\\uddff]){2}|" +
"[\\ud800\\udc00-\\uDBFF\\uDFFF]|[\\u2600-\\u26FF])[\\ufe0e\\ufe0f]?(?:[\\u0300-\\u036f\\ufe20-\\ufe23\\u20d0-\\u20f0]|[\\ud83c\\udffb-\\ud83c\\udfff])?)*|" +
"[\\u0023-\\u0039]\\ufe0f?\\u20e3|\\u3299|\\u3297|\\u303d|\\u3030|\\u24c2|[\\ud83c\\udd70-\\ud83c\\udd71]|[\\ud83c\\udd7e-\\ud83c\\udd7f]|\\ud83c\\udd8e|[\\ud83c\\udd91-\\ud83c\\udd9a]|[\\ud83c\\udde6-\\ud83c\\uddff]|[\\ud83c\\ude01-\\ud83c\\ude02]|\\ud83c\\ude1a|\\ud83c\\ude2f|[\\ud83c\\ude32-\\ud83c\\ude3a]|[\\ud83c\\ude50-\\ud83c\\ude51]|\\u203c|\\u2049|[\\u25aa-\\u25ab]|\\u25b6|\\u25c0|[\\u25fb-\\u25fe]|\\u00a9|\\u00ae|\\u2122|\\u2139|\\ud83c\\udc04|[\\u2600-\\u26FF]|\\u2b05|\\u2b06|\\u2b07|\\u2b1b|\\u2b1c|\\u2b50|\\u2b55|\\u231a|\\u231b|\\u2328|\\u23cf|[\\u23e9-\\u23f3]|[\\u23f8-\\u23fa]|\\ud83c\\udccf|\\u2934|\\u2935|[\\u2190-\\u21ff]";
测试表情符号的功能
// count = 1627
public static final String sEmojiTest = "☺️☹️☠️✊✌️☝️✋✍️♀♀♀♀♀️♀️⚕⚕✈✈⚖⚖♀♂♂♂♂♀♂♀♂♂♂♂♂♂♀♀❤️❤️❤️❤️⛑☂️☘️⭐️✨⚡️☄☀️⛅️☁️⛈☃️⛄️❄️☔️☕️⚽️⚾️⛳️⛸⛷️♀️♀♂♀♂⛹️♀️⛹♀♂️♀️♀♀♀♂♀♀♀♀♂✈️⛵️⛴⚓️⛽️⛲️⛱⛰⛺️⛪️⛩⌚️⌨️☎️⏱⏲⏰⌛️⏳⚖️⚒⛏⚙️⛓⚔️⚰️⚱️⚗️✉️✂️✒️✏️❤️❣️☮️✝️☪️☸️✡️☯️☦️⛎♈️♉️♊️♋️♌️♍️♎️♏️♐️♑️♒️♓️⚛️☢️☣️️️✴️㊙️㊗️️️️❌⭕️⛔️♨️❗️❕❓❔‼️⁉️〽️⚠️⚜️♻️✅️❇️✳️❎Ⓜ️♿️️️ℹ️0️⃣1️⃣2️⃣3️⃣4️⃣5️⃣6️⃣7️⃣8️⃣9️⃣#️⃣*️⃣▶️⏸⏯⏹⏺⏭⏮⏩⏪⏫⏬◀️➡️⬅️⬆️⬇️↗️↘️↙️↖️↕️↔️↪️↩️⤴️⤵️➕➖➗✖️™️©️®️〰️➰➿✔️☑️⚪️⚫️▪️▫️◾️◽️◼️◻️⬛️⬜️♠️♣️♥️♦️️️️⚽️⚾️⛳️⛸⛷️♀️♀️♀️♀️♀️♀️️♀️♂️♀️♀️♀️♀️♀️♀️♂️♂️♂️♂️♂️♂️⛹️♀️⛹♀️⛹♀️⛹♀️⛹♀️⛹♀️⛹️⛹⛹⛹⛹⛹♀️♀️♀️♀️♀️♀️♂️♂️♂️♂️♂️♂️️♀️♀️♀️♀️♀️♀️️♀️♀️♀️♀️♀️♀️♀️♀️♀️♀️♀️♀️♀️♀️♀️♀️♀️♀️♂️♂️♂️♂️♂️♂️♀️♀️♀️♀️♀️♀️♀️♀️♀️♀️♀️♀️♀️♀️♀️♀️♀️♀️♀️♂️";
Here是要点,在所有unicode 10 emojis上进行测试
感谢Kevin Scott撰写greate示例
答案 11 :(得分:2)
这是我用来删除表情符号的内容,到目前为止它已经显示允许所有其他字母表。
private static String remove_Emojis(String name)
{
//we will store all the letters in this array
ArrayList<Character> nonEmoji = new ArrayList<>();
// and when we rebuild the name we will put it in here
String newName = "";
// we are going to loop through checking each character to see if its an emoji or not
for (int i = 0; i < name.length(); i++)
{
if (Character.isLetterOrDigit(name.charAt(i)))
{
nonEmoji.add(name.charAt(i));
}
else
{
// this is just a 2nd check in case the other method didn't allow some letter
if (Build.VERSION.SDK_INT > 18)
{
if (Character.isAlphabetic(name.charAt(i)))
{
nonEmoji.add(name.charAt(i));
}
}
}
if (name.charAt(i) == ' ')// may want to consider adding or '-' or '\''
{
nonEmoji.add(i);// just add it
}
if (name.charAt(i) == '@' && !name.contains(" "))// I put this in for email addresses
{
nonEmoji.add('@');
}
}
// finally just loop through building it back out
for (int i = 0; i < nonEmoji.size(); i++) {
newName += nonEmoji.get(i);
}
return newName;
}
答案 12 :(得分:1)
这是一个更简单的方法和一个正则表达式,可以正确解析(当前日期为 2021 年 5 月)所有 3,521 个表情符号。
这是一种以编程方式构建的简单交替工作,首先匹配最长的表情符号,从而避免了大量建议模式出现的问题,即复合表情符号中的部分匹配。 (例如:??❤️??? - 因为这是几个用零宽度连接器(U+200D
)粘合在一起的表情符号,所以你需要匹配更长的序列而不对组件进行部分匹配)
为了让模式足够短,可以粘贴在这里,我们大胆地使用了文字表情符号,但 unicode 转义也同样有效(有关演示和源代码,请参见底部的链接):
import java.util.regex.Matcher;
import java.util.regex.Pattern;
public class MyClass {
public static void main(String args[]) {
String line = "Adds ? word-relevant ? emojis ? ❤ to ? text ✨ with ? sometimes ?? hilarious ? ? results ?. Read more ?? about??❤️??? ma?tching compound emojis";
String pattern = "(?:??❤️???|??❤️???|??❤️???|??❤️???|??❤️???|??❤️???|??❤️???|??❤️???|??❤️???|??❤️???|??❤️???|??❤️???|??❤️???|??❤️???|??❤️???|??❤️???|??❤️???|??❤️???|??❤️???|??❤️???|??❤️???|??❤️???|??❤️???|??❤️???|??❤️???|??❤️???|??❤️???|??❤️???|??❤️???|??❤️???|??❤️???|??❤️???|??❤️???|??❤️???|??❤️???|??❤️???|??❤️???|??❤️???|??❤️???|??❤️???|??❤️???|??❤️???|??❤️???|??❤️???|??❤️???|??❤️???|??❤️???|??❤️???|??❤️???|??❤️???|??❤️???|??❤️???|??❤️???|??❤️???|??❤️???|??❤️???|??❤️???|??❤️???|??❤️???|??❤️???|??❤️???|??❤️???|??❤️???|??❤️???|??❤️???|??❤️???|??❤️???|??❤️???|??❤️???|??❤️???|??❤️???|??❤️???|??❤️???|??❤️???|??❤️???|??❤️???|??❤️???|??❤️???|??❤️???|??❤️???|??❤️???|??❤️???|??❤️???|??❤️???|??❤️???|??❤️???|??❤️???|??❤️???|??❤️???|??❤️???|??❤️???|??❤️???|??❤️???|??❤️???|??❤️???|???????|???????|???????|?????|?????|?????|?????|?????|?????|?????|?????|?????|?????|?????|?????|?????|?????|?????|?????|?????|?????|?????|?????|?????|?????|?????|?????|?????|?????|?????|?????|?????|?????|?????|?????|?????|?????|?????|?????|?????|?????|?????|?????|?????|?????|?????|?????|?????|?????|?????|?????|?????|?????|?????|?????|?????|?????|?????|?????|?????|?????|?????|?????|?????|?????|?????|?????|?????|?????|?????|?????|?????|?????|?????|?????|?????|?????|?????|?????|?????|?????|?????|?????|?????|?????|?????|?????|?????|??❤️??|??❤️??|??❤️??|??❤️??|??❤️??|??❤️??|??❤️??|??❤️??|??❤️??|??❤️??|??❤️??|??❤️??|??❤️??|??❤️??|??❤️??|??❤️??|??❤️??|??❤️??|??❤️??|??❤️??|??❤️??|??❤️??|??❤️??|??❤️??|??❤️??|??❤️??|??❤️??|??❤️??|??❤️??|??❤️??|??❤️??|??❤️??|??❤️??|??❤️??|??❤️??|??❤️??|??❤️??|??❤️??|??❤️??|??❤️??|??❤️??|??❤️??|??❤️??|??❤️??|??❤️??|??❤️??|??❤️??|??❤️??|??❤️??|??❤️??|??❤️??|??❤️??|??❤️??|??❤️??|??❤️??|??❤️??|??❤️??|??❤️??|??❤️??|??❤️??|??❤️??|??❤️??|??❤️??|??❤️??|??❤️??|??❤️??|??❤️??|??❤️??|??❤️??|??❤️??|??❤️??|??❤️??|??❤️??|??❤️??|??❤️??|??❤️??|??❤️??|??❤️??|??❤️??|??❤️??|??❤️??|??❤️??|??❤️??|??❤️??|??❤️??|??❤️??|??❤️??|??❤️??|??❤️??|??❤️??|??❤️??|??❤️??|??❤️??|??❤️??|??❤️??|?❤️??|?❤️??|?❤️??|????|????|????|????|????|????|????|????|????|???|?❤️?|?❤️?|?❤️?|???|???|???|???|???|???|???|???|???|???|???|???|?️?️|??♂️|??♂️|??♂️|??♂️|??♂️|??♀️|??♀️|??♀️|??♀️|??♀️|???|???|???|???|???|???|???|???|???|???|???|???|???|???|???|???|???|???|???|???|???|???|???|???|???|???|???|???|???|???|???|???|???|???|???|???|???|???|???|???|???|???|???|???|???|???|???|???|???|???|???|???|???|???|???|???|???|???|???|???|??♀️|??♀️|??♀️|??♀️|??♀️|??♂️|??♂️|??♂️|??♂️|??♂️|??♂️|??♂️|??♂️|??♂️|??♂️|??♀️|??♀️|??♀️|??♀️|??♀️|??♂️|??♂️|??♂️|??♂️|??♂️|??♀️|??♀️|??♀️|??♀️|??♀️|??♂️|??♂️|??♂️|??♂️|??♂️|??♀️|??♀️|??♀️|??♀️|??♀️|??♂️|??♂️|??♂️|??♂️|??♂️|??♀️|??♀️|??♀️|??♀️|??♀️|??♂️|??♂️|??♂️|??♂️|??♂️|??♀️|??♀️|??♀️|??♀️|??♀️|??♂️|??♂️|??♂️|??♂️|??♂️|??♀️|??♀️|??♀️|??♀️|??♀️|??♂️|??♂️|??♂️|??♂️|??♂️|??♀️|??♀️|??♀️|??♀️|??♀️|??♂️|??♂️|??♂️|??♂️|??♂️|??♀️|??♀️|??♀️|??♀️|??♀️|??♂️|??♂️|??♂️|??♂️|??♂️|??♀️|??♀️|??♀️|??♀️|??♀️|??♂️|??♂️|??♂️|??♂️|??♂️|??♀️|??♀️|??♀️|??♀️|??♀️|??⚕️|??⚕️|??⚕️|??⚕️|??⚕️|??⚕️|??⚕️|??⚕️|??⚕️|??⚕️|??⚕️|??⚕️|??⚕️|??⚕️|??⚕️|???|???|???|???|???|???|???|???|???|???|???|???|???|???|???|???|???|???|???|???|???|???|???|???|???|???|???|???|???|???|??⚖️|??⚖️|??⚖️|??⚖️|??⚖️|??⚖️|??⚖️|??⚖️|??⚖️|??⚖️|??⚖️|??⚖️|??⚖️|??⚖️|??⚖️|???|???|???|???|???|???|???|???|???|???|???|???|???|???|???|???|???|???|???|???|???|???|???|???|???|???|???|???|???|???|???|???|???|???|???|???|???|???|???|???|???|???|???|???|???|???|???|???|???|???|???|???|???|???|???|???|???|???|???|???|???|???|???|???|???|???|???|???|???|???|???|???|???|???|???|???|???|???|???|???|???|???|???|???|???|???|???|???|???|???|???|???|???|???|???|???|???|???|???|???|???|???|???|???|???|???|???|???|???|???|???|???|???|???|???|???|???|???|???|???|???|???|???|???|???|???|???|???|???|???|???|???|???|???|???|??✈️|??✈️|??✈️|??✈️|??✈️|??✈️|??✈️|??✈️|??✈️|??✈️|??✈️|??✈️|??✈️|??✈️|??✈️|???|???|???|???|???|???|???|???|???|???|???|???|???|???|???|???|???|???|???|???|???|???|???|???|???|???|???|???|???|???|??♂️|??♂️|??♂️|??♂️|??♂️|??♀️|??♀️|??♀️|??♀️|??♀️|??♂️|??♂️|??♂️|??♂️|??♂️|??♀️|??♀️|??♀️|??♀️|??♀️|??♂️|??♂️|??♂️|??♂️|??♂️|??♀️|??♀️|??♀️|??♀️|??♀️|??♂️|??♂️|??♂️|??♂️|??♂️|??♀️|??♀️|??♀️|??♀️|??♀️|??♂️|??♂️|??♂️|??♂️|??♂️|??♀️|??♀️|??♀️|??♀️|??♀️|??♂️|??♂️|??♂️|??♂️|??♂️|??♀️|??♀️|??♀️|??♀️|??♀️|??♂️|??♂️|??♂️|??♂️|??♂️|??♀️|??♀️|??♀️|??♀️|??♀️|???|???|???|???|???|???|???|???|???|???|???|???|???|???|???|???|???|???|???|???|??♂️|??♂️|??♂️|??♂️|??♂️|??♀️|??♀️|??♀️|??♀️|??♀️|??♂️|??♂️|??♂️|??♂️|??♂️|??♀️|??♀️|??♀️|??♀️|??♀️|??♂️|??♂️|??♂️|??♂️|??♂️|??♀️|??♀️|??♀️|??♀️|??♀️|??♂️|??♂️|??♂️|??♂️|??♂️|??♀️|??♀️|??♀️|??♀️|??♀️|??♂️|??♂️|??♂️|??♂️|??♂️|??♀️|??♀️|??♀️|??♀️|??♀️|??♂️|??♂️|??♂️|??♂️|??♂️|??♀️|??♀️|??♀️|??♀️|??♀️|??♂️|??♂️|??♂️|??♂️|??♂️|??♀️|??♀️|??♀️|??♀️|??♀️|??♂️|??♂️|??♂️|??♂️|??♂️|??♀️|??♀️|??♀️|??♀️|??♀️|??♂️|??♂️|??♂️|??♂️|??♂️|??♀️|??♀️|??♀️|??♀️|??♀️|??♂️|??♂️|??♂️|??♂️|??♂️|??♀️|??♀️|??♀️|??♀️|??♀️|??♂️|??♂️|??♂️|??♂️|??♂️|??♀️|??♀️|??♀️|??♀️|??♀️|??♂️|??♂️|??♂️|??♂️|??♂️|??♀️|??♀️|??♀️|??♀️|??♀️|???|???|???|???|???|???|???|???|???|???|???|???|???|???|???|???|???|???|???|???|???|???|???|???|???|???|???|???|???|???|???|???|???|???|???|???|???|???|???|???|???|???|???|???|???|??♂️|??♂️|??♂️|??♂️|??♂️|??♀️|??♀️|??♀️|??♀️|??♀️|??♂️|??♂️|??♂️|??♂️|??♂️|??♀️|??♀️|??♀️|??♀️|??♀️|??♂️|??♂️|??♂️|??♂️|??♂️|??♀️|??♀️|??♀️|??♀️|??♀️|??♂️|??♂️|??♂️|??♂️|??♂️|??♀️|??♀️|??♀️|??♀️|??♀️|??♂️|??♂️|??♂️|??♂️|??♂️|??♀️|??♀️|??♀️|??♀️|??♀️|??♂️|??♂️|??♂️|??♂️|??♂️|??♀️|??♀️|??♀️|??♀️|??♀️|??♂️|??♂️|??♂️|??♂️|??♂️|??♀️|??♀️|??♀️|??♀️|??♀️|??♂️|??♂️|??♂️|??♂️|??♂️|??♀️|??♀️|??♀️|??♀️|??♀️|??♂️|??♂️|??♂️|??♂️|??♂️|??♀️|??♀️|??♀️|??♀️|??♀️|??♂️|??♂️|??♂️|??♂️|??♂️|??♀️|??♀️|??♀️|??♀️|??♀️|??♂️|??♂️|??♂️|??♂️|??♂️|??♀️|??♀️|??♀️|??♀️|??♀️|??♂️|??♂️|??♂️|??♂️|??♂️|??♀️|??♀️|??♀️|??♀️|??♀️|??♂️|??♂️|??♂️|??♂️|??♂️|??♀️|??♀️|??♀️|??♀️|??♀️|??♂️|??♂️|??♂️|??♂️|??♂️|??♀️|??♀️|??♀️|??♀️|??♀️|??♂️|??♂️|??♂️|??♂️|??♂️|??♀️|??♀️|??♀️|??♀️|??♀️|??️|?️♂️|?️♀️|?️♂️|?️♀️|?️♂️|?️♀️|?️?|?️⚧️|⛹?♂️|⛹?♂️|⛹?♂️|⛹?♂️|⛹?♂️|⛹?♀️|⛹?♀️|⛹?♀️|⛹?♀️|⛹?♀️|??|??|❤️?|❤️?|?♂️|?♀️|??|??|??|??|??|??|??|??|??|??|??|??|?♀️|?♂️|?♂️|?♀️|?♂️|?♀️|?♂️|?♀️|?♂️|?♀️|?♂️|?♀️|?♂️|?♀️|?♂️|?♀️|?♂️|?♀️|?♂️|?♀️|?♂️|?♀️|?⚕️|?⚕️|?⚕️|??|??|??|??|??|??|?⚖️|?⚖️|?⚖️|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|?✈️|?✈️|?✈️|??|??|??|??|??|??|?♂️|?♀️|?♂️|?♀️|?♂️|?♀️|?♂️|?♀️|?♂️|?♀️|?♂️|?♀️|??|??|??|??|?♂️|?♀️|?♂️|?♀️|?♂️|?♀️|?♂️|?♀️|?♂️|?♀️|?♂️|?♀️|?♂️|?♀️|?♂️|?♀️|?♂️|?♀️|?♂️|?♀️|?♂️|?♀️|?♂️|?♀️|?♂️|?♀️|?♂️|?♀️|??|??|??|??|??|??|??|??|??|?♂️|?♀️|?♂️|?♀️|?♂️|?♀️|?♂️|?♀️|?♂️|?♀️|?♂️|?♀️|?♂️|?♀️|⛹️♂️|⛹️♀️|?♂️|?♀️|?♂️|?♀️|?♂️|?♀️|?♂️|?♀️|?♂️|?♀️|?♂️|?♀️|?♂️|?♀️|?♂️|?♀️|??|??|??|??|??|?❄️|?☠️|?⬛|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|??|#️⃣|0️⃣|1️⃣|2️⃣|3️⃣|4️⃣|5️⃣|6️⃣|7️⃣|8️⃣|9️⃣|✋?|✋?|✋?|✋?|✋?|✌?|✌?|✌?|✌?|✌?|☝?|☝?|☝?|☝?|☝?|✊?|✊?|✊?|✊?|✊?|✍?|✍?|✍?|✍?|✍?|⛹?|⛹?|⛹?|⛹?|⛹?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|?|☺|☹|☠|❣|❤|✋|✌|☝|✊|✍|⛷|⛹|☘|☕|⛰|⛪|⛩|⛲|⛺|♨|⛽|⚓|⛵|⛴|✈|⌛|⏳|⌚|⏰|⏱|⏲|☀|⭐|☁|⛅|⛈|☂|☔|⛱|⚡|❄|☃|⛄|☄|✨|⚽|⚾|⛳|⛸|♠|♥|♦|♣|♟|⛑|☎|⌨|✉|✏|✒|✂|⛏|⚒|⚔|⚙|⚖|⛓|⚗|⚰|⚱|♿|⚠|⛔|☢|☣|⬆|↗|➡|↘|⬇|↙|⬅|↖|↕|↔|↩|↪|⤴|⤵|⚛|✡|☸|☯|✝|☦|☪|☮|♈|♉|♊|♋|♌|♍|♎|♏|♐|♑|♒|♓|⛎|▶|⏩|⏭|⏯|◀|⏪|⏮|⏫|⏬|⏸|⏹|⏺|⏏|♀|♂|⚧|✖|➕|➖|➗|♾|‼|⁉|❓|❔|❕|❗|〰|⚕|♻|⚜|⭕|✅|☑|✔|❌|❎|➰|➿|〽|✳|✴|❇|©|®|™|ℹ|Ⓜ|㊗|㊙|⚫|⚪|⬛|⬜|◼|◻|◾|◽|▪|▫)";
var i = 0;
Pattern r = Pattern.compile(pattern);
Matcher m = r.matcher(line);
while(m.find( )) {
i++;
System.out.println("Found value: " + m.group(0) );
}
System.out.println("Found " + i + " emojis." );
}
}
更多信息:
https://github.com/sweaver2112/Regex-combined-emojis
答案 13 :(得分:0)
只要规格发生变化,您就可以生成自己的正则表达式 此工具(屏幕截图here)。
对于utf-8/32模式(stringed),扩展模式:
" # Use the 'Mega-Conversion' tool to change into other syntaxes"
" # -------------------------------------------------------------"
" "
" [#*0-9] \\x{FE0F} \\x{20E3}"
" | [\\x{A9}\\x{AE}\\x{203C}\\x{2049}\\x{2122}\\x{2139}\\x{2194}-\\x{2199}\\x{21A9}\\x{21AA}\\x{231A}\\x{231B}\\x{2328}\\x{23CF}\\x{23E9}-\\x{23F3}\\x{23F8}-\\x{23FA}\\x{24C2}\\x{25AA}\\x{25AB}\\x{25B6}\\x{25C0}\\x{25FB}-\\x{25FE}\\x{2600}-\\x{2604}\\x{260E}\\x{2611}\\x{2614}\\x{2615}\\x{2618}]"
" | \\x{261D} [\\x{1F3FB}-\\x{1F3FF}]?"
" | [\\x{2620}\\x{2622}\\x{2623}\\x{2626}\\x{262A}\\x{262E}\\x{262F}\\x{2638}-\\x{263A}\\x{2640}\\x{2642}\\x{2648}-\\x{2653}\\x{265F}\\x{2660}\\x{2663}\\x{2665}\\x{2666}\\x{2668}\\x{267B}\\x{267E}\\x{267F}\\x{2692}-\\x{2697}\\x{2699}\\x{269B}\\x{269C}\\x{26A0}\\x{26A1}\\x{26AA}\\x{26AB}\\x{26B0}\\x{26B1}\\x{26BD}\\x{26BE}\\x{26C4}\\x{26C5}\\x{26C8}\\x{26CE}\\x{26CF}\\x{26D1}\\x{26D3}\\x{26D4}\\x{26E9}\\x{26EA}\\x{26F0}-\\x{26F5}\\x{26F7}\\x{26F8}]"
" | \\x{26F9}"
" (?:"
" \\x{FE0F} \\x{200D} [\\x{2640}\\x{2642}] \\x{FE0F}"
" | [\\x{1F3FB}-\\x{1F3FF}]"
" (?: \\x{200D} [\\x{2640}\\x{2642}] \\x{FE0F} )?"
" )?"
" | [\\x{26FA}\\x{26FD}\\x{2702}\\x{2705}\\x{2708}\\x{2709}]"
" | [\\x{270A}-\\x{270D}] [\\x{1F3FB}-\\x{1F3FF}]?"
" | [\\x{270F}\\x{2712}\\x{2714}\\x{2716}\\x{271D}\\x{2721}\\x{2728}\\x{2733}\\x{2734}\\x{2744}\\x{2747}\\x{274C}\\x{274E}\\x{2753}-\\x{2755}\\x{2757}\\x{2763}\\x{2764}\\x{2795}-\\x{2797}\\x{27A1}\\x{27B0}\\x{27BF}\\x{2934}\\x{2935}\\x{2B05}-\\x{2B07}\\x{2B1B}\\x{2B1C}\\x{2B50}\\x{2B55}\\x{3030}\\x{303D}\\x{3297}\\x{3299}\\x{1F004}\\x{1F0CF}\\x{1F170}\\x{1F171}\\x{1F17E}\\x{1F17F}\\x{1F18E}\\x{1F191}-\\x{1F19A}]"
" | \\x{1F1E6} [\\x{1F1E8}-\\x{1F1EC}\\x{1F1EE}\\x{1F1F1}\\x{1F1F2}\\x{1F1F4}\\x{1F1F6}-\\x{1F1FA}\\x{1F1FC}\\x{1F1FD}\\x{1F1FF}]"
" | \\x{1F1E7} [\\x{1F1E6}\\x{1F1E7}\\x{1F1E9}-\\x{1F1EF}\\x{1F1F1}-\\x{1F1F4}\\x{1F1F6}-\\x{1F1F9}\\x{1F1FB}\\x{1F1FC}\\x{1F1FE}\\x{1F1FF}]"
" | \\x{1F1E8} [\\x{1F1E6}\\x{1F1E8}\\x{1F1E9}\\x{1F1EB}-\\x{1F1EE}\\x{1F1F0}-\\x{1F1F5}\\x{1F1F7}\\x{1F1FA}-\\x{1F1FF}]"
" | \\x{1F1E9} [\\x{1F1EA}\\x{1F1EC}\\x{1F1EF}\\x{1F1F0}\\x{1F1F2}\\x{1F1F4}\\x{1F1FF}]"
" | \\x{1F1EA} [\\x{1F1E6}\\x{1F1E8}\\x{1F1EA}\\x{1F1EC}\\x{1F1ED}\\x{1F1F7}-\\x{1F1FA}]"
" | \\x{1F1EB} [\\x{1F1EE}-\\x{1F1F0}\\x{1F1F2}\\x{1F1F4}\\x{1F1F7}]"
" | \\x{1F1EC} [\\x{1F1E6}\\x{1F1E7}\\x{1F1E9}-\\x{1F1EE}\\x{1F1F1}-\\x{1F1F3}\\x{1F1F5}-\\x{1F1FA}\\x{1F1FC}\\x{1F1FE}]"
" | \\x{1F1ED} [\\x{1F1F0}\\x{1F1F2}\\x{1F1F3}\\x{1F1F7}\\x{1F1F9}\\x{1F1FA}]"
" | \\x{1F1EE} [\\x{1F1E8}-\\x{1F1EA}\\x{1F1F1}-\\x{1F1F4}\\x{1F1F6}-\\x{1F1F9}]"
" | \\x{1F1EF} [\\x{1F1EA}\\x{1F1F2}\\x{1F1F4}\\x{1F1F5}]"
" | \\x{1F1F0} [\\x{1F1EA}\\x{1F1EC}-\\x{1F1EE}\\x{1F1F2}\\x{1F1F3}\\x{1F1F5}\\x{1F1F7}\\x{1F1FC}\\x{1F1FE}\\x{1F1FF}]"
" | \\x{1F1F1} [\\x{1F1E6}-\\x{1F1E8}\\x{1F1EE}\\x{1F1F0}\\x{1F1F7}-\\x{1F1FB}\\x{1F1FE}]"
" | \\x{1F1F2} [\\x{1F1E6}\\x{1F1E8}-\\x{1F1ED}\\x{1F1F0}-\\x{1F1FF}]"
" | \\x{1F1F3} [\\x{1F1E6}\\x{1F1E8}\\x{1F1EA}-\\x{1F1EC}\\x{1F1EE}\\x{1F1F1}\\x{1F1F4}\\x{1F1F5}\\x{1F1F7}\\x{1F1FA}\\x{1F1FF}]"
" | \\x{1F1F4} \\x{1F1F2}"
" | \\x{1F1F5} [\\x{1F1E6}\\x{1F1EA}-\\x{1F1ED}\\x{1F1F0}-\\x{1F1F3}\\x{1F1F7}-\\x{1F1F9}\\x{1F1FC}\\x{1F1FE}]"
" | \\x{1F1F6} \\x{1F1E6}"
" | \\x{1F1F7} [\\x{1F1EA}\\x{1F1F4}\\x{1F1F8}\\x{1F1FA}\\x{1F1FC}]"
" | \\x{1F1F8} [\\x{1F1E6}-\\x{1F1EA}\\x{1F1EC}-\\x{1F1F4}\\x{1F1F7}-\\x{1F1F9}\\x{1F1FB}\\x{1F1FD}-\\x{1F1FF}]"
" | \\x{1F1F9} [\\x{1F1E6}\\x{1F1E8}\\x{1F1E9}\\x{1F1EB}-\\x{1F1ED}\\x{1F1EF}-\\x{1F1F4}\\x{1F1F7}\\x{1F1F9}\\x{1F1FB}\\x{1F1FC}\\x{1F1FF}]"
" | \\x{1F1FA} [\\x{1F1E6}\\x{1F1EC}\\x{1F1F2}\\x{1F1F3}\\x{1F1F8}\\x{1F1FE}\\x{1F1FF}]"
" | \\x{1F1FB} [\\x{1F1E6}\\x{1F1E8}\\x{1F1EA}\\x{1F1EC}\\x{1F1EE}\\x{1F1F3}\\x{1F1FA}]"
" | \\x{1F1FC} [\\x{1F1EB}\\x{1F1F8}]"
" | \\x{1F1FD} \\x{1F1F0}"
" | \\x{1F1FE} [\\x{1F1EA}\\x{1F1F9}]"
" | \\x{1F1FF} [\\x{1F1E6}\\x{1F1F2}\\x{1F1FC}]"
" | [\\x{1F201}\\x{1F202}\\x{1F21A}\\x{1F22F}\\x{1F232}-\\x{1F23A}\\x{1F250}\\x{1F251}\\x{1F300}-\\x{1F321}\\x{1F324}-\\x{1F384}]"
" | \\x{1F385} [\\x{1F3FB}-\\x{1F3FF}]?"
" | [\\x{1F386}-\\x{1F393}\\x{1F396}\\x{1F397}\\x{1F399}-\\x{1F39B}\\x{1F39E}-\\x{1F3C1}]"
" | \\x{1F3C2} [\\x{1F3FB}-\\x{1F3FF}]?"
" | [\\x{1F3C3}\\x{1F3C4}]"
" (?:"
" \\x{200D} [\\x{2640}\\x{2642}] \\x{FE0F}"
" | [\\x{1F3FB}-\\x{1F3FF}]"
" (?: \\x{200D} [\\x{2640}\\x{2642}] \\x{FE0F} )?"
" )?"
" | [\\x{1F3C5}\\x{1F3C6}]"
" | \\x{1F3C7} [\\x{1F3FB}-\\x{1F3FF}]?"
" | [\\x{1F3C8}\\x{1F3C9}]"
" | \\x{1F3CA}"
" (?:"
" \\x{200D} [\\x{2640}\\x{2642}] \\x{FE0F}"
" | [\\x{1F3FB}-\\x{1F3FF}]"
" (?: \\x{200D} [\\x{2640}\\x{2642}] \\x{FE0F} )?"
" )?"
" | [\\x{1F3CB}\\x{1F3CC}]"
" (?:"
" \\x{FE0F} \\x{200D} [\\x{2640}\\x{2642}] \\x{FE0F}"
" | [\\x{1F3FB}-\\x{1F3FF}]"
" (?: \\x{200D} [\\x{2640}\\x{2642}] \\x{FE0F} )?"
" )?"
" | [\\x{1F3CD}-\\x{1F3F0}]"
" | \\x{1F3F3}"
" (?: \\x{FE0F} \\x{200D} \\x{1F308} )?"
" | \\x{1F3F4}"
" (?:"
" \\x{200D} \\x{2620} \\x{FE0F}"
" | \\x{E0067} \\x{E0062}"
" (?:"
" \\x{E0065} \\x{E006E} \\x{E0067}"
" | \\x{E0073} \\x{E0063} \\x{E0074}"
" | \\x{E0077} \\x{E006C} \\x{E0073}"
" )"
" \\x{E007F}"
" )?"
" | [\\x{1F3F5}\\x{1F3F7}-\\x{1F440}]"
" | \\x{1F441}"
" (?: \\x{FE0F} \\x{200D} \\x{1F5E8} \\x{FE0F} )?"
" | [\\x{1F442}\\x{1F443}] [\\x{1F3FB}-\\x{1F3FF}]?"
" | [\\x{1F444}\\x{1F445}]"
" | [\\x{1F446}-\\x{1F450}] [\\x{1F3FB}-\\x{1F3FF}]?"
" | [\\x{1F451}-\\x{1F465}]"
" | [\\x{1F466}\\x{1F467}] [\\x{1F3FB}-\\x{1F3FF}]?"
" | \\x{1F468}"
" (?:"
" \\x{200D}"
" (?:"
" [\\x{2695}\\x{2696}\\x{2708}] \\x{FE0F}"
" | \\x{2764} \\x{FE0F} \\x{200D}"
" (?: \\x{1F48B} \\x{200D} )?"
" \\x{1F468}"
" | [\\x{1F33E}\\x{1F373}\\x{1F393}\\x{1F3A4}\\x{1F3A8}\\x{1F3EB}\\x{1F3ED}]"
" | \\x{1F466}"
" (?: \\x{200D} \\x{1F466} )?"
" | \\x{1F467}"
" (?: \\x{200D} [\\x{1F466}\\x{1F467}] )?"
" | [\\x{1F468}\\x{1F469}] \\x{200D}"
" (?:"
" \\x{1F466}"
" (?: \\x{200D} \\x{1F466} )?"
" | \\x{1F467}"
" (?: \\x{200D} [\\x{1F466}\\x{1F467}] )?"
" )"
" | [\\x{1F4BB}\\x{1F4BC}\\x{1F527}\\x{1F52C}\\x{1F680}\\x{1F692}\\x{1F9B0}-\\x{1F9B3}]"
" )"
" | [\\x{1F3FB}-\\x{1F3FF}]"
" (?:"
" \\x{200D}"
" (?:"
" [\\x{2695}\\x{2696}\\x{2708}] \\x{FE0F}"
" | [\\x{1F33E}\\x{1F373}\\x{1F393}\\x{1F3A4}\\x{1F3A8}\\x{1F3EB}\\x{1F3ED}\\x{1F4BB}\\x{1F4BC}\\x{1F527}\\x{1F52C}\\x{1F680}\\x{1F692}\\x{1F9B0}-\\x{1F9B3}]"
" )"
" )?"
" )?"
" | \\x{1F469}"
" (?:"
" \\x{200D}"
" (?:"
" [\\x{2695}\\x{2696}\\x{2708}] \\x{FE0F}"
" | \\x{2764} \\x{FE0F} \\x{200D}"
" (?: \\x{1F48B} \\x{200D} )?"
" [\\x{1F468}\\x{1F469}]"
" | [\\x{1F33E}\\x{1F373}\\x{1F393}\\x{1F3A4}\\x{1F3A8}\\x{1F3EB}\\x{1F3ED}]"
" | \\x{1F466}"
" (?: \\x{200D} \\x{1F466} )?"
" | \\x{1F467}"
" (?: \\x{200D} [\\x{1F466}\\x{1F467}] )?"
" | \\x{1F469} \\x{200D}"
" (?:"
" \\x{1F466}"
" (?: \\x{200D} \\x{1F466} )?"
" | \\x{1F467}"
" (?: \\x{200D} [\\x{1F466}\\x{1F467}] )?"
" )"
" | [\\x{1F4BB}\\x{1F4BC}\\x{1F527}\\x{1F52C}\\x{1F680}\\x{1F692}\\x{1F9B0}-\\x{1F9B3}]"
" )"
" | [\\x{1F3FB}-\\x{1F3FF}]"
" (?:"
" \\x{200D}"
" (?:"
" [\\x{2695}\\x{2696}\\x{2708}] \\x{FE0F}"
" | [\\x{1F33E}\\x{1F373}\\x{1F393}\\x{1F3A4}\\x{1F3A8}\\x{1F3EB}\\x{1F3ED}\\x{1F4BB}\\x{1F4BC}\\x{1F527}\\x{1F52C}\\x{1F680}\\x{1F692}\\x{1F9B0}-\\x{1F9B3}]"
" )"
" )?"
" )?"
" | [\\x{1F46A}-\\x{1F46D}]"
" | \\x{1F46E}"
" (?:"
" \\x{200D} [\\x{2640}\\x{2642}] \\x{FE0F}"
" | [\\x{1F3FB}-\\x{1F3FF}]"
" (?: \\x{200D} [\\x{2640}\\x{2642}] \\x{FE0F} )?"
" )?"
" | \\x{1F46F}"
" (?: \\x{200D} [\\x{2640}\\x{2642}] \\x{FE0F} )?"
" | \\x{1F470} [\\x{1F3FB}-\\x{1F3FF}]?"
" | \\x{1F471}"
" (?:"
" \\x{200D} [\\x{2640}\\x{2642}] \\x{FE0F}"
" | [\\x{1F3FB}-\\x{1F3FF}]"
" (?: \\x{200D} [\\x{2640}\\x{2642}] \\x{FE0F} )?"
" )?"
" | \\x{1F472} [\\x{1F3FB}-\\x{1F3FF}]?"
" | \\x{1F473}"
" (?:"
" \\x{200D} [\\x{2640}\\x{2642}] \\x{FE0F}"
" | [\\x{1F3FB}-\\x{1F3FF}]"
" (?: \\x{200D} [\\x{2640}\\x{2642}] \\x{FE0F} )?"
" )?"
" | [\\x{1F474}-\\x{1F476}] [\\x{1F3FB}-\\x{1F3FF}]?"
" | \\x{1F477}"
" (?:"
" \\x{200D} [\\x{2640}\\x{2642}] \\x{FE0F}"
" | [\\x{1F3FB}-\\x{1F3FF}]"
" (?: \\x{200D} [\\x{2640}\\x{2642}] \\x{FE0F} )?"
" )?"
" | \\x{1F478} [\\x{1F3FB}-\\x{1F3FF}]?"
" | [\\x{1F479}-\\x{1F47B}]"
" | \\x{1F47C} [\\x{1F3FB}-\\x{1F3FF}]?"
" | [\\x{1F47D}-\\x{1F480}]"
" | [\\x{1F481}\\x{1F482}]"
" (?:"
" \\x{200D} [\\x{2640}\\x{2642}] \\x{FE0F}"
" | [\\x{1F3FB}-\\x{1F3FF}]"
" (?: \\x{200D} [\\x{2640}\\x{2642}] \\x{FE0F} )?"
" )?"
" | \\x{1F483} [\\x{1F3FB}-\\x{1F3FF}]?"
" | \\x{1F484}"
" | \\x{1F485} [\\x{1F3FB}-\\x{1F3FF}]?"
" | [\\x{1F486}\\x{1F487}]"
" (?:"
" \\x{200D} [\\x{2640}\\x{2642}] \\x{FE0F}"
" | [\\x{1F3FB}-\\x{1F3FF}]"
" (?: \\x{200D} [\\x{2640}\\x{2642}] \\x{FE0F} )?"
" )?"
" | [\\x{1F488}-\\x{1F4A9}]"
" | \\x{1F4AA} [\\x{1F3FB}-\\x{1F3FF}]?"
" | [\\x{1F4AB}-\\x{1F4FD}\\x{1F4FF}-\\x{1F53D}\\x{1F549}-\\x{1F54E}\\x{1F550}-\\x{1F567}\\x{1F56F}\\x{1F570}\\x{1F573}]"
" | \\x{1F574} [\\x{1F3FB}-\\x{1F3FF}]?"
" | \\x{1F575}"
" (?:"
" \\x{FE0F} \\x{200D} [\\x{2640}\\x{2642}] \\x{FE0F}"
" | [\\x{1F3FB}-\\x{1F3FF}]"
" (?: \\x{200D} [\\x{2640}\\x{2642}] \\x{FE0F} )?"
" )?"
" | [\\x{1F576}-\\x{1F579}]"
" | \\x{1F57A} [\\x{1F3FB}-\\x{1F3FF}]?"
" | [\\x{1F587}\\x{1F58A}-\\x{1F58D}]"
" | [\\x{1F590}\\x{1F595}\\x{1F596}] [\\x{1F3FB}-\\x{1F3FF}]?"
" | [\\x{1F5A4}\\x{1F5A5}\\x{1F5A8}\\x{1F5B1}\\x{1F5B2}\\x{1F5BC}\\x{1F5C2}-\\x{1F5C4}\\x{1F5D1}-\\x{1F5D3}\\x{1F5DC}-\\x{1F5DE}\\x{1F5E1}\\x{1F5E3}\\x{1F5E8}\\x{1F5EF}\\x{1F5F3}\\x{1F5FA}-\\x{1F644}]"
" | [\\x{1F645}-\\x{1F647}]"
" (?:"
" \\x{200D} [\\x{2640}\\x{2642}] \\x{FE0F}"
" | [\\x{1F3FB}-\\x{1F3FF}]"
" (?: \\x{200D} [\\x{2640}\\x{2642}] \\x{FE0F} )?"
" )?"
" | [\\x{1F648}-\\x{1F64A}]"
" | \\x{1F64B}"
" (?:"
" \\x{200D} [\\x{2640}\\x{2642}] \\x{FE0F}"
" | [\\x{1F3FB}-\\x{1F3FF}]"
" (?: \\x{200D} [\\x{2640}\\x{2642}] \\x{FE0F} )?"
" )?"
" | \\x{1F64C} [\\x{1F3FB}-\\x{1F3FF}]?"
" | [\\x{1F64D}\\x{1F64E}]"
" (?:"
" \\x{200D} [\\x{2640}\\x{2642}] \\x{FE0F}"
" | [\\x{1F3FB}-\\x{1F3FF}]"
" (?: \\x{200D} [\\x{2640}\\x{2642}] \\x{FE0F} )?"
" )?"
" | \\x{1F64F} [\\x{1F3FB}-\\x{1F3FF}]?"
" | [\\x{1F680}-\\x{1F6A2}]"
" | \\x{1F6A3}"
" (?:"
" \\x{200D} [\\x{2640}\\x{2642}] \\x{FE0F}"
" | [\\x{1F3FB}-\\x{1F3FF}]"
" (?: \\x{200D} [\\x{2640}\\x{2642}] \\x{FE0F} )?"
" )?"
" | [\\x{1F6A4}-\\x{1F6B3}]"
" | [\\x{1F6B4}-\\x{1F6B6}]"
" (?:"
" \\x{200D} [\\x{2640}\\x{2642}] \\x{FE0F}"
" | [\\x{1F3FB}-\\x{1F3FF}]"
" (?: \\x{200D} [\\x{2640}\\x{2642}] \\x{FE0F} )?"
" )?"
" | [\\x{1F6B7}-\\x{1F6BF}]"
" | \\x{1F6C0} [\\x{1F3FB}-\\x{1F3FF}]?"
" | [\\x{1F6C1}-\\x{1F6C5}\\x{1F6CB}]"
" | \\x{1F6CC} [\\x{1F3FB}-\\x{1F3FF}]?"
" | [\\x{1F6CD}-\\x{1F6D2}\\x{1F6E0}-\\x{1F6E5}\\x{1F6E9}\\x{1F6EB}\\x{1F6EC}\\x{1F6F0}\\x{1F6F3}-\\x{1F6F9}\\x{1F910}-\\x{1F917}]"
" | [\\x{1F918}-\\x{1F91C}] [\\x{1F3FB}-\\x{1F3FF}]?"
" | \\x{1F91D}"
" | [\\x{1F91E}\\x{1F91F}] [\\x{1F3FB}-\\x{1F3FF}]?"
" | [\\x{1F920}-\\x{1F925}]"
" | \\x{1F926}"
" (?:"
" \\x{200D} [\\x{2640}\\x{2642}] \\x{FE0F}"
" | [\\x{1F3FB}-\\x{1F3FF}]"
" (?: \\x{200D} [\\x{2640}\\x{2642}] \\x{FE0F} )?"
" )?"
" | [\\x{1F927}-\\x{1F92F}]"
" | [\\x{1F930}-\\x{1F936}] [\\x{1F3FB}-\\x{1F3FF}]?"
" | \\x{1F937}"
" (?:"
" \\x{200D} [\\x{2640}\\x{2642}] \\x{FE0F}"
" | [\\x{1F3FB}-\\x{1F3FF}]"
" (?: \\x{200D} [\\x{2640}\\x{2642}] \\x{FE0F} )?"
" )?"
" | [\\x{1F938}\\x{1F939}]"
" (?:"
" \\x{200D} [\\x{2640}\\x{2642}] \\x{FE0F}"
" | [\\x{1F3FB}-\\x{1F3FF}]"
" (?: \\x{200D} [\\x{2640}\\x{2642}] \\x{FE0F} )?"
" )?"
" | \\x{1F93A}"
" | \\x{1F93C}"
" (?: \\x{200D} [\\x{2640}\\x{2642}] \\x{FE0F} )?"
" | [\\x{1F93D}\\x{1F93E}]"
" (?:"
" \\x{200D} [\\x{2640}\\x{2642}] \\x{FE0F}"
" | [\\x{1F3FB}-\\x{1F3FF}]"
" (?: \\x{200D} [\\x{2640}\\x{2642}] \\x{FE0F} )?"
" )?"
" | [\\x{1F940}-\\x{1F945}\\x{1F947}-\\x{1F970}\\x{1F973}-\\x{1F976}\\x{1F97A}\\x{1F97C}-\\x{1F9A2}\\x{1F9B0}-\\x{1F9B4}]"
" | [\\x{1F9B5}\\x{1F9B6}] [\\x{1F3FB}-\\x{1F3FF}]?"
" | \\x{1F9B7}"
" | [\\x{1F9B8}\\x{1F9B9}]"
" (?:"
" \\x{200D} [\\x{2640}\\x{2642}] \\x{FE0F}"
" | [\\x{1F3FB}-\\x{1F3FF}]"
" (?: \\x{200D} [\\x{2640}\\x{2642}] \\x{FE0F} )?"
" )?"
" | [\\x{1F9C0}-\\x{1F9C2}\\x{1F9D0}]"
" | [\\x{1F9D1}-\\x{1F9D5}] [\\x{1F3FB}-\\x{1F3FF}]?"
" | \\x{1F9D6}"
" (?:"
" \\x{200D} [\\x{2640}\\x{2642}] \\x{FE0F}"
" | [\\x{1F3FB}-\\x{1F3FF}]"
" (?: \\x{200D} [\\x{2640}\\x{2642}] \\x{FE0F} )?"
" )?"
" | [\\x{1F9D7}-\\x{1F9DD}]"
" (?:"
" \\x{200D} [\\x{2640}\\x{2642}] \\x{FE0F}"
" | [\\x{1F3FB}-\\x{1F3FF}]"
" (?: \\x{200D} [\\x{2640}\\x{2642}] \\x{FE0F} )?"
" )?"
" | [\\x{1F9DE}\\x{1F9DF}]"
" (?: \\x{200D} [\\x{2640}\\x{2642}] \\x{FE0F} )?"
" | [\\x{1F9E0}-\\x{1F9FF}]"
对于utf-16模式(stringed),压缩模式:
"[#*0-9]\\uFE0F\\u20E3|[\\u00A9\\u00AE\\u203C\\u2049\\u2122\\u2139\\u2"
"194-\\u2199\\u21A9\\u21AA\\u231A\\u231B\\u2328\\u23CF\\u23E9-\\u23F3\\"
"u23F8-\\u23FA\\u24C2\\u25AA\\u25AB\\u25B6\\u25C0\\u25FB-\\u25FE\\u260"
"0-\\u2604\\u260E\\u2611\\u2614\\u2615\\u2618]|\\u261D(?:\\uD83C[\\uDF"
"FB-\\uDFFF])?|[\\u2620\\u2622\\u2623\\u2626\\u262A\\u262E\\u262F\\u26"
"38-\\u263A\\u2640\\u2642\\u2648-\\u2653\\u265F\\u2660\\u2663\\u2665\\u"
"2666\\u2668\\u267B\\u267E\\u267F\\u2692-\\u2697\\u2699\\u269B\\u269C\\"
"u26A0\\u26A1\\u26AA\\u26AB\\u26B0\\u26B1\\u26BD\\u26BE\\u26C4\\u26C5\\"
"u26C8\\u26CE\\u26CF\\u26D1\\u26D3\\u26D4\\u26E9\\u26EA\\u26F0-\\u26F5"
"\\u26F7\\u26F8]|\\u26F9(?:\\uD83C[\\uDFFB-\\uDFFF](?:\\u200D[\\u2640"
"\\u2642]\\uFE0F)?|\\uFE0F\\u200D[\\u2640\\u2642]\\uFE0F)?|[\\u26FA\\u"
"26FD\\u2702\\u2705\\u2708\\u2709]|[\\u270A-\\u270D](?:\\uD83C[\\uDFF"
"B-\\uDFFF])?|[\\u270F\\u2712\\u2714\\u2716\\u271D\\u2721\\u2728\\u273"
"3\\u2734\\u2744\\u2747\\u274C\\u274E\\u2753-\\u2755\\u2757\\u2763\\u27"
"64\\u2795-\\u2797\\u27A1\\u27B0\\u27BF\\u2934\\u2935\\u2B05-\\u2B07\\u"
"2B1B\\u2B1C\\u2B50\\u2B55\\u3030\\u303D\\u3297\\u3299]|\\uD83C(?:[\\u"
"DC04\\uDCCF\\uDD70\\uDD71\\uDD7E\\uDD7F\\uDD8E\\uDD91-\\uDD9A]|\\uDDE"
"6\\uD83C[\\uDDE8-\\uDDEC\\uDDEE\\uDDF1\\uDDF2\\uDDF4\\uDDF6-\\uDDFA\\u"
"DDFC\\uDDFD\\uDDFF]|\\uDDE7\\uD83C[\\uDDE6\\uDDE7\\uDDE9-\\uDDEF\\uDD"
"F1-\\uDDF4\\uDDF6-\\uDDF9\\uDDFB\\uDDFC\\uDDFE\\uDDFF]|\\uDDE8\\uD83C"
"[\\uDDE6\\uDDE8\\uDDE9\\uDDEB-\\uDDEE\\uDDF0-\\uDDF5\\uDDF7\\uDDFA-\\u"
"DDFF]|\\uDDE9\\uD83C[\\uDDEA\\uDDEC\\uDDEF\\uDDF0\\uDDF2\\uDDF4\\uDDF"
"F]|\\uDDEA\\uD83C[\\uDDE6\\uDDE8\\uDDEA\\uDDEC\\uDDED\\uDDF7-\\uDDFA]"
"|\\uDDEB\\uD83C[\\uDDEE-\\uDDF0\\uDDF2\\uDDF4\\uDDF7]|\\uDDEC\\uD83C["
"\\uDDE6\\uDDE7\\uDDE9-\\uDDEE\\uDDF1-\\uDDF3\\uDDF5-\\uDDFA\\uDDFC\\uD"
"DFE]|\\uDDED\\uD83C[\\uDDF0\\uDDF2\\uDDF3\\uDDF7\\uDDF9\\uDDFA]|\\uDD"
"EE\\uD83C[\\uDDE8-\\uDDEA\\uDDF1-\\uDDF4\\uDDF6-\\uDDF9]|\\uDDEF\\uD8"
"3C[\\uDDEA\\uDDF2\\uDDF4\\uDDF5]|\\uDDF0\\uD83C[\\uDDEA\\uDDEC-\\uDDE"
"E\\uDDF2\\uDDF3\\uDDF5\\uDDF7\\uDDFC\\uDDFE\\uDDFF]|\\uDDF1\\uD83C[\\u"
"DDE6-\\uDDE8\\uDDEE\\uDDF0\\uDDF7-\\uDDFB\\uDDFE]|\\uDDF2\\uD83C[\\uD"
"DE6\\uDDE8-\\uDDED\\uDDF0-\\uDDFF]|\\uDDF3\\uD83C[\\uDDE6\\uDDE8\\uDD"
"EA-\\uDDEC\\uDDEE\\uDDF1\\uDDF4\\uDDF5\\uDDF7\\uDDFA\\uDDFF]|\\uDDF4\\"
"uD83C\\uDDF2|\\uDDF5\\uD83C[\\uDDE6\\uDDEA-\\uDDED\\uDDF0-\\uDDF3\\uD"
"DF7-\\uDDF9\\uDDFC\\uDDFE]|\\uDDF6\\uD83C\\uDDE6|\\uDDF7\\uD83C[\\uDD"
"EA\\uDDF4\\uDDF8\\uDDFA\\uDDFC]|\\uDDF8\\uD83C[\\uDDE6-\\uDDEA\\uDDEC"
"-\\uDDF4\\uDDF7-\\uDDF9\\uDDFB\\uDDFD-\\uDDFF]|\\uDDF9\\uD83C[\\uDDE6"
"\\uDDE8\\uDDE9\\uDDEB-\\uDDED\\uDDEF-\\uDDF4\\uDDF7\\uDDF9\\uDDFB\\uDD"
"FC\\uDDFF]|\\uDDFA\\uD83C[\\uDDE6\\uDDEC\\uDDF2\\uDDF3\\uDDF8\\uDDFE\\"
"uDDFF]|\\uDDFB\\uD83C[\\uDDE6\\uDDE8\\uDDEA\\uDDEC\\uDDEE\\uDDF3\\uDD"
"FA]|\\uDDFC\\uD83C[\\uDDEB\\uDDF8]|\\uDDFD\\uD83C\\uDDF0|\\uDDFE\\uD8"
"3C[\\uDDEA\\uDDF9]|\\uDDFF\\uD83C[\\uDDE6\\uDDF2\\uDDFC]|[\\uDE01\\uD"
"E02\\uDE1A\\uDE2F\\uDE32-\\uDE3A\\uDE50\\uDE51\\uDF00-\\uDF21\\uDF24-"
"\\uDF84]|\\uDF85(?:\\uD83C[\\uDFFB-\\uDFFF])?|[\\uDF86-\\uDF93\\uDF9"
"6\\uDF97\\uDF99-\\uDF9B\\uDF9E-\\uDFC1]|\\uDFC2(?:\\uD83C[\\uDFFB-\\u"
"DFFF])?|[\\uDFC3\\uDFC4](?:\\u200D[\\u2640\\u2642]\\uFE0F|\\uD83C[\\"
"uDFFB-\\uDFFF](?:\\u200D[\\u2640\\u2642]\\uFE0F)?)?|[\\uDFC5\\uDFC6"
"]|\\uDFC7(?:\\uD83C[\\uDFFB-\\uDFFF])?|[\\uDFC8\\uDFC9]|\\uDFCA(?:\\"
"u200D[\\u2640\\u2642]\\uFE0F|\\uD83C[\\uDFFB-\\uDFFF](?:\\u200D[\\u2"
"640\\u2642]\\uFE0F)?)?|[\\uDFCB\\uDFCC](?:\\uD83C[\\uDFFB-\\uDFFF]("
"?:\\u200D[\\u2640\\u2642]\\uFE0F)?|\\uFE0F\\u200D[\\u2640\\u2642]\\uF"
"E0F)?|[\\uDFCD-\\uDFF0]|\\uDFF3(?:\\uFE0F\\u200D\\uD83C\\uDF08)?|\\u"
"DFF4(?:\\u200D\\u2620\\uFE0F|\\uDB40\\uDC67\\uDB40\\uDC62\\uDB40(?:\\"
"uDC65\\uDB40\\uDC6E\\uDB40\\uDC67|\\uDC73\\uDB40\\uDC63\\uDB40\\uDC74"
"|\\uDC77\\uDB40\\uDC6C\\uDB40\\uDC73)\\uDB40\\uDC7F)?|[\\uDFF5\\uDFF7"
"-\\uDFFF])|\\uD83D(?:[\\uDC00-\\uDC40]|\\uDC41(?:\\uFE0F\\u200D\\uD8"
"3D\\uDDE8\\uFE0F)?|[\\uDC42\\uDC43](?:\\uD83C[\\uDFFB-\\uDFFF])?|[\\"
"uDC44\\uDC45]|[\\uDC46-\\uDC50](?:\\uD83C[\\uDFFB-\\uDFFF])?|[\\uDC"
"51-\\uDC65]|[\\uDC66\\uDC67](?:\\uD83C[\\uDFFB-\\uDFFF])?|\\uDC68(?"
":\\u200D(?:[\\u2695\\u2696\\u2708]\\uFE0F|\\u2764\\uFE0F\\u200D\\uD83"
"D(?:\\uDC8B\\u200D\\uD83D)?\\uDC68|\\uD83C[\\uDF3E\\uDF73\\uDF93\\uDF"
"A4\\uDFA8\\uDFEB\\uDFED]|\\uD83D(?:\\uDC66(?:\\u200D\\uD83D\\uDC66)?"
"|\\uDC67(?:\\u200D\\uD83D[\\uDC66\\uDC67])?|[\\uDC68\\uDC69]\\u200D\\"
"uD83D(?:\\uDC66(?:\\u200D\\uD83D\\uDC66)?|\\uDC67(?:\\u200D\\uD83D["
"\\uDC66\\uDC67])?)|[\\uDCBB\\uDCBC\\uDD27\\uDD2C\\uDE80\\uDE92])|\\uD"
"83E[\\uDDB0-\\uDDB3])|\\uD83C[\\uDFFB-\\uDFFF](?:\\u200D(?:[\\u2695"
"\\u2696\\u2708]\\uFE0F|\\uD83C[\\uDF3E\\uDF73\\uDF93\\uDFA4\\uDFA8\\uD"
"FEB\\uDFED]|\\uD83D[\\uDCBB\\uDCBC\\uDD27\\uDD2C\\uDE80\\uDE92]|\\uD8"
"3E[\\uDDB0-\\uDDB3]))?)?|\\uDC69(?:\\u200D(?:[\\u2695\\u2696\\u2708"
"]\\uFE0F|\\u2764\\uFE0F\\u200D\\uD83D(?:\\uDC8B\\u200D\\uD83D)?[\\uDC"
"68\\uDC69]|\\uD83C[\\uDF3E\\uDF73\\uDF93\\uDFA4\\uDFA8\\uDFEB\\uDFED]"
"|\\uD83D(?:\\uDC66(?:\\u200D\\uD83D\\uDC66)?|\\uDC67(?:\\u200D\\uD83"
"D[\\uDC66\\uDC67])?|\\uDC69\\u200D\\uD83D(?:\\uDC66(?:\\u200D\\uD83D"
"\\uDC66)?|\\uDC67(?:\\u200D\\uD83D[\\uDC66\\uDC67])?)|[\\uDCBB\\uDCB"
"C\\uDD27\\uDD2C\\uDE80\\uDE92])|\\uD83E[\\uDDB0-\\uDDB3])|\\uD83C[\\u"
"DFFB-\\uDFFF](?:\\u200D(?:[\\u2695\\u2696\\u2708]\\uFE0F|\\uD83C[\\u"
"DF3E\\uDF73\\uDF93\\uDFA4\\uDFA8\\uDFEB\\uDFED]|\\uD83D[\\uDCBB\\uDCB"
"C\\uDD27\\uDD2C\\uDE80\\uDE92]|\\uD83E[\\uDDB0-\\uDDB3]))?)?|[\\uDC6"
"A-\\uDC6D]|\\uDC6E(?:\\u200D[\\u2640\\u2642]\\uFE0F|\\uD83C[\\uDFFB-"
"\\uDFFF](?:\\u200D[\\u2640\\u2642]\\uFE0F)?)?|\\uDC6F(?:\\u200D[\\u2"
"640\\u2642]\\uFE0F)?|\\uDC70(?:\\uD83C[\\uDFFB-\\uDFFF])?|\\uDC71(?"
":\\u200D[\\u2640\\u2642]\\uFE0F|\\uD83C[\\uDFFB-\\uDFFF](?:\\u200D[\\"
"u2640\\u2642]\\uFE0F)?)?|\\uDC72(?:\\uD83C[\\uDFFB-\\uDFFF])?|\\uDC"
"73(?:\\u200D[\\u2640\\u2642]\\uFE0F|\\uD83C[\\uDFFB-\\uDFFF](?:\\u20"
"0D[\\u2640\\u2642]\\uFE0F)?)?|[\\uDC74-\\uDC76](?:\\uD83C[\\uDFFB-\\"
"uDFFF])?|\\uDC77(?:\\u200D[\\u2640\\u2642]\\uFE0F|\\uD83C[\\uDFFB-\\"
"uDFFF](?:\\u200D[\\u2640\\u2642]\\uFE0F)?)?|\\uDC78(?:\\uD83C[\\uDF"
"FB-\\uDFFF])?|[\\uDC79-\\uDC7B]|\\uDC7C(?:\\uD83C[\\uDFFB-\\uDFFF])"
"?|[\\uDC7D-\\uDC80]|[\\uDC81\\uDC82](?:\\u200D[\\u2640\\u2642]\\uFE0"
"F|\\uD83C[\\uDFFB-\\uDFFF](?:\\u200D[\\u2640\\u2642]\\uFE0F)?)?|\\uD"
"C83(?:\\uD83C[\\uDFFB-\\uDFFF])?|\\uDC84|\\uDC85(?:\\uD83C[\\uDFFB-"
"\\uDFFF])?|[\\uDC86\\uDC87](?:\\u200D[\\u2640\\u2642]\\uFE0F|\\uD83C"
"[\\uDFFB-\\uDFFF](?:\\u200D[\\u2640\\u2642]\\uFE0F)?)?|[\\uDC88-\\uD"
"CA9]|\\uDCAA(?:\\uD83C[\\uDFFB-\\uDFFF])?|[\\uDCAB-\\uDCFD\\uDCFF-\\"
"uDD3D\\uDD49-\\uDD4E\\uDD50-\\uDD67\\uDD6F\\uDD70\\uDD73]|\\uDD74(?:"
"\\uD83C[\\uDFFB-\\uDFFF])?|\\uDD75(?:\\uD83C[\\uDFFB-\\uDFFF](?:\\u2"
"00D[\\u2640\\u2642]\\uFE0F)?|\\uFE0F\\u200D[\\u2640\\u2642]\\uFE0F)?"
"|[\\uDD76-\\uDD79]|\\uDD7A(?:\\uD83C[\\uDFFB-\\uDFFF])?|[\\uDD87\\uD"
"D8A-\\uDD8D]|[\\uDD90\\uDD95\\uDD96](?:\\uD83C[\\uDFFB-\\uDFFF])?|["
"\\uDDA4\\uDDA5\\uDDA8\\uDDB1\\uDDB2\\uDDBC\\uDDC2-\\uDDC4\\uDDD1-\\uDD"
"D3\\uDDDC-\\uDDDE\\uDDE1\\uDDE3\\uDDE8\\uDDEF\\uDDF3\\uDDFA-\\uDE44]|"
"[\\uDE45-\\uDE47](?:\\u200D[\\u2640\\u2642]\\uFE0F|\\uD83C[\\uDFFB-\\"
"uDFFF](?:\\u200D[\\u2640\\u2642]\\uFE0F)?)?|[\\uDE48-\\uDE4A]|\\uDE"
"4B(?:\\u200D[\\u2640\\u2642]\\uFE0F|\\uD83C[\\uDFFB-\\uDFFF](?:\\u20"
"0D[\\u2640\\u2642]\\uFE0F)?)?|\\uDE4C(?:\\uD83C[\\uDFFB-\\uDFFF])?|"
"[\\uDE4D\\uDE4E](?:\\u200D[\\u2640\\u2642]\\uFE0F|\\uD83C[\\uDFFB-\\u"
"DFFF](?:\\u200D[\\u2640\\u2642]\\uFE0F)?)?|\\uDE4F(?:\\uD83C[\\uDFF"
"B-\\uDFFF])?|[\\uDE80-\\uDEA2]|\\uDEA3(?:\\u200D[\\u2640\\u2642]\\uF"
"E0F|\\uD83C[\\uDFFB-\\uDFFF](?:\\u200D[\\u2640\\u2642]\\uFE0F)?)?|["
"\\uDEA4-\\uDEB3]|[\\uDEB4-\\uDEB6](?:\\u200D[\\u2640\\u2642]\\uFE0F|"
"\\uD83C[\\uDFFB-\\uDFFF](?:\\u200D[\\u2640\\u2642]\\uFE0F)?)?|[\\uDE"
"B7-\\uDEBF]|\\uDEC0(?:\\uD83C[\\uDFFB-\\uDFFF])?|[\\uDEC1-\\uDEC5\\u"
"DECB]|\\uDECC(?:\\uD83C[\\uDFFB-\\uDFFF])?|[\\uDECD-\\uDED2\\uDEE0-"
"\\uDEE5\\uDEE9\\uDEEB\\uDEEC\\uDEF0\\uDEF3-\\uDEF9])|\\uD83E(?:[\\uDD"
"10-\\uDD17]|[\\uDD18-\\uDD1C](?:\\uD83C[\\uDFFB-\\uDFFF])?|\\uDD1D|"
"[\\uDD1E\\uDD1F](?:\\uD83C[\\uDFFB-\\uDFFF])?|[\\uDD20-\\uDD25]|\\uD"
"D26(?:\\u200D[\\u2640\\u2642]\\uFE0F|\\uD83C[\\uDFFB-\\uDFFF](?:\\u2"
"00D[\\u2640\\u2642]\\uFE0F)?)?|[\\uDD27-\\uDD2F]|[\\uDD30-\\uDD36]("
"?:\\uD83C[\\uDFFB-\\uDFFF])?|\\uDD37(?:\\u200D[\\u2640\\u2642]\\uFE0"
"F|\\uD83C[\\uDFFB-\\uDFFF](?:\\u200D[\\u2640\\u2642]\\uFE0F)?)?|[\\u"
"DD38\\uDD39](?:\\u200D[\\u2640\\u2642]\\uFE0F|\\uD83C[\\uDFFB-\\uDFF"
"F](?:\\u200D[\\u2640\\u2642]\\uFE0F)?)?|\\uDD3A|\\uDD3C(?:\\u200D[\\"
"u2640\\u2642]\\uFE0F)?|[\\uDD3D\\uDD3E](?:\\u200D[\\u2640\\u2642]\\u"
"FE0F|\\uD83C[\\uDFFB-\\uDFFF](?:\\u200D[\\u2640\\u2642]\\uFE0F)?)?|"
"[\\uDD40-\\uDD45\\uDD47-\\uDD70\\uDD73-\\uDD76\\uDD7A\\uDD7C-\\uDDA2\\"
"uDDB0-\\uDDB4]|[\\uDDB5\\uDDB6](?:\\uD83C[\\uDFFB-\\uDFFF])?|\\uDDB"
"7|[\\uDDB8\\uDDB9](?:\\u200D[\\u2640\\u2642]\\uFE0F|\\uD83C[\\uDFFB-"
"\\uDFFF](?:\\u200D[\\u2640\\u2642]\\uFE0F)?)?|[\\uDDC0-\\uDDC2\\uDDD"
"0]|[\\uDDD1-\\uDDD5](?:\\uD83C[\\uDFFB-\\uDFFF])?|\\uDDD6(?:\\u200D"
"[\\u2640\\u2642]\\uFE0F|\\uD83C[\\uDFFB-\\uDFFF](?:\\u200D[\\u2640\\u"
"2642]\\uFE0F)?)?|[\\uDDD7-\\uDDDD](?:\\u200D[\\u2640\\u2642]\\uFE0F"
"|\\uD83C[\\uDFFB-\\uDFFF](?:\\u200D[\\u2640\\u2642]\\uFE0F)?)?|[\\uD"
"DDE\\uDDDF](?:\\u200D[\\u2640\\u2642]\\uFE0F)?|[\\uDDE0-\\uDDFF])"
答案 14 :(得分:0)
正则表达式太慢,表情符号更新非常快。
尝试这个项目simple-emoji-4j
与Emoji 12.0(2018.10.15)兼容
简单于:
EmojiUtils.containsEmoji(str)
答案 15 :(得分:0)
<script src="https://cdnjs.cloudflare.com/ajax/libs/jquery/3.3.1/jquery.min.js"></script>
<link rel="stylesheet" href="https://stackpath.bootstrapcdn.com/bootstrap/4.5.0/css/bootstrap.min.css">
<script src="https://cdn.jsdelivr.net/npm/popper.js@1.16.0/dist/umd/popper.min.js"></script>
<script src="https://stackpath.bootstrapcdn.com/bootstrap/4.5.0/js/bootstrap.min.js"></script>
非常适合将表情符号与PCRE正则表达式匹配。在https://regex101.com/r/o69vJJ/1上进行测试。
答案 16 :(得分:0)
某些 PCE 不承认 \p
。许多不允许超过 2 个字节 \udde6-\ud83c
的字符范围。
我想出的一个有效技巧是对它们进行编码,以便强制对字符进行转义,例如 json。
编码为 json 后,字符现在是文字 \ud000
,可以使用标准正则表达式找到它:\\\\ud[0-9a-f]{3}
, \\\\u[0-9a-f]{4,6}
过滤转义的字符串后,数据可以在没有表情符号的情况下再次解码。