Splitting a string that contains emoji

Time: 2014-11-22 22:13:05

Tags: c++ qt qstring emoji

I need to split a QString into an array of its individual characters. The string can contain emoji. QString:
After string.split(""):

""
"?"
"?"
" "
"?"
"?"
" "
"?"
"?"
" "
" "
"?"
"?"
" "
"?"
"?"
" "
"?"
"?"
""

As far as I know, an emoji can take up more than one byte, but how do I split my string then? Thanks.

1 answer:

Answer 0: (score: 1)

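As far as I know, QString cannot hold a Unicode character above U+FFFF in a single QChar. It stores UTF-16 code units, so each such character occupies two QChars (a surrogate pair). In this test program you can see that size() counts code units rather than characters for them: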

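#include <QByteArray>
#include <QDebug>
#include <QList>
#include <QString>

int main()
{
    // UTF-8 encoded test characters; the last three lie above U+FFFF.
    QList<QByteArray> list;
    list << QByteArray("a");
    list << QByteArray("ö");
    list << QByteArray("➜");
    list << QByteArray("☀");
    list << QByteArray("⚡");
    list << QByteArray("😀");
    list << QByteArray("🛆");
    list << QByteArray("🐵");

    foreach (const QByteArray &binary, list)
    {
        QString str = QString::fromUtf8(binary);
        qDebug() << str;
        qDebug() << "Bytes:" << binary.size();
        // size() counts UTF-16 code units (QChars), not characters.
        qDebug() << "String size:" << str.size();
        {
            // Scoped so the line is flushed when debugLine is destroyed.
            QDebug debugLine = (qDebug() << "Unicode code point:");
            for (int i = 0; i < str.size(); ++i)
            {
                // unicode() returns one UTF-16 code unit, so a character
                // above U+FFFF shows up as two surrogate values.
                debugLine << str[i].unicode();
            }
        }
        qDebug() << "";
    }

    return 0;
}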

Output:

"a"
Bytes: 1
String size: 1
Unicode code point: 97

"ö"
Bytes: 2
String size: 1
Unicode code point: 246

"➜"
Bytes: 3
String size: 1
Unicode code point: 10140

"☀"
Bytes: 3
String size: 1
Unicode code point: 9728

"⚡"
Bytes: 3
String size: 1
Unicode code point: 9889

"😀"
Bytes: 4
String size: 2
Unicode code point: 55357 56832

"🛆"
Bytes: 4
String size: 2
Unicode code point: 55357 57030

"🐵"
Bytes: 4
String size: 2
Unicode code point: 55357 56373
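
The two values printed for each 4-byte character are the halves of a UTF-16 surrogate pair. For example, 55357 (0xD83D) and 56832 (0xDE00) combine to 0x10000 + (0x3D << 10) + 0x200 = U+1F600. One possible way to split such a string is therefore to walk the code units and keep each pair together. The following is only a minimal sketch of that idea: it uses std::u16string as a stand-in for QString so that it compiles without Qt, and the helper names (isHighSurrogate, isLowSurrogate, combineSurrogates, splitByCodePoint) are hypothetical, not Qt API.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// True for the first half of a surrogate pair (U+D800..U+DBFF).
constexpr bool isHighSurrogate(char16_t u) { return u >= 0xD800 && u <= 0xDBFF; }

// True for the second half of a surrogate pair (U+DC00..U+DFFF).
constexpr bool isLowSurrogate(char16_t u) { return u >= 0xDC00 && u <= 0xDFFF; }

// Combine a valid surrogate pair into the code point it encodes.
inline char32_t combineSurrogates(char16_t high, char16_t low)
{
    assert(isHighSurrogate(high) && isLowSurrogate(low));
    return 0x10000 + ((char32_t(high) - 0xD800) << 10) + (char32_t(low) - 0xDC00);
}

// Split UTF-16 text into one substring per code point.
// A high surrogate followed by a low surrogate stays together;
// an unpaired surrogate is emitted on its own.
inline std::vector<std::u16string> splitByCodePoint(const std::u16string &text)
{
    std::vector<std::u16string> parts;
    for (std::size_t i = 0; i < text.size(); ++i) {
        std::size_t len = 1;
        if (isHighSurrogate(text[i]) && i + 1 < text.size()
            && isLowSurrogate(text[i + 1])) {
            len = 2;
        }
        parts.push_back(text.substr(i, len));
        i += len - 1; // skip the low surrogate that was just consumed
    }
    return parts;
}
```

With Qt itself, QChar::isHighSurrogate() and QChar::isLowSurrogate() can play the role of the two predicates, and QString::toUcs4() returns the code points directly. Note that this splits by code point only; an emoji built from several code points (for example with a skin-tone modifier) would still be broken apart, and QTextBoundaryFinder with grapheme boundaries may be a better fit in that case.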