Question

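I have a UTF-16 encoded string that I want to convert to a float.

For example:
If I have a UTF-16 string like u"1342.223", it should return 1342.223 as a floating-point value. If it were UTF-8, I would convert it with the stod function, but how do I make this work on a UTF-16 encoded string?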

Answer 1

There is no standard function for this. If you are on a system where std::wstring happens to use 16-bit wide characters, you can use:

#include <sstream>   // std::wistringstream

double d;
std::wistringstream(L"1342.223") >> d;

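Otherwise, you can exploit the fact that digits convert trivially from UTF-16 to ASCII/UTF-8 and write a quick conversion function. It is not pretty, but it should be reasonably efficient: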

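#include <cstdlib>   // std::strtod
#include <string>

double u16stod(std::u16string const& u16s)
{
    // Size the buffer from the input: a fixed-size array of
    // max_digits10 + 1 chars would overflow on longer strings.
    std::string buf;
    buf.reserve(u16s.size());

    // Narrow each code unit; only meaningful for ASCII characters.
    for (char16_t c : u16s)
        buf.push_back(static_cast<char>(c));

    // some error checking here?
    return std::strtod(buf.c_str(), nullptr);
}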

Answer 2

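First, converting a UTF-16 numeric string to a narrow string is trivial. Even if you cannot be sure that the narrow character set is 7-bit ASCII, C guarantees that the codes for '0' through '9' are contiguous, and the same holds for Unicode (0x30 to 0x39). So the code can be quite simple (it depends only on <string>, plus <cstdlib> for strtod):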

#include <cstdlib>   // std::strtod
#include <string>

double u16strtod(const std::u16string& u16) {
    std::string narrow;               // owns its buffer, so no delete[] is needed
    narrow.reserve(u16.size());
    for (char16_t uc : u16) {
        if (uc == u' ') narrow += ' ';        // special processing for possible . and space
        else if (uc == u'.') narrow += '.';
        // Anything else that is not a digit (including a sign or an exponent) stops the copy.
        else if ((uc < u'0') || (uc > u'9')) break;  // could use better error processing
        else {
            // Digits are contiguous in both character sets, so offset from '0'.
            narrow += static_cast<char>('0' + (uc - u'0'));
        }
    }
    return std::strtod(narrow.c_str(), nullptr);   // could use better error processing
}

If the narrow character set is ASCII, it is even simpler:

#include <cstdlib>   // std::strtod
#include <string>

double u16strtod(const std::u16string& u16) {
    std::string narrow;               // owns its buffer, so no delete[] is needed
    narrow.reserve(u16.size());
    for (char16_t uc : u16) {
        // char16_t is unsigned, so the lower bound only needs to exclude NUL.
        if ((uc == 0) || (uc >= 127)) break;  // can only contain ASCII characters
        narrow += static_cast<char>(uc);      // and the unicode code IS the ASCII code
    }
    return std::strtod(narrow.c_str(), nullptr);
}

Answer 3

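If you know your strings are well formed (for example, no whitespace), then if and only if performance is critical (that is, you are parsing millions or billions of numbers), don't dismiss the possibility of decoding the number yourself by looping over the string. Look at the standard library source code (perhaps compare libc++ and libstdc++) to see what they do and adapt it. Of course, in those cases you should also look into parallelizing your work, trying to exploit SIMD, and so on.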

Converting a u16string to float
