Question

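I want an option to convert a string to a wide string with two different behaviors:

Ignore illegal characters
Abort the conversion if an illegal character occurs

On Windows XP I can do this: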

bool ignore_illegal; // input
WCHAR buf[256];      // output buffer

DWORD flags = ignore_illegal ? 0 : MB_ERR_INVALID_CHARS;

SetLastError(0);

// The last argument is the buffer size in WCHARs, not in bytes.
int res = MultiByteToWideChar(CP_UTF8, flags, "test\xFF\xFF test", -1, buf, sizeof(buf) / sizeof(buf[0]));
DWORD err = GetLastError();

std::cout << "result = " << res << " get last error = " << err;

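Now, on XP, if ignore_illegal is true, I get:

result = 10 get last error = 0

And if ignore_illegal is false, I get:

result = 0 get last error = 1113 // invalid code

So, given a large enough buffer, it is enough to check result != 0.

According to the documentation at http://msdn.microsoft.com/en-us/library/dd319072(VS.85).aspx there are API changes, so how does this behave on Vista?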

Answer 1

I think what it does is replace illegal code units with the replacement character (U+FFFD), as mandated by the Unicode standard. The following code

#define STRICT
#define UNICODE
#define NOMINMAX
#define WIN32_LEAN_AND_MEAN

#include <windows.h>

#include <cstdlib>
#include <iostream>
#include <iomanip>


void test(bool ignore_illegal) {
    const DWORD flags = ignore_illegal ? 0 : MB_ERR_INVALID_CHARS;
    WCHAR buf[0x100];
    SetLastError(0);
    // The last argument is the buffer size in WCHARs, not in bytes.
    const int res = MultiByteToWideChar(CP_UTF8, flags, "test\xFF\xFF test", -1, buf, sizeof buf / sizeof buf[0]);
    const DWORD err = GetLastError();
    std::cout << "ignore_illegal = " << std::boolalpha << ignore_illegal
        << ", result = " << std::dec << res
        << ", last error = " << err
        << ", fifth code unit = " << std::hex << static_cast<unsigned int>(buf[5])
        << std::endl;
}


int main() {
    test(false);
    test(true);
    std::system("pause");
}

produces the following output on my Windows 7 system:

ignore_illegal = false, result = 0, last error = 1113, fifth code unit = fffd
ignore_illegal = true, result = 12, last error = 0, fifth code unit = fffd

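So the error codes stay the same, but the length differs by two, which accounts for the two replacement code points that were inserted. If you run my code on XP, the fifth code point should be U+0020 (the space character) if the two illegal code units get dropped.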
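If the XP-style result (illegal bytes simply dropped) is wanted on a system that inserts U+FFFD instead, one possible approach is to remove the replacement characters after the conversion. The helper below is only a sketch: strip_replacement_chars is a made-up name, not a Windows API function, and it assumes the input never contains a genuine U+FFFD (encoded as EF BF BD), because that one would be removed as well.

```cpp
#include <algorithm>
#include <string>

// Hypothetical helper, not part of the Windows API.
// Removes every U+FFFD code unit from an already converted wide string.
// Assumption: the original input holds no legitimate U+FFFD, since a
// legitimate one cannot be told apart from an inserted one at this point.
std::wstring strip_replacement_chars(std::wstring s) {
    const wchar_t replacement = static_cast<wchar_t>(0xFFFD);
    // Erase-remove idiom: shift the kept characters forward, then trim.
    s.erase(std::remove(s.begin(), s.end(), replacement), s.end());
    return s;
}
```

It would be applied to the buffer filled by MultiByteToWideChar in the ignore_illegal case; the strict case does not need it, since MB_ERR_INVALID_CHARS already makes the call fail.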

Answer 2

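WCHAR *pstrRet = NULL;

// With a NULL buffer and size 0 the call returns the required size in WCHARs,
// including the terminating null because the input length is -1.
int nLen = MultiByteToWideChar(CP_UTF8, 0, pstrTemp2, -1, NULL, 0);

if (nLen > 0)
{
    pstrRet = new WCHAR[nLen];

    int nConv = MultiByteToWideChar(CP_UTF8, 0, pstrTemp2, -1, pstrRet, nLen);

    if (nConv == nLen)
    {
        // Success! pstrRet should be the wide char equivalent of pstrTemp2
    }
}

delete[] pstrRet; // deleting a null pointer is a no-op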

I think this is the way to use it on Vista; I found it on some forum :)
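If the same behavior is needed on every Windows version, one alternative to relying on the flags is to decode the UTF-8 by hand. The following is a minimal sketch and does not come from either answer: utf8_to_utf16 and its parameters are made-up names, it produces UTF-16 in a std::u16string rather than a WCHAR buffer, and in ignore mode it drops one offending byte at a time, which is an assumption and may not match what XP does for every malformed sequence.

```cpp
#include <cstddef>
#include <string>

// Hypothetical function, not part of the Windows API.
// Decodes UTF-8 into UTF-16. If ignore_illegal is false, the first illegal
// byte aborts the conversion and false is returned. If it is true, illegal
// bytes are dropped one at a time and decoding continues.
bool utf8_to_utf16(const std::string& in, bool ignore_illegal, std::u16string& out) {
    out.clear();
    std::size_t i = 0;
    while (i < in.size()) {
        const unsigned char b0 = static_cast<unsigned char>(in[i]);
        std::size_t len = 0;  // sequence length in bytes, 0 = illegal lead byte
        char32_t cp = 0;      // decoded code point
        char32_t min = 0;     // smallest code point allowed for this length
        if (b0 < 0x80)                { len = 1; cp = b0;        min = 0; }
        else if ((b0 & 0xE0) == 0xC0) { len = 2; cp = b0 & 0x1F; min = 0x80; }
        else if ((b0 & 0xF0) == 0xE0) { len = 3; cp = b0 & 0x0F; min = 0x800; }
        else if ((b0 & 0xF8) == 0xF0) { len = 4; cp = b0 & 0x07; min = 0x10000; }

        // Reject illegal lead bytes and truncated sequences.
        bool ok = len != 0 && i + len <= in.size();
        // Every following byte must look like 10xxxxxx.
        for (std::size_t k = 1; ok && k < len; ++k) {
            const unsigned char b = static_cast<unsigned char>(in[i + k]);
            ok = (b & 0xC0) == 0x80;
            cp = (cp << 6) | (b & 0x3F);
        }
        // Reject overlong forms, surrogates and values above U+10FFFF.
        ok = ok && cp >= min && cp <= 0x10FFFF && !(cp >= 0xD800 && cp <= 0xDFFF);

        if (!ok) {
            if (!ignore_illegal) return false;
            ++i;  // drop the offending byte and resynchronize
            continue;
        }
        if (cp < 0x10000) {
            out.push_back(static_cast<char16_t>(cp));
        } else {
            // Encode as a surrogate pair.
            cp -= 0x10000;
            out.push_back(static_cast<char16_t>(0xD800 + (cp >> 10)));
            out.push_back(static_cast<char16_t>(0xDC00 + (cp & 0x3FF)));
        }
        i += len;
    }
    return true;
}
```

Unlike MultiByteToWideChar called with an input length of -1, the result here carries no terminating null in its length, so the sizes are not directly comparable with the numbers shown above.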

MultiByteToWideChar API changes on Vista
