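I want an option to convert a string to a wide string with two different behaviors: either ignore illegal characters, or abort the conversion when an illegal character is encountered.
On Windows XP, I could do it like this: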
bool ignore_illegal; // input
DWORD flags = ignore_illegal ? 0 : MB_ERR_INVALID_CHARS;
SetLastError(0);
int res = MultiByteToWideChar(CP_UTF8,flags,"test\xFF\xFF test",-1,buf,sizeof(buf)/sizeof(buf[0])); // buffer size is given in WCHARs, not bytes (assumes buf is a WCHAR array)
int err = GetLastError();
std::cout << "result = " << res << " get last error = " << err;
Now, on XP, if ignore_illegal is true, I get:
result = 10 get last error = 0
If ignore_illegal is false, I get:
result = 0 get last error = 1113 // invalid code
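So, given a large enough buffer, it is enough to check result != 0.
According to the documentation at http://msdn.microsoft.com/en-us/library/dd319072(VS.85).aspx the API has changed, so how does this behave on Vista?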
Answer 0 (score: 3)
I think what it does is replace illegal code units with the replacement character (U+FFFD), as mandated by the Unicode standard. The following code
#define STRICT
#define UNICODE
#define NOMINMAX
#define WIN32_LEAN_AND_MEAN
#include <windows.h>
#include <cstdlib>
#include <iostream>
#include <iomanip>
void test(bool ignore_illegal) {
const DWORD flags = ignore_illegal ? 0 : MB_ERR_INVALID_CHARS;
WCHAR buf[0x100];
SetLastError(0);
const int res = MultiByteToWideChar(CP_UTF8, flags, "test\xFF\xFF test", -1, buf, sizeof buf / sizeof buf[0]); // size in WCHARs, not bytes
const DWORD err = GetLastError();
std::cout << "ignore_illegal = " << std::boolalpha << ignore_illegal
<< ", result = " << std::dec << res
<< ", last error = " << err
<< ", fifth code unit = " << std::hex << static_cast<unsigned int>(buf[4]) // index 4 is the fifth code unit
<< std::endl;
}
int main() {
test(false);
test(true);
std::system("pause");
}
produces the following output on my Windows 7 system:
ignore_illegal = false, result = 0, last error = 1113, fifth code unit = fffd
ignore_illegal = true, result = 12, last error = 0, fifth code unit = fffd
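So the error code stays the same, but the length differs by 2, which accounts for the two replacement code points that were inserted. If you run my code on XP, the fifth code point should be U+0020 (the space character), provided the two illegal code units are dropped.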
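If the goal is to get the same two behaviors regardless of Windows version, one possible approach is to clean the UTF-8 input yourself before handing it to MultiByteToWideChar. The following is only a minimal sketch under that assumption: the names IllegalPolicy, valid_sequence_length and sanitize_utf8 are made up for illustration and are not part of any Windows API. It emits one U+FFFD per offending byte, which matches the output above for "\xFF\xFF", but I have not checked that Windows uses the same granularity for other malformed sequences.

```cpp
#include <cstddef>
#include <string>

// Hypothetical policy names, not part of any Windows API.
enum class IllegalPolicy { Fail, Drop, Replace };

// Returns the length (1-4) of the well-formed UTF-8 sequence starting at s[i],
// or 0 if the bytes at that position are not well-formed UTF-8.
// Rejects overlong forms, surrogates and values above U+10FFFF.
static std::size_t valid_sequence_length(const std::string& s, std::size_t i) {
    const unsigned char b0 = static_cast<unsigned char>(s[i]);
    if (b0 < 0x80) return 1;                       // ASCII
    std::size_t len = 0;
    unsigned char lo = 0x80, hi = 0xBF;            // allowed range of the second byte
    if (b0 >= 0xC2 && b0 <= 0xDF) {
        len = 2;
    } else if (b0 >= 0xE0 && b0 <= 0xEF) {
        len = 3;
        if (b0 == 0xE0) lo = 0xA0;                 // no overlong 3-byte forms
        if (b0 == 0xED) hi = 0x9F;                 // no UTF-16 surrogates
    } else if (b0 >= 0xF0 && b0 <= 0xF4) {
        len = 4;
        if (b0 == 0xF0) lo = 0x90;                 // no overlong 4-byte forms
        if (b0 == 0xF4) hi = 0x8F;                 // nothing above U+10FFFF
    } else {
        return 0;                                  // stray continuation byte, 0xC0, 0xC1, 0xF5..0xFF
    }
    if (i + len > s.size()) return 0;              // truncated sequence
    const unsigned char b1 = static_cast<unsigned char>(s[i + 1]);
    if (b1 < lo || b1 > hi) return 0;
    for (std::size_t k = 2; k < len; ++k) {
        const unsigned char b = static_cast<unsigned char>(s[i + k]);
        if (b < 0x80 || b > 0xBF) return 0;
    }
    return len;
}

// Copies `in` to `out`, handling ill-formed bytes according to `policy`.
// Returns false (and clears `out`) only when policy is Fail and the input is ill-formed.
bool sanitize_utf8(const std::string& in, IllegalPolicy policy, std::string& out) {
    out.clear();
    std::size_t i = 0;
    while (i < in.size()) {
        const std::size_t len = valid_sequence_length(in, i);
        if (len != 0) {
            out.append(in, i, len);
            i += len;
            continue;
        }
        if (policy == IllegalPolicy::Fail) {
            out.clear();
            return false;
        }
        if (policy == IllegalPolicy::Replace) {
            out += "\xEF\xBF\xBD";                 // U+FFFD encoded as UTF-8
        }
        ++i;                                       // skip exactly one offending byte
    }
    return true;
}
```

With IllegalPolicy::Drop the test string "test\xFF\xFF test" becomes "test test", which corresponds to the XP result reported in the question. The sanitized string should then convert without surprises when passed to MultiByteToWideChar with MB_ERR_INVALID_CHARS.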
Answer 1 (score: 0)
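WCHAR *pstrRet = NULL;
// First call: pass a NULL buffer to query the required length in WCHARs (including the terminator).
int nLen = MultiByteToWideChar(CP_UTF8, 0, pstrTemp2, -1, NULL, 0);
if (nLen > 0) // 0 means the call failed; see GetLastError()
{
    pstrRet = new WCHAR[nLen];
    // Second call: perform the actual conversion into the allocated buffer.
    int nConv = MultiByteToWideChar(CP_UTF8, 0, pstrTemp2, -1, pstrRet, nLen);
    if (nConv == nLen)
    {
        // Success! pstrRet should be the wide char equivalent of pstrTemp2
    }
}
delete[] pstrRet; // deleting a null pointer is a no-op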
I think this is the way to use it on Vista; I found it on some forum :)