Question

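I'm trying to work out the safest way to retrieve Unicode data from remote computers in a uniform manner, and to make sure the data stays consistent and readable.

Computer A: Chinese user, mixed-English Windows 7; some registry values contain Chinese characters such as L"您好"

Computer B: US English; none of my functions return Unicode values

Computer C: deploys the agent to computers A and B.

Agent: assesses the health and security of the computer from the inside. One Unicode-aware part simply fetches registry values, i.e.: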

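// Needs <windows.h>, <string> and <vector>
int Utilities::GetRegistryStringValue(HKEY h_sub_key, WCHAR* value_name, wstring &result)
{
    DWORD cbData = 0;
    DWORD type = 0;

    // First call: get the size (in bytes) and the type of the value.
    // The W variant is called explicitly so the code does not depend on UNICODE being defined.
    LONG err = RegQueryValueExW(h_sub_key, value_name, NULL, &type, NULL, &cbData);

    if (err != ERROR_SUCCESS)
    {
        if (err != ERROR_FILE_NOT_FOUND)
            debug->DebugMessage(Error::GetErrorMessageW(err));
        return err;
    }

    // Only string types can be returned as text
    if (type != REG_SZ && type != REG_EXPAND_SZ)
        return ERROR_INVALID_DATATYPE;

    // One extra WCHAR guarantees termination even if the stored data has no trailing null.
    // The vector releases its memory on every return path, so nothing leaks.
    std::vector<WCHAR> res(cbData / sizeof(WCHAR) + 1, L'\0');

    // Second call: read the data itself
    err = RegQueryValueExW(h_sub_key, value_name, NULL, NULL, (LPBYTE) &res[0], &cbData);

    if (err != ERROR_SUCCESS)
    {
        if (err != ERROR_FILE_NOT_FOUND)
            debug->DebugMessage(Error::GetErrorMessageW(err));
        return err;
    }

    res[cbData / sizeof(WCHAR)] = L'\0';

    // Copies up to the first null, which drops any stored terminator
    result = &res[0];

    return ERROR_SUCCESS;
}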

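These values will be stored in an XML file. Should that XML file be UTF-16 or UTF-8? Do I need to translate through the remote system's code page? What other problems might I run into?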

Answer 1

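UTF-8 is more standard (for networking) because it has no endianness issues. With UTF-16 you need to specify a byte order for transmission. If you are using a Unicode format, you don't need code pages.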
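To make the byte-order point concrete, here is a small portable sketch (not part of the agent code; the function names utf16_to_bytes and utf16_to_utf8 are made up for illustration). It serializes the same UTF-16 text in both byte orders and also encodes it as UTF-8. Unpaired surrogates are replaced with U+FFFD, which is just one possible policy.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>

// Serialize UTF-16 code units to raw bytes in an explicitly chosen byte order.
std::string utf16_to_bytes(const std::u16string& s, bool big_endian) {
    std::string out;
    for (char16_t u : s) {
        char hi = static_cast<char>((u >> 8) & 0xFF);
        char lo = static_cast<char>(u & 0xFF);
        if (big_endian) { out += hi; out += lo; }
        else            { out += lo; out += hi; }
    }
    return out;
}

// Encode UTF-16 as UTF-8. The result is a plain byte sequence,
// so there is no byte order to agree on.
std::string utf16_to_utf8(const std::u16string& s) {
    std::string out;
    for (std::size_t i = 0; i < s.size(); ++i) {
        std::uint32_t cp = s[i];
        if (cp >= 0xD800 && cp <= 0xDBFF && i + 1 < s.size() &&
            s[i + 1] >= 0xDC00 && s[i + 1] <= 0xDFFF) {
            // Combine a surrogate pair into one code point.
            cp = 0x10000 + ((cp - 0xD800) << 10) + (s[i + 1] - 0xDC00);
            ++i;
        } else if (cp >= 0xD800 && cp <= 0xDFFF) {
            cp = 0xFFFD; // unpaired surrogate: substitute the replacement character
        }
        assert(cp <= 0x10FFFF);
        if (cp < 0x80) {
            out += static_cast<char>(cp);
        } else if (cp < 0x800) {
            out += static_cast<char>(0xC0 | (cp >> 6));
            out += static_cast<char>(0x80 | (cp & 0x3F));
        } else if (cp < 0x10000) {
            out += static_cast<char>(0xE0 | (cp >> 12));
            out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
            out += static_cast<char>(0x80 | (cp & 0x3F));
        } else {
            out += static_cast<char>(0xF0 | (cp >> 18));
            out += static_cast<char>(0x80 | ((cp >> 12) & 0x3F));
            out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
            out += static_cast<char>(0x80 | (cp & 0x3F));
        }
    }
    return out;
}
```

For "您" (U+60A8), utf16_to_bytes gives A8 60 in little-endian order and 60 A8 in big-endian order, while utf16_to_utf8 always gives E6 82 A8. A receiver of UTF-16 has to know which order was used; a receiver of UTF-8 does not.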

If they are on Windows machines, you can translate with standard Windows calls such as WideCharToMultiByte.

std::wstring buffer_with_utf16;
std::string buffer_with_utf8;
// WideCharToMultiByte fails on zero-length input, so an empty string skips the call
if (!buffer_with_utf16.empty())
{
    // First call: ask how many UTF-8 bytes are needed.
    // For CP_UTF8 the last two parameters (default char, used-default flag) must be NULL.
    int alength = WideCharToMultiByte(CP_UTF8, 0,
                  buffer_with_utf16.c_str(), (int)buffer_with_utf16.size(),
                  NULL, 0,
                  NULL, NULL);
    if (alength == 0)
        throw std::logic_error("Bad UTF8 conversion"); //use GetLastError
    buffer_with_utf8.resize(alength);
    // Second call: do the conversion into the sized buffer
    int written = WideCharToMultiByte(CP_UTF8, 0,
                  buffer_with_utf16.c_str(), (int)buffer_with_utf16.size(),
                  &buffer_with_utf8[0], alength,
                  NULL, NULL);
    if (written == 0)
        throw std::logic_error("Bad UTF8 conversion"); //use GetLastError
}
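Once the value is UTF-8, it still has to be escaped before it goes into the XML file. The following is a rough sketch of that step, assuming the file is written by hand rather than through an XML library; xml_escape, make_registry_xml and the registryValue element are hypothetical names chosen for illustration.

```cpp
#include <string>

// Escape the five XML special characters in UTF-8 text.
// Every byte of a multi-byte UTF-8 sequence is >= 0x80, so it can never
// be mistaken for one of these ASCII characters and passes through unchanged.
std::string xml_escape(const std::string& utf8) {
    std::string out;
    out.reserve(utf8.size());
    for (char c : utf8) {
        switch (c) {
            case '&':  out += "&amp;";  break;
            case '<':  out += "&lt;";   break;
            case '>':  out += "&gt;";   break;
            case '"':  out += "&quot;"; break;
            case '\'': out += "&apos;"; break;
            default:   out += c;        break;
        }
    }
    return out;
}

// Build a tiny document. The declaration states the encoding explicitly;
// the element and attribute names here are made up.
std::string make_registry_xml(const std::string& name_utf8,
                              const std::string& value_utf8) {
    return "<?xml version=\"1.0\" encoding=\"UTF-8\"?>\n"
           "<registryValue name=\"" + xml_escape(name_utf8) + "\">"
           + xml_escape(value_utf8) + "</registryValue>\n";
}
```

This sketch does not filter control characters. XML 1.0 forbids most characters below U+0020 (only tab, line feed and carriage return are allowed), so a registry value containing something like U+0001 would need to be dropped or encoded some other way before it is written.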
