Question

I have integrated Hunspell into an unmanaged C++ application on Windows 7 using Visual Studio 2010.

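I have spell checking and suggestions working for English, but now I am trying to get Spanish working and have hit a snag. Whenever I get suggestions for Spanish, the ones containing accented characters are not converted correctly to std::wstring objects.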

Here is an example of a suggestion returned by the Hunspell->suggest method:

[Image: Hunspell->suggest(...) result]

Here is the code I use to convert a std::string to a std::wstring:

std::wstring StringToWString(const std::string& str)
{
    std::wstring convertedString;
    int requiredSize = MultiByteToWideChar(CP_UTF8, 0, str.c_str(), -1, 0, 0);
    if(requiredSize > 0)
    {
        std::vector<wchar_t> buffer(requiredSize);
        MultiByteToWideChar(CP_UTF8, 0, str.c_str(), -1, &buffer[0], requiredSize);
        convertedString.assign(buffer.begin(), buffer.end() - 1);
    }

    return convertedString;
}

After running the suggestion through this, I get the following, with a funky character at the end.

[Image: After conversion to wstring]

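Can anyone help me figure out what is going wrong with the conversion here? My guess is that it has to do with the negative char returned by Hunspell, but I don't know how to turn that into something the std::wstring conversion code can handle.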

Answer 1

It looks like Hunspell's output is in a single-byte (extended ASCII) encoding, code page 852. Use 852 instead of CP_UTF8: http://msdn.microsoft.com/en-us/library/windows/desktop/dd317756(v=vs.85).aspx

Or configure Hunspell to return UTF-8.
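
As far as I know, Hunspell does not re-encode its results: suggestions come back in the encoding declared by the SET line of the dictionary's .aff file, and Hunspell::get_dic_encoding() reports that name. Rather than hard-coding a code page, the name could be mapped to a Windows code page identifier. This is only a minimal sketch: CodePageForEncoding is a hypothetical helper, not part of Hunspell, and the table covers just a few encodings.

```cpp
#include <cctype>
#include <string>

// Hypothetical helper (not part of Hunspell): maps an encoding name, such as
// the one reported by Hunspell::get_dic_encoding(), to a Windows code page
// identifier. Returns -1 for names that are not in this small table.
int CodePageForEncoding(const std::string& name)
{
    // Normalize: upper-case and drop '-' and '_' so that "ISO8859-1",
    // "iso-8859-1" and "ISO_8859_1" all compare equal.
    std::string key;
    for (std::string::size_type i = 0; i < name.size(); ++i)
    {
        unsigned char c = static_cast<unsigned char>(name[i]);
        if (c != '-' && c != '_')
            key += static_cast<char>(std::toupper(c));
    }

    if (key == "UTF8")      return 65001; // CP_UTF8
    if (key == "ISO88591")  return 28591; // ISO 8859-1 Latin 1
    if (key == "ISO88592")  return 28592; // ISO 8859-2 Central European
    if (key == "ISO885915") return 28605; // ISO 8859-15 Latin 9
    return -1;                            // unknown: caller must decide
}
```

The returned value could then be passed as the first argument of MultiByteToWideChar in place of CP_UTF8, after checking for -1.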

Answer 2

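It looks like Hunspell's output is in a single-byte (extended ASCII) encoding, code page 28591 (ISO 8859-1 Latin 1; Western European (ISO)), which I found by looking at the Hunspell default settings for the unix command line utility.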

Changing CP_UTF8 to 28591 worked for me.

// Updated code page to 28591 from CP_UTF8
std::wstring StringToWString(const std::string& str)
{
    std::wstring convertedString;
    int requiredSize = MultiByteToWideChar(28591, 0, str.c_str(), -1, 0, 0);
    if(requiredSize > 0)
    {
        std::vector<wchar_t> buffer(requiredSize);
        MultiByteToWideChar(28591, 0, str.c_str(), -1, &buffer[0], requiredSize);
        convertedString.assign(buffer.begin(), buffer.end() - 1);
    }

    return convertedString;
}

Here is the list of code pages from MSDN, which helped me find the correct code page integer.
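
Why the CP_UTF8 version fails: in ISO 8859-1 an accented letter is a single byte at or above 0x80 (ó is 0xF3). That byte is negative when char is signed, and it is not a valid UTF-8 sequence on its own, so decoding it as UTF-8 cannot produce the right character. Because the first 256 Unicode code points coincide with ISO 8859-1, the conversion for this one encoding can also be sketched without the Windows API. Latin1ToWString is a hypothetical name, and the sketch assumes the input really is ISO 8859-1.

```cpp
#include <string>

// Hypothetical portable helper: converts ISO 8859-1 bytes to a wide string.
// Each byte is widened through unsigned char so that values >= 0x80 map to
// U+0080..U+00FF instead of being sign-extended to negative numbers.
std::wstring Latin1ToWString(const std::string& str)
{
    std::wstring converted;
    converted.reserve(str.size());
    for (std::string::size_type i = 0; i < str.size(); ++i)
    {
        converted += static_cast<wchar_t>(static_cast<unsigned char>(str[i]));
    }
    return converted;
}
```

For the bytes "canci\xF3" "n" this yields seven wide characters, with U+00F3 (ó) at index 5. For any other dictionary encoding the MultiByteToWideChar approach with the matching code page is still needed.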

Handling Hunspell suggestions with special characters
