Question

I am trying to call a C interface from Python using the ctypes module. Below is the prototype of the C function:

void UTF_to_Wide_char( const char* source, unsigned short* buffer, int bufferSize)

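UTF_to_Wide_char: converts a UTF-8 string to a UCS-2 string.

source (input): contains a NULL-terminated UTF-8 string.

buffer (output): pointer to the buffer that will hold the converted text.

bufferSize: the size of the buffer; the system copies up to this size, including the terminating NULL.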
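For reference, the prototype above could be described to ctypes roughly as follows. This is only a sketch: `ARGTYPES` and `declare_signature` are hypothetical names introduced here, and the mapping assumes `unsigned short` is 16 bits wide on the target platform.

```python
import ctypes

# Mirrors: void UTF_to_Wide_char(const char* source,
#                                unsigned short* buffer, int bufferSize)
ARGTYPES = [
    ctypes.c_char_p,                   # const char* source (UTF-8 bytes)
    ctypes.POINTER(ctypes.c_ushort),   # unsigned short* buffer (16-bit units)
    ctypes.c_int,                      # int bufferSize
]

def declare_signature(func):
    """Attach the assumed argument and return types to a ctypes function object."""
    func.restype = None      # the C function returns void
    func.argtypes = ARGTYPES
    return func
```

With a loaded library this would be applied as `declare_signature(shared_lib.UTF8_to_Widechar)`, using the function name that appears in the Python snippet further down.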

Here is my Python function:

from ctypes import c_wchar_p, create_unicode_buffer, sizeof

def to_ucs2(py_unicode_string):
    len_str = len(py_unicode_string)
    local_str = py_unicode_string.encode('UTF-8')
    src = c_wchar_p(local_str)
    buff = create_unicode_buffer(len_str * 2 )
    # shared_lib is my ctypes-loaded instance of the shared library.
    shared_lib.UTF8_to_Widechar(src, buff, sizeof(buff))
    return buff.value

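Problem: the snippet above works fine with a Python compiled for UCS-4 (the --enable-unicode=ucs4 option), but behaves unexpectedly with a Python compiled for UCS-2 (--enable-unicode=ucs2). (I verified the Python Unicode build option by following "How to find out if Python is compiled with UCS-2 or UCS-4?".)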
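The build check mentioned above can be sketched like this. `unicode_build_info` is a hypothetical helper name. It also reports the width of ctypes' `c_wchar`, because `create_unicode_buffer` allocates `c_wchar` elements whose size follows the platform's `wchar_t`, which may differ from the 16-bit units the C function writes. Whether that mismatch is the actual cause here is an assumption, not something the question confirms.

```python
import ctypes
import sys

def unicode_build_info():
    """Report the interpreter's Unicode width and the size of ctypes' c_wchar."""
    return {
        # 0xFFFF indicates a narrow (UCS-2) build; 0x10FFFF a wide one.
        "narrow_build": sys.maxunicode == 0xFFFF,
        # Size in bytes of the platform wchar_t as seen by ctypes.
        "wchar_bytes": ctypes.sizeof(ctypes.c_wchar),
        # Size in bytes of the unit type used by the C prototype.
        "ushort_bytes": ctypes.sizeof(ctypes.c_ushort),
    }

info = unicode_build_info()
print(info)
```

If `wchar_bytes` and `ushort_bytes` differ, reading the buffer through `buff.value` would interpret the 16-bit output with the wrong element width.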

Unfortunately, in production I am using a Python compiled with UCS-2. Please comment on the following points.

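Although I am sure the problem comes from the Unicode build option, I have not yet worked out what is happening under the hood. I need help identifying the actual cause.
Is it possible to overcome this problem without recompiling Python with the --enable-unicode=ucs4 option?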
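One possible direction for the second point, offered as a sketch rather than a confirmed fix: pass the UTF-8 bytes directly, allocate the output as an array of `c_ushort` so it matches the C prototype, and decode the result as UTF-16 instead of relying on `c_wchar`. The `utf8_to_wide_standin` function below is a hypothetical pure-Python substitute for the C routine so the example runs without the shared library, and it assumes `bufferSize` counts 16-bit units (the question does not say whether it means units or bytes).

```python
import ctypes
import sys

def utf8_to_wide_standin(source, buffer, buffer_size):
    """Hypothetical stand-in for the C routine, used only so the sketch runs.

    Assumes buffer_size counts 16-bit units, terminating NULL included.
    """
    if buffer_size <= 0:
        return
    text = source.decode("utf-8")
    # UCS-2 cannot represent code points above U+FFFF; this stand-in drops them.
    units = [ord(ch) for ch in text if ord(ch) <= 0xFFFF][: buffer_size - 1]
    for i, unit in enumerate(units):
        buffer[i] = unit
    buffer[len(units)] = 0

def to_ucs2(text, convert=utf8_to_wide_standin):
    """Convert text through a UTF-8 -> UCS-2 routine using a 16-bit buffer."""
    source = text.encode("utf-8")          # plain bytes, suitable for const char*
    capacity = len(text) + 1               # one unit per character plus the NULL
    buf = (ctypes.c_ushort * capacity)()   # matches unsigned short* in the prototype
    convert(source, buf, capacity)
    # Count the units written before the terminating NULL.
    length = 0
    while length < capacity and buf[length] != 0:
        length += 1
    raw = ctypes.string_at(ctypes.addressof(buf),
                           length * ctypes.sizeof(ctypes.c_ushort))
    codec = "utf-16-le" if sys.byteorder == "little" else "utf-16-be"
    return raw.decode(codec)

print(to_ucs2(u"h\u00e9llo"))
```

With the real library, `convert` would be `shared_lib.UTF8_to_Widechar` as named in the snippet above. If the C function expects `bufferSize` in bytes rather than units, `ctypes.sizeof(buf)` would be passed instead of `capacity`.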

(I am fairly new to Unicode encodings, but I have a basic working knowledge.)

Python ctypes Unicode handling with Python compiled with UCS-2 or UCS-4?

0 answers: