Question

I have an external C library that uses UTF-16 throughout its API: as function arguments, return values, and structure members.

On Windows, ctypes.c_wchar_p works for this, but on OS X ctypes uses UCS-4 (UTF-32) for c_wchar, and I cannot find a way to support UTF-16.
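To make the mismatch concrete, here is a minimal sketch that checks the width of c_wchar on the current platform and builds a NUL-terminated UTF-16LE byte string by hand. The variable names are illustrative only, and little-endian byte order is an assumption about what the library expects.

```python
import ctypes

# Width of wchar_t as ctypes sees it: 2 bytes on Windows,
# typically 4 bytes on OS X and Linux.
wchar_size = ctypes.sizeof(ctypes.c_wchar)
print('sizeof(c_wchar) = %d' % wchar_size)

# UTF-16 always uses 2-byte code units, so on a 4-byte wchar_t
# platform the string has to be encoded explicitly.
# Assumption: the library expects little-endian UTF-16 with a
# two-byte NUL terminator.
text = u'WB'
utf16 = text.encode('utf-16le') + b'\x00\x00'
print(repr(utf16))
```

When wchar_size is 4, c_wchar_p hands the library 4-byte characters, which is why it cannot be used directly here.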

Here is what I have found so far:

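Subclass _SimpleCData to redefine _check_retval_.
- It transparently converts UTF-16 results to Python strings.
- It can be used as a C structure member.
- However, it cannot handle strings passed as arguments; its from_param() method is never called (why?):
  func('str', b'W\x00B\x00\x00\x00')  # passed without conversion
Use my own type with a from_param() method.
- Pro: it can be initialized through its constructor, and it can also encode on the fly when a string is passed to a function.
- Con: it cannot be used as a function return type or as a structure member.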

Like this:

ustr = myutf16('hello')
func(ustr)
func('hello')   # calls myutf16.from_param('hello')
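For reference, a type of this second kind could look roughly like the sketch below. The question does not show the body of myutf16, so this is a hypothetical reconstruction: it relies on the documented ctypes behavior that from_param() may return an object with an _as_parameter_ attribute. Since the external library is not available here, ctypes.pythonapi.PyBytes_FromStringAndSize stands in for func so that the bytes actually passed can be inspected.

```python
import ctypes

class myutf16(object):
    """Hypothetical reconstruction of the question's own type."""

    def __init__(self, text):
        # Encode once and keep the buffer alive on the instance.
        data = text.encode('utf-16le') + b'\x00\x00'
        self._buf = ctypes.create_string_buffer(data, len(data))
        # ctypes passes this attribute when the instance is an argument.
        self._as_parameter_ = ctypes.cast(self._buf, ctypes.c_void_p)

    @classmethod
    def from_param(cls, obj):
        # Called for every argument whose argtypes entry is this class.
        if isinstance(obj, cls):
            return obj
        return cls(obj)

# Stand-in for the external function: copies `size` bytes from the
# pointer it receives and returns them as a bytes object.
func = ctypes.pythonapi.PyBytes_FromStringAndSize
func.argtypes = [myutf16, ctypes.c_ssize_t]
func.restype = ctypes.py_object

ustr = myutf16(u'WB')
raw_from_instance = func(ustr, 6)   # instance passed as-is
raw_from_str = func(u'WB', 6)       # encoded by myutf16.from_param
print(repr(raw_from_str))
```

Because myutf16 is not a ctypes data type, it has no size or layout, which matches the limitation listed above: it cannot serve as a restype or a structure field.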

Answer 1

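You can override from_param in a subclass of c_char_p to encode a unicode string as UTF-16, and add a _check_retval_ method to decode a UTF-16 result back into a unicode string. For struct fields, you can use a descriptor class that handles getting and setting the attribute: declare the field as a private _name of type c_char_p, and expose the descriptor under the public name. For example: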

import sys
import ctypes

if sys.version_info[0] > 2:
    unicode = str

def decode_utf16_from_address(address, byteorder='little',
                              c_char=ctypes.c_char):
    if not address:
        return None
    if byteorder not in ('little', 'big'):
        raise ValueError("byteorder must be either 'little' or 'big'")
    chars = []
    while True:
        c1 = c_char.from_address(address).value
        c2 = c_char.from_address(address + 1).value
        if c1 == b'\x00' and c2 == b'\x00':
            break
        chars += [c1, c2]
        address += 2
    if byteorder == 'little':
        return b''.join(chars).decode('utf-16le')
    return b''.join(chars).decode('utf-16be')

class c_utf16le_p(ctypes.c_char_p):
    def __init__(self, value=None):
        super(c_utf16le_p, self).__init__()
        if value is not None:
            self.value = value

    @property
    def value(self,
              c_void_p=ctypes.c_void_p):
        addr = c_void_p.from_buffer(self).value
        return decode_utf16_from_address(addr, 'little')

    @value.setter
    def value(self, value,
              c_char_p=ctypes.c_char_p):
        value = value.encode('utf-16le') + b'\x00'
        c_char_p.value.__set__(self, value)

    @classmethod
    def from_param(cls, obj):
        if isinstance(obj, unicode):
            obj = obj.encode('utf-16le') + b'\x00'
        return super(c_utf16le_p, cls).from_param(obj)

    @classmethod
    def _check_retval_(cls, result):
        return result.value

class UTF16LEField(object):
    def __init__(self, name):
        self.name = name

    def __get__(self, obj, cls,
                c_void_p=ctypes.c_void_p,
                addressof=ctypes.addressof):
        field_addr = addressof(obj) + getattr(cls, self.name).offset
        addr = c_void_p.from_address(field_addr).value
        return decode_utf16_from_address(addr, 'little')

    def __set__(self, obj, value):
        value = value.encode('utf-16le') + b'\x00'
        setattr(obj, self.name, value)

Example:

if __name__ == '__main__':
    class Test(ctypes.Structure):
        _fields_ = (('x', ctypes.c_int),
                    ('y', ctypes.c_void_p),
                    ('_string', ctypes.c_char_p))
        string = UTF16LEField('_string')

    print('test 1: structure field')
    t = Test()
    t.string = u'eggs and spam'
    print(t.string)

    print('test 2: parameter and result')
    result = None

    @ctypes.CFUNCTYPE(c_utf16le_p, c_utf16le_p)
    def testfun(string):
        global result
        print('parameter: %s' % string.value)
        # callbacks leak memory except for simple return
        # values such as an integer address, so return the
        # address of a global variable.
        result = c_utf16le_p(string.value + u' and eggs')
        return ctypes.c_void_p.from_buffer(result).value

    print('result: %s' % testfun(u'spam'))

Output:

test 1: structure field
eggs and spam
test 2: parameter and result
parameter: spam
result: spam and eggs
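The decoding step used by _check_retval_ and the field descriptor reads the string one byte at a time. As a possible variant, the terminator can be located by scanning 16-bit code units and the bytes copied in a single call. This is only a sketch: decode_utf16le_units is a hypothetical name, and it assumes the pointer is suitably aligned for 16-bit reads (or that the platform tolerates unaligned access).

```python
import ctypes

def decode_utf16le_units(address):
    """Hypothetical variant: decode NUL-terminated UTF-16LE at address."""
    if not address:
        return None
    # Count 16-bit code units up to (not including) the terminator.
    # Comparing against 0 is independent of byte order.
    length = 0
    while ctypes.c_uint16.from_address(address + 2 * length).value != 0:
        length += 1
    # Copy the payload in one call and decode it.
    return ctypes.string_at(address, 2 * length).decode('utf-16le')

# Self-contained check against a buffer we build ourselves.
buf = ctypes.create_string_buffer(
    u'spam and eggs'.encode('utf-16le') + b'\x00\x00')
decoded = decode_utf16le_units(ctypes.addressof(buf))
print(decoded)
```

Scanning whole code units also avoids stopping early on a zero byte that is merely the high or low half of a non-NUL character.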
