At what point can I pass an array back to my Rust program so that its memory is freed?

Date: 2016-01-06 14:51:34

Tags: python rust ctypes ffi

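I'm having trouble working out at what point I can pass the BNG_FFIArray returned by my Rust program back to it, so that it can free the memory it allocated.

My ctypes setup is as follows: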

class BNG_FFITuple(Structure):
    _fields_ = [("a", c_uint32),
                ("b", c_uint32)]

class BNG_FFIArray(Structure):
    _fields_ = [("data", c_void_p),
                ("len", c_size_t)]

    # Allow implicit conversions from a sequence of 32-bit unsigned
    # integers.
    @classmethod
    def from_param(cls, seq):
        return seq if isinstance(seq, cls) else cls(seq)

    def __init__(self, seq, data_type = c_float):
        array_type = data_type * len(seq)
        raw_seq = array_type(*seq)
        self.data = cast(raw_seq, c_void_p)
        self.len = len(seq)

# A conversion function that cleans up the result value to make it
# nicer to consume.
def bng_void_array_to_tuple_list(array, _func, _args):
    res = cast(array.data, POINTER(BNG_FFITuple * array.len))[0]
    return res

convert_bng = lib.convert_vec_c
convert_bng.argtypes = (BNG_FFIArray, BNG_FFIArray)
convert_bng.restype = BNG_FFIArray
convert_bng.errcheck = bng_void_array_to_tuple_list

# this is the FFI function I'd like to call. It takes a BNG_FFIArray as its argument
drop_array = lib.drop_array 
drop_array.argtypes = (BNG_FFIArray,)


def convertbng(lons, lats):
    """ just a wrapper """
    return [(i.a, i.b) for i in iter(convert_bng(lons, lats))]

# pass values into the FFI rust function
convertbng([-0.32824866], [51.44533267])

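This all works correctly, but I'm not sure at what point I should send the data that was allocated in lib.convert_to_bng back across the FFI boundary so that its associated memory is freed by calling drop_array.

Here are my Rust struct and functions.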

#[repr(C)]
pub struct Array {
    data: *const c_void,
    len: libc::size_t,
}

#[no_mangle]
pub extern "C" fn drop_array(arr: Array) {
    unsafe { Vec::from_raw_parts(arr.data as *mut u8, arr.len, arr.len) };
}

impl Array {
    unsafe fn as_f32_slice(&self) -> &[f32] {
        assert!(!self.data.is_null());
        slice::from_raw_parts(self.data as *const f32, self.len as usize)
    }
    unsafe fn as_i32_slice(&self) -> &[i32] {
        assert!(!self.data.is_null());
        slice::from_raw_parts(self.data as *const i32, self.len as usize)
    }

    fn from_vec<T>(mut vec: Vec<T>) -> Array {
        // Important to make length and capacity match
        // A better solution is to track both length and capacity
        vec.shrink_to_fit();

        let array = Array {
            data: vec.as_ptr() as *const libc::c_void,
            len: vec.len() as libc::size_t,
        };

        // Leak the memory, and now the raw pointer is the owner
        mem::forget(vec);

        array
    }
}


#[no_mangle]
pub extern "C" fn convert_vec_c(lon: Array, lat: Array) -> Array {
    // we're receiving floats
    let lon = unsafe { lon.as_f32_slice() };
    let lat = unsafe { lat.as_f32_slice() };
    // copy values and combine
    let orig = lon.iter()
                  .cloned()
                  .zip(lat.iter()
                          .cloned());
    // carry out the conversion
    let result = orig.map(|elem| convert_bng(elem.0 as f64, elem.1 as f64));
    // convert back to vector of unsigned integer Tuples
    let nvec = result.map(|ints| {
                         IntTuple {
                             a: ints.0 as u32,
                             b: ints.1 as u32,
                         }
                     })
                     .collect();
    Array::from_vec(nvec)
}

1 answer:

Answer 0 (score: 7)

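There are two ways to manage resources in Python, both of which involve creating an object:

- __del__
- context managers (the with statement)

Both involve having a manager object that controls/provides access to the resource, and that runs any required clean-up code once the object is no longer needed. For this case I think the first works best, but I'll demonstrate both.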

For my examples, I'll use this Rust code, where Data is a stand-in for whatever resource needs managing (e.g. your Array type):

// ffi_example.rs
#![crate_type = "dylib"]

pub struct Data {
    x: i32
}

#[no_mangle]
pub extern fn data_create(x: i32) -> *mut Data {
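    let p = Box::into_raw(Box::new(Data { x: x }));
    // print the address too, so it can be matched against the Python side
    println!("Rust: creating: x = {} (at {:p})", x, p);
    p
}

// example function for interacting with the pointer
#[no_mangle]
pub unsafe extern fn data_get(p: *mut Data) -> i32 {
    (*p).x
}

#[no_mangle]
pub unsafe extern fn data_destroy(p: *mut Data) {
    // rebuild the Box so that Rust frees the allocation when `data` goes out of scope
    let data = Box::from_raw(p);
    println!("Rust: destroying: x = {} (at {:p})", data.x, p);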
}

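This can be compiled with rustc ffi_example.rs to create libffi_example.so (or similar, depending on the platform). Here is the start of the Python code I use for both cases (the CDLL call may need adjusting):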

import sys
import ctypes as c

class RawData(c.Structure):
    pass

lib = c.CDLL('./libffi_example.so')

create = lib.data_create
create.argtypes = [c.c_int]
create.restype = c.POINTER(RawData)

get = lib.data_get
get.argtypes = [c.POINTER(RawData)]
get.restype = c.c_int

destroy = lib.data_destroy
destroy.argtypes = [c.POINTER(RawData)]
destroy.restype = None

(Note that, by interfacing via pointers, I don't have to tell Python anything about the internals of RawData.)

You can check that everything is working by adding:
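The opaque-pointer idea can be tried without the Rust library. The sketch below is only an illustration: the c_int named backing is a stand-in for memory that a foreign library would own, and nothing in it comes from the original code apart from the empty RawData class.

```python
import ctypes as c

class RawData(c.Structure):
    # No _fields_: Python never looks inside, it only passes the pointer around.
    pass

# Illustrative stand-in for memory owned by the foreign library.
backing = c.c_int(42)

# View that address through the opaque pointer type.
opaque = c.cast(c.pointer(backing), c.POINTER(RawData))

# Only code that knows the real layout can read the value back.
recovered = c.cast(opaque, c.POINTER(c.c_int)).contents.value
print(recovered)
```

On the Python side the pointer is just a token to hand back to the library; all field access stays in the code that knows the layout.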

p = create(10)
print('Python: got %s (at 0x%x)' % (get(p), c.addressof(p.contents)))
sys.stdout.flush()
destroy(p)

This prints something like:

Rust: creating: x = 10 (at 0x138b7c0)
Python: got 10 (at 0x138b7c0)
Rust: destroying: x = 10 (at 0x138b7c0)

(The flush is to ensure that the prints from the two languages appear in the right order, since they have different buffers.)

__del__

To use __del__, just create a Python object (not a ctypes.Structure) that acts as the interface to Rust, e.g.

class Data:
    def __init__(self, x):
        self._pointer = create(x)

    def get(self):
        return int(get(self._pointer))

    def __del__(self):
        destroy(self._pointer)

It can then be used as a normal object:

obj = Data(123)
print('Python: %s' % obj.get())
sys.stdout.flush()

obj2 = obj # two pointers to the same `Data`

obj = Data(456) # overwrite one
print('Python: %s, %s' % (obj.get(), obj2.get()))
sys.stdout.flush()

obj2 = None # just clear the second reference
print('Python: end')
sys.stdout.flush()

This prints:

Rust: creating: x = 123 (at 0x28aa510)
Python: 123
Rust: creating: x = 456 (at 0x28aa6e0)
Python: 456, 123
Rust: destroying: x = 123 (at 0x28aa510)
Python: end
Rust: destroying: x = 456 (at 0x28aa6e0)

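That is, Python can tell when an object definitely has no references left (e.g. once both obj and obj2 have been overwritten, for 123, or when the program ends, for 456).

Context manager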
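The exact moment __del__ runs depends on the Python implementation, and it is not guaranteed to run for objects that still exist at interpreter exit. One variant worth considering is weakref.finalize from the standard library. The sketch below is not part of the original answer: fake_create and fake_destroy are hypothetical stand-ins for the create and destroy bindings above, so that it runs without the Rust library.

```python
import weakref

calls = []  # records what the stand-in "library" was asked to do

def fake_create(x):
    # Hypothetical stand-in for the `create` binding: returns a fake handle.
    calls.append(('create', x))
    return x

def fake_destroy(pointer):
    # Hypothetical stand-in for the `destroy` binding.
    calls.append(('destroy', pointer))

class Data:
    def __init__(self, x, create=fake_create, destroy=fake_destroy):
        self._pointer = create(x)
        # The callback must not reference `self`, or the object is kept alive.
        self._finalizer = weakref.finalize(self, destroy, self._pointer)

    def close(self):
        # Explicit release; the finalizer runs its callback at most once.
        self._finalizer()

    @property
    def closed(self):
        return not self._finalizer.alive

obj = Data(123)
obj.close()
obj.close()  # second call does nothing
print(calls)
```

The finalizer also fires when the object is garbage collected, so close() is optional. Calling it simply makes the release point explicit.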

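If the resource is scoped (which may not be the case here), it can make sense to use a context manager, which would allow something like the following: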

print('Python: before')
sys.stdout.flush()

with Data(789) as obj:
    print('Python: %s' % obj.get())
    sys.stdout.flush()
# obj's internals destroyed here

print('Python: after')
sys.stdout.flush()

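This is a little more error-prone, because a handle to the object can be kept around outside the with statement, so the object has to check for this, or else it may access freed memory. For example,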

with Data(1234) as obj:
    pass
# obj's internals destroyed here

print(obj.get()) # oops...

Anyway, the implementation:

class Data:
    def __init__(self, x):
        self._x = x
        self._valid = False

    def __enter__(self):
        # allocate on entry to the with block
        self._pointer = create(self._x)
        self._valid = True
        return self

    def __exit__(self, exc_type, exc_value, traceback):
        assert self._valid
        destroy(self._pointer)
        self._valid = False
        return False  # do not swallow exceptions raised in the block

    def get(self):
        if not self._valid:
            raise ValueError('getting from a destroyed Data')
        return int(get(self._pointer))
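For comparison, the same scoped lifetime can be sketched with contextlib.contextmanager from the standard library. This is an alternative spelling rather than the class above, and it does not guard against a handle being used after the with block. fake_create and fake_destroy are hypothetical stand-ins for the real create and destroy bindings, so the snippet runs without the Rust library.

```python
from contextlib import contextmanager

events = []

def fake_create(x):
    # Hypothetical stand-in for `create`: returns a fake handle.
    events.append('create %d' % x)
    return {'x': x}

def fake_destroy(handle):
    # Hypothetical stand-in for `destroy`.
    events.append('destroy %d' % handle['x'])

@contextmanager
def data(x, create=fake_create, destroy=fake_destroy):
    handle = create(x)
    try:
        yield handle
    finally:
        # Runs on normal exit and also when the body raises.
        destroy(handle)

with data(789) as handle:
    events.append('use %d' % handle['x'])

try:
    with data(1) as handle:
        raise RuntimeError('boom')
except RuntimeError:
    pass

print(events)
```

The try/finally is what guarantees the release even when the block exits with an exception.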

The first example above (the one using Data(789)) gives the output:

Python: before
Rust: creating: x = 789 (at 0x1650530)
Python: 789
Rust: destroying: x = 789 (at 0x1650530)
Python: after

The second (using Data(1234)) gives:

Rust: creating: x = 1234 (at 0x113d450)
Rust: destroying: x = 1234 (at 0x113d450)
Traceback (most recent call last):
  File "ffi.py", line 82, in <module>
    print(obj.get()) # oops...
  File "ffi.py", line 63, in get
    raise ValueError('getting from a destroyed Data')
ValueError: getting from a destroyed Data

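This approach does have the advantage of making it clearer which regions of code have the resource valid/allocated; in effect it is a manual form of Rust's RAII/scope-based resource management.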
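Applied to the question's BNG_FFIArray, the values have to be copied out of the Rust-owned buffer before drop_array is called, because the ctypes view built in the errcheck hook only borrows that memory. Here is a minimal sketch of such a helper, assuming drop_array releases the buffer correctly. fake_drop and the Python-allocated backing array are hypothetical stand-ins for the Rust side, so the snippet runs on its own.

```python
from ctypes import (POINTER, Structure, addressof, cast,
                    c_size_t, c_uint32, c_void_p)

class BNG_FFITuple(Structure):
    _fields_ = [("a", c_uint32), ("b", c_uint32)]

class BNG_FFIArray(Structure):
    _fields_ = [("data", c_void_p), ("len", c_size_t)]

dropped = []

def fake_drop(array):
    # Hypothetical stand-in for lib.drop_array: just records the call.
    dropped.append(array.len)

def copy_then_drop(array, drop=fake_drop):
    # Borrow the foreign buffer, copy every value into plain Python tuples,
    # then hand the buffer back so that its owner can free it.
    view = cast(array.data, POINTER(BNG_FFITuple * array.len)).contents
    result = [(t.a, t.b) for t in view]
    drop(array)
    return result

# Simulate an array "returned by Rust" using Python-owned memory.
backing = (BNG_FFITuple * 2)(BNG_FFITuple(1, 2), BNG_FFITuple(3, 4))
ffi_array = BNG_FFIArray(addressof(backing), len(backing))

converted = copy_then_drop(ffi_array)
print(converted)
```

With the real library, a helper like this could be called from the errcheck hook with drop_array passed as drop. After that point nothing on the Python side refers to the freed buffer.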