Question

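Calling MATLAB from Python is bound to cost some performance, which I could avoid by rewriting (a lot of) code in Python. That isn't a realistic option for me, but it annoys me that a huge loss of efficiency lies in the simple conversion from a numpy array to a MATLAB double.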

I'm talking about the following conversion from data1 to data1m, where

{{1}}

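Here matlab.double comes from MathWorks' own MATLAB package/engine. The second line of code takes 20 s on my system, which seems like far too much for a conversion that doesn't really do anything other than make the numbers "edible" for MATLAB.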

So basically I'm looking for the opposite of the trick given here, which works for converting MATLAB output back to Python.

Answer 1

Passing numpy arrays efficiently

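Take a look at the file mlarray_sequence.py in the folder PYTHONPATH\Lib\site-packages\matlab\_internal. There you will find the construction of the MATLAB array object. The performance problem comes from copying the data with loops inside the generic_flattening function.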
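To get a feel for why element-by-element copying in Python code is costly, here is a minimal sketch using only the standard library. It does not reproduce generic_flattening itself; copy_with_python_loop is a hypothetical stand-in for any per-element copy loop, and the array size is arbitrary.

```python
import array
import timeit

values = [float(i) for i in range(200_000)]

def copy_with_python_loop(seq):
    """Hypothetical stand-in for an element-by-element copy in Python code."""
    out = array.array("d")
    for v in seq:
        out.append(v)
    return out

def copy_in_bulk(seq):
    """Let array.array consume the whole sequence in a single call."""
    return array.array("d", seq)

# Both routes produce the same data; only the time spent differs.
assert copy_with_python_loop(values) == copy_in_bulk(values)

loop_s = timeit.timeit(lambda: copy_with_python_loop(values), number=5)
bulk_s = timeit.timeit(lambda: copy_in_bulk(values), number=5)
print(f"loop: {loop_s:.3f} s, bulk: {bulk_s:.3f} s")
```

The absolute timings depend on the machine, so none are quoted here.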

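To avoid this behavior, we will edit the file slightly. The fix should work for both complex and non-complex data types.

Make a backup of the original file in case something goes wrong.
Add import numpy as np to the other imports at the beginning of the file.

At line 38, you should find: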

init_dims = _get_size(initializer)

#Replace this with:

try:
    init_dims = initializer.shape  # numpy arrays expose their shape directly
except AttributeError:
    init_dims = _get_size(initializer)  # fall back for plain sequences

At line 48, you should find:

if is_complex:
    complex_array = flat(self, initializer,
                         init_dims, typecode)
    self._real = complex_array['real']
    self._imag = complex_array['imag']
else:
    self._data = flat(self, initializer, init_dims, typecode)

#Replace this with:

if is_complex:
    try:
        self._real = array.array(typecode,np.ravel(initializer, order='F').real)
        self._imag = array.array(typecode,np.ravel(initializer, order='F').imag)
    except:
        complex_array = flat(self, initializer,init_dims, typecode)
        self._real = complex_array['real']
        self._imag = complex_array['imag']
else:
    try:
        self._data = array.array(typecode,np.ravel(initializer, order='F'))
    except:
        self._data = flat(self, initializer, init_dims, typecode)

Now you can pass a numpy array directly to the MATLAB array creation method.

import numpy as np
import matlab

data1 = np.random.uniform(low = 0.0, high = 30000.0, size = (1000000,))
#faster
data1m = matlab.double(data1)
#or slower method
data1m = matlab.double(data1.tolist())

data2 = np.random.uniform(low = 0.0, high = 30000.0, size = (1000000,)).astype(np.complex128)
#faster
data2m = matlab.double(data2,is_complex=True)
#or slower method
data2m = matlab.double(data2.tolist(),is_complex=True)

MATLAB array creation becomes about 15 times faster, and the interface is now easier to use.
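The patch flattens with order='F' because MATLAB stores arrays column by column. As a small illustration of what that ordering means, the sketch below flattens a nested list column-major using only the standard library; flatten_column_major is a hypothetical helper written for this example and is not part of the MATLAB package.

```python
import array

def flatten_column_major(rows):
    """Flatten a rectangular nested list column by column (MATLAB's storage order)."""
    return [value for column in zip(*rows) for value in column]

rows = [[1.0, 2.0, 3.0],
        [4.0, 5.0, 6.0]]

# Same element order that a column-major ('F') flatten of a 2x3 array yields.
flat = array.array("d", flatten_column_major(rows))
print(flat.tolist())  # [1.0, 4.0, 2.0, 5.0, 3.0, 6.0]
```

A row-major flatten of the same data would give 1, 2, 3, 4, 5, 6 instead, which MATLAB would interpret as a different matrix.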

Answer 2

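While awaiting better suggestions, I'll post the best trick I've come up with so far. It comes down to saving the data to a file with `scipy.io.savemat` and then loading that file in MATLAB.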

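This is not the prettiest hack, and it requires some care to ensure that different processes relying on the same script don't end up writing and loading each other's .mat files, but the performance gain is worth it for me.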
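One possible way to keep processes from clobbering each other's files is to give every call its own temporary .mat file. The sketch below is an assumption rather than part of the original answer: make_unique_mat_path and call_via_mat_file are hypothetical helper names, eng is assumed to be an already started MATLAB engine, and the MATLAB side is assumed to be a function that takes the file path, like the test2 function in this answer.

```python
import os
import tempfile

def make_unique_mat_path(directory=None):
    """Create an empty, uniquely named .mat file and return its path."""
    fd, path = tempfile.mkstemp(suffix=".mat", dir=directory)
    os.close(fd)  # savemat reopens the file by name
    return path

def call_via_mat_file(eng, x, y, fs_signal):
    """Save the inputs to a private .mat file, run test2 on it, then clean up.

    Assumes `eng` is a running MATLAB engine with test2 on its path.
    """
    from scipy.io import savemat  # imported lazily; requires SciPy

    path = make_unique_mat_path()
    try:
        savemat(path, {"x": x, "y": y, "fs_signal": fs_signal})
        return eng.test2(path)
    finally:
        os.remove(path)  # do not leave stale files behind
```

Because tempfile.mkstemp creates the file atomically under a fresh name, two processes running the same script should never receive the same path.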

As a test case, I wrote two simple, almost identical MATLAB functions that take 2 numpy arrays (I tested with length 1000000) and one int as input.

function d = test(x, y, fs_signal)
d = sum((x + y))./double(fs_signal);

function d = test2(path)
load(path)
d = sum((x + y))./double(fs_signal);

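The function test requires conversion, while test2 requires saving.

Testing test: converting the two numpy arrays takes about 40 s on my system. The total time to prepare and run test comes to 170 s.

Testing test2: saving the arrays and the int takes about 0.35 s on my system. Surprisingly, loading the .mat file in MATLAB is extremely efficient (or, more surprisingly, MATLAB is extremely inefficient at handling its own doubles)... The total time to prepare and run test2 comes to 0.38 s.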

That is a performance gain of almost 450x...
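The figure follows directly from the two totals reported above. A quick check of the arithmetic, with variable names chosen only for this example:

```python
convert_total_s = 170.0  # total reported for test (matlab.double conversion)
savemat_total_s = 0.38   # total reported for test2 (.mat file route)

speedup = convert_total_s / savemat_total_s
print(f"speedup: {speedup:.0f}x")  # speedup: 447x
```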

Answer 3

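My situation was a bit different (a python script called from matlab), but for me converting the ndarray into an array.array sped up the process massively. It is basically very similar to Alexandre Chabot's solution, but without the need to alter any files: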

#untested, i.e. only deduced from my "matlab calls python" situation
import numpy
import array
import matlab

data1 = numpy.random.uniform(low = 0.0, high = 30000.0, size = (1000000,))
ar = array.array('d',data1.flatten('F').tolist())
p = matlab.double(ar)
C = matlab.reshape(p,data1.shape) #this part I am definitely not sure about if it will work like that

At least when done from Matlab, the combination of "array.array" and "double" is relatively fast. Tested with Matlab 2016b + python 3.5.4 64bit.

Improve performance of converting numpy array to MATLAB double

3 answers: