Question

I'm trying to use the MATLAB Coder toolbox to convert the following code to C:

function [idx] = list_iterator(compare, list)

idx = nan(length(list));
for j = 1:length(list)
    idx(j) = strcmp(compare, list{j});
end

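list is an N x 1 cell array of strings, and compare is a string. The code compares each element of list with compare and returns 1 if the two are the same, 0 otherwise. (I'm doing this to speed up execution, because N can be quite large: around 10 to 20 million elements.)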

When I run codegen list_iterator in the command window, I get the following error:

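The type of input argument 'compare' for function 'list_iterator' is not specified. Use -args or preconditioning statements to specify input types.

More information

Error in ==> list_iterator Line: 1 Column: 18

Code generation failed: View Error Report

Error using codegen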

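I know I'm supposed to specify the input types when using codegen, but I'm not sure how to do that for a cell array of strings whose elements can have different lengths. The string compare can also have a different length from one function call to the next.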

Answer 1

You can use the function coder.typeof to specify variable-size inputs to codegen. From what I understand of your example, something like:

>> compare = coder.typeof('a',[1,Inf])

compare = 

coder.PrimitiveType
   1×:inf char
>> list = coder.typeof({compare}, [Inf,1])

list = 

coder.CellType
   :inf×1 homogeneous cell 
      base: 1×:inf char
>> codegen list_iterator.m -args {compare, list}

seems appropriate.

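If you check out the MATLAB Coder App, it provides a graphical way to specify these complicated inputs. From there you can export the project to a build script to see the corresponding command-line APIs: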

https://www.mathworks.com/help/coder/ug/generate-a-matlab-script-to-build-a-project.html?searchHighlight=build%20script&s_tid=doc_srchtitle

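Note that when I tried this example with codegen, the resulting MEX was not faster than MATLAB. One reason this can happen is that the body of the function is fairly simple, yet a large amount of data is transferred from MATLAB to the generated code and back again. As a result, the data transfer overhead can dominate the execution time. Moving more of your code into the generated MEX may improve this.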
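One way to check this on your own machine is to time both versions side by side. The following is only a sketch: it assumes list_iterator.m is on the path and that codegen was run with its default output name, list_iterator_mex. The variable names compareIn and listIn are the ones used in the timings further down, and the guard simply skips the comparison if no MEX file is found.

```matlab
% Test inputs matching the timings in this answer.
compareIn = 'abcd';
listIn = repmat({'abcd'; 'a'; 'b'}, 1000, 1);

% exist returns 3 when the name resolves to a MEX file on the path.
% list_iterator_mex is assumed to be the default name produced by codegen.
haveMex = exist('list_iterator_mex', 'file') == 3;

if haveMex
    tMatlab = timeit(@() list_iterator(compareIn, listIn));
    tMex    = timeit(@() list_iterator_mex(compareIn, listIn));
    fprintf('MATLAB: %g s, MEX: %g s\n', tMatlab, tMex);
else
    disp('list_iterator_mex not found; run codegen first.');
end
```

If the two times come out close, the function body is probably too small for code generation to pay off on its own.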

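Thinking about performance independently of codegen, should you use idx = false(length(list),1); rather than idx = nan(length(list));? The former is an Nx1 logical vector, while the latter is an NxN double matrix of which list_iterator only writes the first column.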
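The size difference is easy to confirm at the prompt. A minimal sketch, using a small hypothetical n = 4 chosen only for illustration (it does not come from the question):

```matlab
% Hypothetical small size, for illustration only.
n = 4;

a = nan(n);        % one size argument gives an n-by-n double matrix
b = false(n, 1);   % explicit n-by-1 logical column vector

sizeA  = size(a);   % [4 4]
sizeB  = size(b);   % [4 1]
classA = class(a);  % 'double'
classB = class(b);  % 'logical'
```

With N in the millions, the NxN double matrix would need far more memory than the Nx1 logical vector, which is why the preallocation matters here.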

With your original code and the inputs compareIn = 'abcd'; listIn = repmat({'abcd';'a';'b'},1000,1);, you get the time:

>> timeit(@()list_iterator(compareIn, listIn))

ans =

    0.0257

Modifying your code to return the smaller vector:

function [idx] = list_iterator(compare, list)

idx = false(length(list),1);
for j = 1:length(list)
    idx(j) = strcmp(compare, list{j});
end

>> timeit(@()list_iterator(compareIn, listIn))

ans =

    0.0014

You can also call strcmp directly with a cell array and a char array, which makes the code faster still:

function [idx] = list_iterator(compare, list)

idx = strcmp(compare, list);

>> timeit(@()list_iterator(compareIn, listIn))

ans =

   2.1695e-05
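To be sure the vectorized call is a drop-in replacement, you can compare its result against the loop. This is a small sketch using the same test inputs as above; the names idxLoop, idxVec, sameResult and numMatches are introduced here purely for illustration.

```matlab
% Same test inputs as in the timings above.
compareIn = 'abcd';
listIn = repmat({'abcd'; 'a'; 'b'}, 1000, 1);

% Loop version: one strcmp call per cell.
idxLoop = false(length(listIn), 1);
for j = 1:length(listIn)
    idxLoop(j) = strcmp(compareIn, listIn{j});
end

% Vectorized version: strcmp accepts a char array and a cell array
% and returns a logical array the same size as the cell array.
idxVec = strcmp(compareIn, listIn);

sameResult = isequal(idxLoop, idxVec);  % true if both approaches agree
numMatches = nnz(idxVec);               % number of cells equal to 'abcd'
```

Here every third cell is 'abcd', so both versions should flag 1000 of the 3000 entries.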

String comparison on a cell array of strings with Matlab Coder
