Question

I have an array of strings, for example:

arr = ['hello'; 'world'; 'hello'; 'again'; 'I----'; 'said-'; 'hello'; 'again']

How can I extract the most common string, which in this example is 'hello'?

Answer 1

First, use a cell array rather than a character array:

arr = {'hello', 'world'; 'hello', 'again'; 'I----', 'said-'; 'hello', 'again'};
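If the data already exists as the character matrix from the question, it does not have to be retyped. A minimal sketch, assuming the built-in cellstr function is acceptable here (the names char_arr and cell_arr are illustrative, not from the original post):

```matlab
% Character matrix from the question: every row must have the same length.
char_arr = ['hello'; 'world'; 'hello'; 'again'; 'I----'; 'said-'; 'hello'; 'again'];

% cellstr turns each row into one cell holding a char vector.
% It strips trailing spaces only, so the '-' padding is kept.
cell_arr = cellstr(char_arr);   % 8-by-1 cell array

disp(size(cell_arr));
disp(cell_arr{5});
```

The result is an 8-by-1 column rather than the 4-by-2 layout above; unique treats both the same way, since it looks at all elements regardless of shape.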

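Second, use unique to get the unique strings. (Called on a character array without the 'rows' option, unique returns the distinct characters rather than the distinct strings, which is why I suggest a cell array.) The third output, string_map, gives for each element of arr the index of its match in unique_strings: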

[unique_strings, ~, string_map]=unique(arr);
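For comparison, unique can also be applied to the character matrix from the question when it is called with the 'rows' option, which treats each row as one item. This is a sketch of that alternative, not the approach this answer recommends; the names char_arr, unique_rows and row_map are illustrative:

```matlab
char_arr = ['hello'; 'world'; 'hello'; 'again'; 'I----'; 'said-'; 'hello'; 'again'];

% With 'rows', each row of the matrix is compared as a whole.
[unique_rows, ~, row_map] = unique(char_arr, 'rows');

% row_map holds, for every row of char_arr, its index into unique_rows,
% so the most frequent index identifies the most frequent row.
most_common_row = unique_rows(mode(row_map), :);

disp(most_common_row);
```

This only works while all strings have the same length, which is the main reason a cell array is the more flexible choice.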

Then use mode on string_map to find the most common index, and use that index to look up the string:

most_common_string=unique_strings(mode(string_map));
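Put together, the steps above can be sketched as follows. Two details are worth noting: indexing with parentheses returns a 1-by-1 cell, so curly braces are used here to get the char vector itself, and mode returns the smallest index when several strings are tied. The accumarray call and the names counts, max_count and idx are additions for illustration and are not part of the original answer:

```matlab
arr = {'hello', 'world'; 'hello', 'again'; 'I----', 'said-'; 'hello', 'again'};

% unique_strings: sorted distinct strings
% string_map:     index into unique_strings for every element of arr
[unique_strings, ~, string_map] = unique(arr);

% Count how often each distinct string occurs.
counts = accumarray(string_map(:), 1);

% max returns the first maximum, matching mode's tie-breaking rule.
[max_count, idx] = max(counts);
most_common_string = unique_strings{idx};   % curly braces give a char vector

fprintf('%s appears %d times\n', most_common_string, max_count);
```

Keeping counts around is useful when the frequency itself is needed, or when ties should be detected with something like find(counts == max_count).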

Answer 2

It is better to use a cell array and the regexp function; a character array may not behave the way you expect.

arr = {'hello', 'world'; 'hello', 'again'; 'I----', 'said-'; 'hello', 'again'};

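If you use

hellos = sum(~cellfun('isempty', regexp(arr(:), 'hello')));

it returns the number of cells in arr that contain 'hello'. The arr(:) is needed because arr is 4-by-2: without it, sum adds up each column separately and returns a 1-by-2 vector instead of a single count. Note that the pattern 'hello' also matches cells that merely contain it as a substring; anchor it as '^hello$' to count exact matches only.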
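When only exact matches against a known string are wanted, strcmp is a possible alternative to regexp. A minimal sketch, with hello_count as an illustrative name:

```matlab
arr = {'hello', 'world'; 'hello', 'again'; 'I----', 'said-'; 'hello', 'again'};

% strcmp compares every cell with 'hello' and returns a logical array;
% arr(:) flattens the 4-by-2 cell so that sum yields a single number.
hello_count = sum(strcmp(arr(:), 'hello'));

disp(hello_count);
```

Like the regexp version, this counts one string that is known in advance; it does not by itself find which string is the most common.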