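Match a string as long as it does not occur within another string

Date: 2013-05-22 18:28:19

Tags: regex matlab

Short question

Is it possible to use a regular expression to match strings only where they do not occur inside other, longer strings? (I am using MATLAB.)

Example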

shortString = {'hello', 'there'};
longString = {'why hello there', 'hello, my friend', 'hello and goodbye'};
wholeString = ['"why hello there", said one guy. The other guy over ' ...
'there said hello, my friend. hello and goodbye, said the ' ...
'third guy. So let''s all say hello!'];

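In this example, I would like to match the elements of shortString in wholeString, as long as they do not occur in wholeString as a substring of an element of longString.

In the example above, I only want to match the 'hello' in 'So let''s all say hello!' and the 'there' in 'The other guy over there'.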
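
One possible way to get this behaviour, shown here only as a sketch, is to let the longer strings consume the text first: put the longString elements ahead of the shortString elements in a single alternation, then keep only the matches that are short strings. The names esc, pattern, tok, startIdx, keep, matches and starts are introduced for illustration, and the sketch assumes that no long string begins in the middle of a short-string match.

```matlab
shortString = {'hello', 'there'};
longString  = {'why hello there', 'hello, my friend', 'hello and goodbye'};
wholeString = ['"why hello there", said one guy. The other guy over ' ...
    'there said hello, my friend. hello and goodbye, said the ' ...
    'third guy. So let''s all say hello!'];

% Escape regex metacharacters so every string is matched literally.
esc = @(c) cellfun(@(s) regexptranslate('escape', s), c, ...
                   'UniformOutput', false);

% Long strings are listed first and matches cannot overlap, so a short
% string that sits inside a matched long string is consumed by it.
pattern = strjoin([esc(longString), esc(shortString)], '|');
[tok, startIdx] = regexp(wholeString, pattern, 'match', 'start');

% Keep only the matches that are short strings.
keep    = ismember(tok, shortString);
matches = tok(keep);       % matched short strings, in order of appearance
starts  = startIdx(keep);  % their start positions in wholeString
```

For the example above, matches should contain 'there' (from 'The other guy over there') followed by 'hello' (from 'So let''s all say hello!').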

Specific question

I am writing a function that marks up the help comments at the top of a function so that they are ready to be published with MATLAB's publish function (see Publishing Markup). Besides marking up the function's syntax, I would like to mark up any of the function's inputs/outputs wherever they occur outside the syntax. Suppose I have somehow extracted the help comments from the function. Here are some example comments.

% SYNTAX
%    x = somefunction(y)
%    [out1, out2] = somefunction(in1, in2)
%
% DESCRIPTION
%    x = somefunction(y) does something to y and returns x. I also dont wan't
%       your answer to match the 'y' in 'your' or ''y''.
%
%    [out1, out2] = somefunction(in1, in2) does something else with in1 and 
%       in2. Then it returns out1 and out2.

After marking up these comments with my current approach, I have the following

%% Syntax
%        x = somefunction(y)
%        [out1, out2] = somefunction(in1, in2)
%
%% Description
% |x = somefunction(y)| does something to y and returns x. I also dont wan't
% your answer to match the 'y' in your or ''y''.
%
% |[out1, out2] = somefunction(in1, in2)| does something else with in1 and 
% in2. Then it returns out1 and out2.

I would like each input and output argument in the Description section to appear in monospace. I would like the final text to be

%% Syntax
%        x = somefunction(y)
%        [out1, out2] = somefunction(in1, in2)
%
%% Description
% |x = somefunction(y)| does something to |y| and returns |x|. I also dont wan't
% your answer to match the 'y' in 'your', or ''y''.
%
% |[out1, out2] = somefunction(in1, in2)| does something else with |in1| and 
% |in2|. Then it returns |out1| and |out2|.

The problem is that I cannot just match the input arguments, or I will also match them inside the syntax sections, which are already monospaced.

I have available a cell array containing each syntax definition, a cell array containing each input argument, and a cell array containing each output argument.

1 Answer:

Answer 0 (score: 1)

This is a bit ad hoc, and I think it could be improved, but suppose your input is:

txt = ['% SYNTAX' char(13)...
'%    x = somefunction(y)' char(13)...
'%    [out1, out2] = somefunction(in1, in2)' char(13)...
'%' char(13)...
'% DESCRIPTION' char(13)...
'%    x = somefunction(y) does something to y and returns x. I also dont wan''t' char(13)...
'%       your answer to match the ''y'' in ''your'' or ''''y''''.' char(13)...
'%' char(13)...
'%    [out1, out2] = somefunction(in1, in2) does something else with in1 and ' char(13)...
'%       in2. Then it returns out1 and out2.'];

inout  = {'y', 'in1', 'in2', 'x', 'out1', 'out2'}';

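Use a regular expression with lookahead and lookbehind operators. Each of the inputs/outputs must be enclosed by characters from \s,\.\!\?;:% with an alphanumeric character on the far side of those delimiters, or it must sit at the beginning (useless?) or at the end of the text: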

expr = strcat('(?<=\w[\s,\.\!\?;:%]+|^)', inout,'(?=[\s,\.\!\?;:]+\w|\.?$)');
regexprep(txt,expr,'|$&|')

The result is:

ans =
% SYNTAX
%    x = somefunction(y)
%    [out1, out2] = somefunction(in1, in2)
%
% DESCRIPTION
%    x = somefunction(y) does something to |y| and returns |x|. I also dont wan't
%       your answer to match the 'y' in 'your' or ''y''.
%
%    [out1, out2] = somefunction(in1, in2) does something else with |in1| and 
%       |in2|. Then it returns |out1| and |out2|.
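
A possible variant, offered only as a sketch: the same lookbehind and lookahead can wrap a single alternation of all the names, so the text is scanned in one pass and every lookaround is evaluated against the original text. This may matter when two names are separated only by punctuation (as in "in1, in2"), because `|` is not in the delimiter sets and a marker inserted by an earlier replacement could otherwise block a later match. The sample line and the function name f are assumptions made for illustration.

```matlab
inout = {'y', 'in1', 'in2', 'x', 'out1', 'out2'};

% Illustrative sample line (an assumption, not taken from the question).
txt = 'x = f(in1, in2) adds in1, in2 and returns x.';

% Join all argument names into one alternation: y|in1|in2|x|out1|out2
names = strjoin(inout, '|');

% Same delimiter sets as above, wrapped around a non-capturing group.
expr = ['(?<=\w[\s,\.\!\?;:%]+|^)(?:' names ')(?=[\s,\.\!\?;:]+\w|\.?$)'];

% $0 stands for the entire match.
out = regexprep(txt, expr, '|$0|');
```

On the sample line, the arguments inside the call `f(in1, in2)` and the leading `x =` are left alone, while `in1`, `in2` and the final `x` in the prose are wrapped in `|`.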

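Alternatively

In several steps, you could match the start and end positions of the syntax, then retrieve the positions of the inputs/outputs, and exclude from the replace operation those positions that fall inside the syntax.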
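
The steps above can be sketched roughly as follows. This is a minimal illustration, not the answerer's code: the sample text, the whole-word pattern that rejects neighbouring quotes, and the variable names (synPat, argPat, inside, out) are all assumptions.

```matlab
% Illustrative data (assumptions, shortened from the question's example).
txt    = ['x = somefunction(y) does something to y and returns x. ' ...
          'The ''y'' in ''your'' stays untouched.'];
syntax = {'x = somefunction(y)'};
inout  = {'y', 'x'};

% Step 1: start/end positions of every literal occurrence of the syntax.
synPat = strjoin(cellfun(@(s) regexptranslate('escape', s), syntax, ...
                         'UniformOutput', false), '|');
[synStart, synEnd] = regexp(txt, synPat, 'start', 'end');

% Step 2: positions of the arguments as whole words that do not touch
% a quote character.
argPat = ['(?<![\w''])(?:' strjoin(inout, '|') ')(?![\w''])'];
[argStart, argEnd] = regexp(txt, argPat, 'start', 'end');

% Step 3: discard arguments that lie inside a syntax span.
inside = false(size(argStart));
for k = 1:numel(synStart)
    inside = inside | (argStart >= synStart(k) & argEnd <= synEnd(k));
end
argStart = argStart(~inside);
argEnd   = argEnd(~inside);

% Step 4: wrap the remaining arguments, working from the end of the text
% backwards so that earlier indices stay valid.
out = txt;
for k = numel(argStart):-1:1
    out = [out(1:argStart(k)-1) '|' out(argStart(k):argEnd(k)) '|' ...
           out(argEnd(k)+1:end)];
end
```

With the sample text, only the `y` and `x` in the descriptive sentence are wrapped; the ones inside `x = somefunction(y)` and the quoted 'y' are skipped.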