Question

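Some lines in my text file contain NA, and I want to delete them. When I use isempty(strfind(l,'NA')), it also removes lines whose strings merely contain NA as a substring, for example 'RNASE' and 'GNAS'.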

For example:

0.552353744371678    NA

0.0121476193502138   ANG;RNASE

0.189489997218949    GNAS

0.0911820441646675    MYCL1

Output:

0.0911820441646675     MYCL1

Expected output:

0.0121476193502138   ANG;RNASE

0.189489997218949    GNAS

0.0911820441646675    MYCL1

Answer 1

I don't know how to find the following with a single regular expression:

"NA that does not have any alphanumeric character before or after".

I mean, if you knew there would always be at least one other character before and after it, that would be easy:

ind = regexp(str, '[^A-Za-z_]NA[^A-Za-z_]'); % Or something similar, depending on what exactly can and cannot be there.

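However, this pattern requires a character on both sides, so it will not match an "NA" that stands alone or sits at the very start or end of the string.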

That said, I am almost certain that a suitable regular expression exists; I just don't know it :)
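One candidate for such an expression uses lookaround assertions, which MATLAB's regexp supports. This is only a sketch: it reuses the character class from the pattern above, and the names rows, pat, hasNA and kept are illustrative, not from the question. Note that it drops any line containing a standalone NA, even if the same line also contains an embedded one.

```matlab
% Sample lines taken from the question.
rows = {'0.552353744371678    NA', ...
        '0.0121476193502138   ANG;RNASE', ...
        '0.189489997218949    GNAS', ...
        '0.0911820441646675    MYCL1'};

% NA that is neither preceded nor followed by a letter or underscore.
% Lookarounds consume no characters, so NA at the start or end also matches.
pat = '(?<![A-Za-z_])NA(?![A-Za-z_])';

% With a cell array input and 'once', regexp returns one cell per line,
% holding the first match index or [] when there is no match.
hasNA = ~cellfun(@isempty, regexp(rows, pat, 'once'));

kept = rows(~hasNA);   % lines without a standalone NA
```

If digits should also count as part of a name, the class could be widened to [A-Za-z0-9_].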

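What I would do is this (assuming strl holds a single line of text that you are deciding whether to keep or delete, possibly containing several NAs):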

ind = regexp(strl, 'NA'); % Start indices of every NA in the string.
removestr = ~isempty(ind); % A line without any NA is always kept.
for i = 1 : length(ind)
   k = ind(i); % Start of the current NA; it ends at position k+1.
   % The NA is standalone if it touches the string boundary or a
   % non-name character on each side.
   if (k == 1 || any(regexp(strl(k-1), '[^A-Za-z_]'))) ...
      && (k+1 == length(strl) || any(regexp(strl(k+2), '[^A-Za-z_]')))
      disp('This is maybe the string to remove - if there are no wrong NA''s later')
   else
      removestr = false;
      break; % Stop checking in this loop; this string is to keep.
   end
end
if (removestr)
   disp('Remove string')
end

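The condition in the if is somewhat overkill and slow, but it should work. If you don't need to check for several NAs in one line, just omit the for loop.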
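If the lines always look like the sample in the question (fields separated by whitespace, names separated by semicolons), a regex-free comparison of whole tokens may be simpler. This is a sketch under that format assumption, and hasStandaloneNA is a hypothetical helper name:

```matlab
% Split a line on runs of whitespace or semicolons and test whether
% any resulting token is exactly 'NA' (assumed format: see lead-in).
hasStandaloneNA = @(l) any(strcmp(regexp(strtrim(l), '[\s;]+', 'split'), 'NA'));

r1 = hasStandaloneNA('0.552353744371678    NA');         % standalone NA
r2 = hasStandaloneNA('0.0121476193502138   ANG;RNASE');  % NA only inside RNASE
r3 = hasStandaloneNA('0.189489997218949    GNAS');       % NA only inside GNAS
```

Because tokens are compared as whole strings, 'RNASE' and 'GNAS' never match 'NA'.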
