Question

I have the following string:

nothing to match
<-
this rocks should match as should this still and this rocks and still
->
should not match still or rocks
<- no matches here ->

I want to find all matches of 'rocks' and 'still', but only when they are within <- ->.

The purpose is to mark up glossary words, but only within text regions defined by the editor.

I currently have:

<-.*?(rocks|still).*?->

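Unfortunately, this only matches the first "rocks" and ignores all subsequent instances and every "still".

I have this in Rubular.

Using this would look something like: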

Regexp.new('<-.*?(' + self.all.map { |gt| gt.name }.join('|') + ').*?->', Regexp::IGNORECASE | Regexp::MULTILINE)

Thanks in advance for any help.

Answer 1

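There may be a way to do this with a single regex, but it is simpler to do it in two steps. First match all the markup regions, then search each region for the glossary words: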

text = <<END
nothing to match
<-
this rocks should match as should this still and this rocks and still
->
should not match still or rocks
<- no matches here ->
END

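# Step 1: grab each <- ... -> region (/m lets . match newlines).
text.scan(/<-.*?->/m).each do |match|
  # Step 2: list every glossary word found inside that region.
  p match.scan(/rocks|still/)
end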
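
Since the stated goal is to mark glossary words rather than only list them, the same two-step idea can be sketched with nested gsub calls. The glossary array and the [word] marker below are placeholders assumed for illustration; the question does not say how a word should be marked.

```ruby
text = <<~TEXT
  nothing to match
  <-
  this rocks should match as should this still and this rocks and still
  ->
  should not match still or rocks
  <- no matches here ->
TEXT

# Hypothetical glossary; in the question it comes from self.all.map { |gt| gt.name }.
glossary = %w[rocks still]

# Regexp.union escapes each word; .source is interpolated so the /i flag still applies.
word_regex = /\b(?:#{Regexp.union(glossary).source})\b/i

# The outer gsub visits each <- ... -> region; the inner gsub wraps the words in it.
marked = text.gsub(/<-.*?->/m) do |region|
  region.gsub(word_regex) { |word| "[#{word}]" }
end

puts marked
```

Text outside the regions is returned unchanged because the outer gsub never passes it to the block.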

Also note that a regex is only a good solution if there is never any nested markup (<-...<-...->...->) and no escaped <- or ->, whether inside or outside the markup.
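
Under those same assumptions (balanced markers, no nesting, no escaping), a single-regex variant can be sketched with a lookahead: a word counts only if a "->" can be reached from it without first crossing a "<-". This is one possible pattern, not something taken from the answer above, and it rescans the rest of the region for every candidate word, so it may be slow on large inputs.

```ruby
text = <<~TEXT
  nothing to match
  <-
  this rocks should match as should this still and this rocks and still
  ->
  should not match still or rocks
  <- no matches here ->
TEXT

# (?!<-). consumes one character at a time, refusing to step onto a "<-".
# The lookahead therefore succeeds only when the next marker ahead is "->".
inside_markers = /\b(?:rocks|still)\b(?=(?:(?!<-).)*?->)/m

matches = text.scan(inside_markers)
p matches # prints ["rocks", "still", "rocks", "still"]
```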

Answer 2

Don't forget your Ruby string methods. Use them before reaching for regular expressions.

$ ruby -0777 -ne '$_.split("->").each { |x| x.split("<-").each { |y| puts y if y[/rocks.*still/] } }' file
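
The one-liner above prints whole chunks, and it tests the text before each "<-" as well as the text after it. A sketch of the same split-based idea that returns only the glossary words found after a "<-" might look like this. The helper name words_inside_markers is hypothetical, and the sketch assumes every "<-" is eventually closed by a "->".

```ruby
text = <<~TEXT
  nothing to match
  <-
  this rocks should match as should this still and this rocks and still
  ->
  should not match still or rocks
  <- no matches here ->
TEXT

# Hypothetical helper (not from the original answer): returns the glossary
# words found between "<-" and "->", using String#split only.
def words_inside_markers(text, glossary)
  text.split("->").flat_map do |chunk|
    # Everything before "<-" lies outside a region; keep only what follows it.
    _outside, inside = chunk.split("<-", 2)
    next [] if inside.nil?

    # Whitespace split; words with attached punctuation would be missed.
    inside.split.select { |word| glossary.include?(word) }
  end
end

found = words_inside_markers(text, %w[rocks still])
p found
```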

Answer 3

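In Ruby, it depends on what you want to do with the regexp. You are matching a regular expression against a string, so you will be using String methods. Some of them act on all matches (for example gsub or scan); others act on only one match (for example index and =~ find the first one, rindex the last).

If you are using any of the latter (those that return only one match), you will need a loop that calls the method again starting from some offset. For example: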

# A method to print the indices of all matches
def print_match_indices(string, regex)
  # index searches forward from the given offset (rindex searches backward)
  i = string.index(regex, 0)
  until i.nil?
    puts i
    i = string.index(regex, i + 1)
  end
end

(Yes, you could use split first, but I expect a regex loop like the one above would need fewer system resources.)
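
Applied to the question, the offset loop can be restricted to each <- ... -> region. The following is a rough sketch, assuming balanced, non-nested markers; the method name match_offsets_inside_markers is made up for illustration.

```ruby
text = <<~TEXT
  nothing to match
  <-
  this rocks should match as should this still and this rocks and still
  ->
  should not match still or rocks
  <- no matches here ->
TEXT

# Hypothetical helper: collects the start offset of every word match that
# lies between a "<-" and the next "->".
def match_offsets_inside_markers(text, word_regex)
  offsets = []
  region_start = text.index("<-")
  while region_start
    region_end = text.index("->", region_start)
    break if region_end.nil? # unclosed region: stop scanning

    # Walk forward through the matches, stopping at the end of the region.
    i = text.index(word_regex, region_start)
    while i && i < region_end
      offsets << i
      i = text.index(word_regex, i + 1)
    end
    region_start = text.index("<-", region_end)
  end
  offsets
end

offsets = match_offsets_inside_markers(text, /rocks|still/)
offsets.each { |o| puts "#{o}: #{text[o, 5]}" }
```

Returning offsets rather than the words themselves keeps the positions available for a later step that inserts markup into the original string.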

Ruby Regexp - matching multiple results within markup

3 answers: