Question

I wrote this regexp to convert a string into an HTML tag. It matches [img foo] plus an optional third argument (left or right). For example, [img foo left].

/\[img (\S+)(\sleft|\sright)?\]/

But it also matches these tags inside Markdown inline code and code blocks. So

````
  [img foo] # matches, but should not (it's inside a markdown code block)
````
`[img foo]` # matches but should not match (inline code)

I have the same problem when looking up the reference. Here is the full method:

  def custom_image_tag(text)

    # look for image tag
    text.gsub(/\[img (\S+)(\sleft|\sright)?\]/) do
      id, css = $1, $2

    # check if second argument is a link
      # if yes use it in image tag
      if id =~ /http(s)?:\/\//
        image_tag id.strip, class: css

      # if no search doc to see if its value matches a reference
      # For example, [img foo] will match "[foo]: whatever.com"
      else
        text.match(/\[(#{id})\]: (.*)/) do |match|  # Same issue here
          image_tag match[2].strip, class: css
        end
      end
    end
  end

Is there a way to add exceptions, or to add an escape sequence? What is the best way to solve this?

Here is a Rubular playground: http://rubular.com/r/b9ClAE6Rhj

Answer 1

You can avoid matching tags inside quoted code if you match the quoted code first, in preference to the tag.

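# Group 1 captures a fenced or inline code span, so the tag's own groups become $2 and $3.
quoted = /(```[^`]*```|`[^`]*`)/m
tagged = /\[img (\S+)(\sleft|\sright)?\]/
text.gsub(Regexp.union(quoted, tagged)) do
  if $1 then $1 else   # code span matched: put it back unchanged
    ...
  end
end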
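
As a rough, self-contained illustration of this "match the code first" idea, the sketch below runs outside Rails. The helper `render_img` and the method name `convert_tags` are hypothetical stand-ins (the question uses Rails' `image_tag`), and only backtick-delimited code is handled, not indented code blocks or the `[foo]: url` reference lookup.

```ruby
# Hypothetical stand-in for Rails' image_tag, so the sketch runs on its own.
def render_img(src, css)
  css ? %(<img src="#{src}" class="#{css}">) : %(<img src="#{src}">)
end

# Group 1: a fenced (```...```) or inline (`...`) code span.
CODE = /(```[^`]*```|`[^`]*`)/
# After Regexp.union these become groups 2 (source) and 3 (left/right).
TAG = /\[img (\S+)(?:\s(left|right))?\]/

def convert_tags(text)
  text.gsub(Regexp.union(CODE, TAG)) do
    if $1
      $1                  # code span: return it unchanged
    else
      render_img($2, $3)  # real tag outside code: replace it
    end
  end
end

puts convert_tags("[img a.png left] and `[img foo]`")
```

Because the code-span alternative comes first in the union, any tag that sits inside backticks is consumed as part of the code span and never reaches the tag branch.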

Or, if you want to keep the regex from getting complicated, you should use StringScanner. With it, you can put each regex in its own branch of an (els)if chain.
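
The StringScanner approach might look something like the sketch below. It is only one possible arrangement: `render_img` and `convert_tags_with_scanner` are hypothetical names (the former stands in for Rails' `image_tag`), and indented code blocks and reference lookups are left out.

```ruby
require "strscan"

# Hypothetical stand-in for Rails' image_tag, so the sketch runs on its own.
def render_img(src, css)
  css ? %(<img src="#{src}" class="#{css}">) : %(<img src="#{src}">)
end

def convert_tags_with_scanner(text)
  scanner = StringScanner.new(text)
  out = +""
  until scanner.eos?
    if scanner.scan(/```[^`]*```/)        # fenced code block: copy verbatim
      out << scanner.matched
    elsif scanner.scan(/`[^`]*`/)         # inline code: copy verbatim
      out << scanner.matched
    elsif scanner.scan(/\[img (\S+)(?:\s(left|right))?\]/)
      out << render_img(scanner[1], scanner[2])
    elsif scanner.scan(/[^`\[]+/)         # run of ordinary text
      out << scanner.matched
    else
      out << scanner.getch                # stray "`" or "[" that matched nothing
    end
  end
  out
end

puts convert_tags_with_scanner("[img a.png right] but not `[img foo]`")
```

Each pattern stays small and readable, and adding another exception (for example, a different kind of quoting) is just one more `elsif` branch placed before the tag branch.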

Adding exceptions to a regex pattern in Ruby
