Regex: parsing key-value pairs with quotes

Time: 2014-05-12 13:49:43

Tags: ruby regex

I'm trying to parse key-value expressions in Ruby, in the following format:

foo-bar:bar foo:"bar 1" "bar 2":"foo 3" "bar 2 \"var\"":"foo 3"

It should yield:

Key: foo-bar     Value: bar
Key: foo         Value: bar 1
Key: bar 2       Value: foo 3
Key: bar 2 "var" Value: foo 3

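Is this possible with a regular expression? Keys and values can each be either an unquoted string without spaces or a quoted string that may contain spaces.

Here is what I have so far: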

([a-zA-Z0-9\-]+|\"[a-zA-Z0-9\-\s]+)\"\s*\:\s*([a-zA-Z0-9\-]+|\"[a-zA-Z0-9\-\s]+\")

4 answers:

Answer 0 (score: 3)

This should solve most of your problems:

("(?:\\.|[^"])*"|[^\s]*):\s*("(?:\\.|[^"])*"|[^\s]*)

A more refined option is:

(?:"((?:\\.|[^"])*)"|([^\s]*)):\s*(?:"((?:\\.|[^"])*)"|([^\s]*))

This one captures the text without the surrounding quotes. In Ruby it looks like this:

string = 'foo-bar:bar foo:"bar 1" "bar : 2":"foo \" 3" "bar 2 \"var\"":"foo 3"'

string.scan(/(?:"((?:\\.|[^"])*)"|([^\s]*)):\s*(?:"((?:\\.|[^"])*)"|([^\s]*))/).map(&:compact)
# => [["foo-bar", "bar"], ["foo", "bar 1"], ["bar : 2", "foo \\\" 3"], ["bar 2 \\\"var\\\"", "foo 3"]]
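Note that the captured strings above still contain the backslashes that escaped the inner quotes. A minimal sketch of a post-processing step that removes them, assuming every backslash is meant to escape the character that follows it (`PAIR`, `unescape` and `parse_pairs` are hypothetical names introduced here, not part of the answer):

```ruby
# Assumption: PAIR is the "more refined" pattern from this answer, copied verbatim.
PAIR = /(?:"((?:\\.|[^"])*)"|([^\s]*)):\s*(?:"((?:\\.|[^"])*)"|([^\s]*))/

# Hypothetical helper: drops the backslash in front of every escaped character.
def unescape(token)
  token.gsub(/\\(.)/, '\1')
end

# Hypothetical helper: returns [key, value] pairs with escapes resolved.
def parse_pairs(input)
  input.scan(PAIR).map { |groups| groups.compact.map { |token| unescape(token) } }
end

string = 'foo-bar:bar foo:"bar 1" "bar 2 \"var\"":"foo 3"'
parse_pairs(string)
# => [["foo-bar", "bar"], ["foo", "bar 1"], ["bar 2 \"var\"", "foo 3"]]
```

The last key is now the 11-character string `bar 2 "var"`; the backslashes in the printed result are only how `inspect` displays the quotes.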

Answer 1 (score: 0)

Try this:

/([^:]*):\s*("[^"]*"|[^\s]*)/

You still need to strip the quotes yourself, and it does not handle escaped quotes inside values.
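A minimal sketch of that cleanup step, assuming input without escaped quotes (`LOOSE_PAIR` and `strip_quotes` are hypothetical names introduced here). Because `[^:]*` also swallows the whitespace before a key, each token is stripped before its quotes are removed:

```ruby
# Assumption: LOOSE_PAIR is the pattern from this answer, copied verbatim.
LOOSE_PAIR = /([^:]*):\s*("[^"]*"|[^\s]*)/

# Hypothetical helper: trims surrounding whitespace, then one pair of enclosing quotes.
def strip_quotes(token)
  token.strip.sub(/\A"(.*)"\z/m, '\1')
end

input = 'foo-bar:bar foo:"bar 1" "bar 2":"foo 3"'
pairs = input.scan(LOOSE_PAIR).map { |key, value| [strip_quotes(key), strip_quotes(value)] }
# => [["foo-bar", "bar"], ["foo", "bar 1"], ["bar 2", "foo 3"]]
```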

Answer 2 (score: 0)

s = 'foo-bar:bar foo:"bar 1" "bar 2":"foo 3" "bar 2 \"var\"":"foo 3"'

s.scan(/(?<!\\)"((?:[^"]|\\")*)(?<!\\)"|([^\s:]+)/).flatten.compact
  .each_slice(2).to_h
# =>
# {
#   "foo-bar"           => "bar",
#   "foo"               => "bar 1",
#   "bar 2"             => "foo 3",
#   "bar 2 \\\"var\\\"" => "foo 3"
# }

Answer 3 (score: 0)

If you are open to a non-regex solution, this very simple parser should do the job:

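def kv(s)
  state = :unquoted
  result = [""]

  s.each_char do |c|
    if state == :unquoted
      case c
      when ':', ' '
        # a separator ends the current token, unless that token is still empty
        if result.last.length > 0
          result << ""
        end
      when '"'
        state = :quoted
      else
        # write
        result.last << c
      end
    elsif state == :quoted
      case c
      when '"'
        # the closing quote ends the current token
        result << ""
        state = :unquoted
      when '\\'
        state = :escaped
      else
        # write
        result.last << c
      end
    elsif state == :escaped
      # write the escaped character literally, then return to the quoted state
      result.last << c
      state = :quoted
    end
  end
  # drop the empty placeholder left behind by a closing quote or trailing separator;
  # an unquoted final value is kept
  result.pop if result.last.empty?
  Hash[*result]
end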

Test:

s = 'foo-bar:bar foo:"bar 1" "bar 2":"foo 3" "bar 2 \"var\"":"foo 3"'
kv s # => {"foo-bar"=>"bar", "foo"=>"bar 1", "bar 2"=>"foo 3", "bar 2 \"var\""=>"foo 3"}
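For comparison, the same tokenizing idea can be sketched with `StringScanner` from Ruby's standard library instead of a hand-written state machine. This is a swapped-in technique, not the method used in this answer, and `kv_scan` is a hypothetical name. It assumes that colons and whitespace only act as separators outside quotes and that a backslash inside quotes escapes the next character:

```ruby
require 'strscan'

# Hypothetical helper: splits the input into tokens, then pairs them up as key => value.
def kv_scan(input)
  scanner = StringScanner.new(input)
  tokens = []
  until scanner.eos?
    if scanner.skip(/[\s:]+/)
      # separators between tokens carry no data
      next
    elsif scanner.scan(/"((?:\\.|[^"\\])*)"/)
      # quoted token: keep the inside and resolve backslash escapes
      tokens << scanner[1].gsub(/\\(.)/, '\1')
    elsif scanner.scan(/[^\s:"]+/)
      # unquoted token: runs until whitespace, a colon or a quote
      tokens << scanner[0]
    else
      # e.g. an unterminated quote
      raise ArgumentError, "unexpected input at offset #{scanner.pos}"
    end
  end
  tokens.each_slice(2).to_h
end

s = 'foo-bar:bar foo:"bar 1" "bar 2":"foo 3" "bar 2 \"var\"":"foo 3"'
kv_scan(s)
# => {"foo-bar"=>"bar", "foo"=>"bar 1", "bar 2"=>"foo 3", "bar 2 \"var\""=>"foo 3"}
```

Unlike a single regex over the whole pair, this reports malformed input (such as a missing closing quote) with an error instead of silently skipping it.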