Question

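I have the following search term:

login:17639 email:fakemail@gmail.com ref:co-10000 common_name:testingdomain organization:'Internet Company'

The term is derived from a params variable, where everything to the left of a : is a filter criterion and everything to the right of the : is that criterion's value. What I want to do is break the term into keys and values and build a hash from them. This is the end goal:

search_filters = { login:17639, email:'fakemail@gmail.com', etc, etc, }

I have been playing with split, gsub, and tr to get these values, but I am having trouble with the organization field. This is what I have so far:

term.gsub(/'/,'').tr(':', ' ').split(" ")

Basically, there are many other variations like the above. The problem is the organization field. Every attempt, term.gsub(":") and similar included, has the same problem: "Internet Company" gets split apart. I can't just add a simple if/else for this one filter to glue the pieces back together, because there are many more filters to handle. Is there a way to simply split the filter criteria on the colon? Thanks.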
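
For illustration, here is a minimal sketch of why the attempt above falls short. The variable name term follows the question; the sample string is the search term quoted there.

```ruby
term = "login:17639 email:fakemail@gmail.com ref:co-10000 common_name:testingdomain organization:'Internet Company'"

# Stripping the quotes and turning every colon into a space loses the
# information about which spaces were inside a quoted value.
tokens = term.gsub(/'/, '').tr(':', ' ').split(" ")

# The last three tokens show the organization value broken in two.
p tokens.last(3)
#=> ["organization", "Internet", "Company"]
```

With an odd number of tokens, the list can no longer be paired up cleanly into keys and values.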

Answer 1

Here is an example of how to get started:

def splart(input)
  input.scan(/([^:]+):('[^']*'|"[^"]*"|\S+)/).to_h
end

This will tease out the data you need. After that, you may need to clean it up.
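
As for the cleanup: with the pattern above, every key after the first keeps the space that preceded it (" email", " ref", and so on), and quoted values keep their quotes. One possible cleaned-up variant is sketched below. The method name parse_filters is hypothetical, and matching the key with a lazy \S+? instead of [^:]+ is an assumption that keys never contain whitespace.

```ruby
# Hypothetical helper, not part of the original answer.
# Assumes keys contain no whitespace and values contain no nested quotes.
def parse_filters(input)
  pairs = input.scan(/(\S+?):('[^']*'|"[^"]*"|\S+)/)
  pairs.map do |key, value|
    # Symbolize the key and drop one leading and one trailing quote, if any.
    [key.to_sym, value.gsub(/\A['"]|['"]\z/, "")]
  end.to_h
end

filters = parse_filters("login:17639 email:fakemail@gmail.com organization:'Internet Company'")
p filters
```

Every value is still a string at this point, so numeric fields such as login would need a separate conversion.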

Answer 2

str = "login:17639 email:fakemail@gmail.com ref:co-10000 common_name:testingdomain organization:'Internet Company'"

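If, as in the example, the strings that become keys (after conversion to symbols) contain no spaces or single quotes (an assumption that is relaxed later), you can do the following: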

r = /(?:\s+|\A)(\S+):/
a1 = str.split(r)
  #=> ["", "login", "17639", "email", "fakemail@gmail.com", "ref", "co-10000",
  #    "common_name", "testingdomain", "organization", "'Internet Company'"] 
a2 = a1.drop(1).map.with_index { |s,i| i.even? ? s.to_sym : s }
  #=> [:login, "17639", :email, "fakemail@gmail.com", :ref, "co-10000",
  #    :common_name, "testingdomain", :organization, "'Internet Company'"] 
h = a2.each_slice(2).to_h
  #=> {:login=>"17639", :email=>"fakemail@gmail.com", :ref=>"co-10000",
  #    :common_name=>"testingdomain", :organization=>"'Internet Company'"}

These steps can of course be chained together:

h = str.split(r).drop(1).map.with_index { |s,i| i.even? ? s.to_sym : s }.
        each_slice(2).to_h

Finally,

h[:login] = h[:login].to_i
h #=> {:login=>17639, :email=>"fakemail@gmail.com", :ref=>"co-10000",
  #    :common_name=>"testingdomain", :organization=>"'Internet Company'"}
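
If more than one filter can hold an integer, the conversion could be generalized instead of being applied to :login alone. The following is only a sketch under the assumption that any value made up entirely of digits should become an Integer; the variable names are illustrative.

```ruby
h = { login: "17639", email: "fakemail@gmail.com", ref: "co-10000" }

# Convert values consisting solely of digits; leave everything else alone.
converted = h.transform_values { |v| v.match?(/\A\d+\z/) ? v.to_i : v }

p converted
```

Here "co-10000" stays a string because it is not made up of digits only.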

We can make the regular expression self-documenting by writing it in free-spacing mode:

r = /
    (?:    # begin a non-capture group
      \s+  # match > 0 whitespaces
      |    # or
      \A   # match the beginning of the string
    )      # end non-capture group
    (\S+)  # match > 0 non-whitespace characters in capture group 1
    :      # match a colon
    /x     # free-spacing regex definition mode

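Recall that when using String#split, if part of the pattern being split on is inside a capture group, the contents of the capture group are included in the returned array.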
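
A small sketch of that String#split behaviour, using a made-up string rather than the search term:

```ruby
# Without a capture group the separators are discarded.
plain = "a1b2c".split(/\d/)
p plain     #=> ["a", "b", "c"]

# With a capture group the captured separators are kept in the result.
captured = "a1b2c".split(/(\d)/)
p captured  #=> ["a", "1", "b", "2", "c"]
```

This is why the keys show up in the array between the values, and why the leading empty string has to be dropped.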

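If the strings that become keys (after conversion to symbols) may also be enclosed in single quotes, the regular expression can be modified as follows: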

r = /
    (?:        # begin a non-capture group
      \s+      # match > 0 whitespaces
      |        # or
      \A       # match the beginning of the string
    )          # end non-capture group    
    (?:        # begin a non-capture group
      '        # match a single quote
      ([^':]+) # match > 0 chars that are neither ' nor : in cap group 1
      '        # match a single quote
      |        # or              
      ([^':]+) # match > 0 chars that are neither ' nor : in cap group 2
    )        
    :          # match a colon
    /x         # free-spacing regex definition mode

str = "login:17639 'r ef':co-10000 organization:'Internet Company'"

str.split(r).drop(1).map.with_index { |s,i| i.even? ? s.to_sym : s }.
             each_slice(2).to_h
  #=> {:login=>"17639", :"r ef"=>"co-10000", :organization=>"'Internet Company'"}

How to split a search term by ":"
