Cleanest Ruby code to split a string with specific rules

Asked: 2011-01-21 15:50:35

Tags: ruby regex arrays hash

Imagine an array like this

[
"A definition 1: this is the definition text",
"A definition 2: this is some other definition text",
"B definition 3: this could be: the definition text"
]

I want to end up with the following hash

hash = {
:A => ["A definition 1", "this is the definition text", "A definition 2", "this is some other definition text"], 
:B => ["B definition 3", "this could be: the definition text"]
}

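I am building a glossary: a hash with one entry per letter of the alphabet, each holding an array of definitions.

I'm very new to Ruby, so what I have looks quite inelegant, and I'm struggling with the regular expression for splitting each line on the colon so that the third line is split only at the first occurrence.

Thanks!

Edit: here is what I have so far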

def self.build(lines)
  alphabet = Hash.new()

  lines.each do |line|
    strings = line.split(/:/)
    letter = strings[0][0,1].upcase
    alphabet[letter] = Array.new if alphabet[letter].nil?
    alphabet[letter] << strings[0]
    alphabet[letter] << strings[1..(strings.size-1)].join.strip
  end
  alphabet
end
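
The colon problem in the attempt above can be illustrated with a short sketch. Splitting on every colon and then joining the pieces drops the later colons, whereas passing a limit of 2 to String#split stops after the first one. The variable names here are illustrative only and are not part of the original post.

```ruby
line = "B definition 3: this could be: the definition text"

# Splitting on every colon yields three pieces; joining them back
# without a separator loses the second colon.
pieces   = line.split(/:/)
rejoined = pieces[1..-1].join.strip

# A limit of 2 splits only at the first colon, so the rest of the
# line (including any further colons) stays intact.
term, definition = line.split(':', 2)
definition = definition.strip

puts rejoined
puts term
puts definition
```

With the limit in place, the join step is no longer needed at all.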

2 answers:

Answer 0 (score: 4)

Assuming raw_definitions is your input:

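# Missing keys get a fresh array that is stored in the hash on first access.
sorted_defs = Hash.new { |hash, key| hash[key] = Array.new }

raw_definitions.each do |d|
  # $1 = first letter, $2 = rest of the term up to the first colon, $3 = definition
  d.match(/^([a-zA-Z])(.*?):(.*)$/)
  # Note: the keys are strings ("A"), not symbols (:A).
  sorted_defs[$1.upcase] << $1 + $2
  sorted_defs[$1.upcase] << $3.strip
end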
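
For comparison, here is a minimal sketch of the same grouping that uses the limit argument of String#split instead of a regex with capture groups, and that produces symbol keys as in the hash the question asks for. The method name build_glossary and the choice to skip lines without a colon are assumptions made for this illustration; they do not come from either answer.

```ruby
# Hypothetical helper (name chosen for this sketch only).
def build_glossary(lines)
  # Each missing key gets its own new array on first access.
  glossary = Hash.new { |hash, key| hash[key] = [] }

  lines.each do |line|
    # Split at the first colon only; later colons stay in the definition.
    term, definition = line.split(':', 2)
    next if definition.nil? # assumption: ignore lines that contain no colon

    letter = term[0, 1].upcase.to_sym
    glossary[letter] << term.strip << definition.strip
  end

  glossary
end

lines = [
  "A definition 1: this is the definition text",
  "A definition 2: this is some other definition text",
  "B definition 3: this could be: the definition text"
]

glossary = build_glossary(lines)
p glossary
```

Using term[0, 1] rather than term[0] keeps the sketch independent of how a given Ruby version treats single-index string access.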

Answer 1 (score: 2)

Just for fun, here is a purely functional alternative:

defs = [
  "A definition 1: this is the definition text",
  "A definition 2: this is some other definition text",
  "B definition 3: this could be: the definition text"
]

hash = Hash[
  defs.group_by{ |s| s[0].to_sym }.map do |sym,strs|
    [ sym, strs.map{ |s| s[2..-1].split(/\s*:\s*/,2) }.flatten ]
  end
]

require 'pp'
pp hash
#=> {:A=>
#=>   ["definition 1",
#=>    "this is the definition text",
#=>    "definition 2",
#=>    "this is some other definition text"],
#=>  :B=>["definition 3", "this could be: the definition text"]}

A not-purely-functional variant with the same result:

hash = defs.group_by{ |s| s[0].to_sym }.tap do |h|
  h.each do |sym,strs|
    h[sym] = strs.map{ |s| s[2..-1].split(/\s*:\s*/,2) }.flatten
  end 
end

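Note that because of the use of s[0].to_sym, these solutions only work in Ruby 1.9 (in 1.8, s[0] returns an integer character code rather than a one-character string); to make them work in 1.8.7 you have to change it to s[0,1].to_sym. To make the first solution run in 1.8.6 you would additionally need to replace Hash[ xxx ] with Hash[ *xxx.flatten ]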
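
As a rough sketch for newer interpreters, the same group-then-split idea can be written with Hash#transform_values and Enumerable#flat_map. This assumes Ruby 2.4 or later, which is well beyond the versions discussed in the answer. Unlike the answer's code, this variant keeps the leading letter in each term, so it matches the hash shown in the question.

```ruby
defs = [
  "A definition 1: this is the definition text",
  "A definition 2: this is some other definition text",
  "B definition 3: this could be: the definition text"
]

# Group by first letter, then split every line at its first colon only
# and flatten the [term, definition] pairs into a single array per letter.
hash = defs
  .group_by { |s| s[0, 1].to_sym }
  .transform_values { |strs| strs.flat_map { |s| s.split(/\s*:\s*/, 2) } }

p hash
```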