Question

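I am trying to use Ruby to return the indexes of all occurrences of a specific character in a string. An example string is "a#asg#sdfg#d##", and the expected return when searching for # characters is [1,5,10,12,13]. The following code does the job, but surely there is a simpler way?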

def occurrences(line)

  index = 0
  all_index = []

  line.each_byte do |x|
    if x == '#'.ord then  # '#'[0] is 35 only on Ruby 1.8; '#'.ord works on 1.9+
      all_index << index
    end
    index += 1
  end

  all_index
end

Answer 1

s = "a#asg#sdfg#d##"
a = (0 ... s.length).find_all { |i| s[i,1] == '#' }

Answer 2

require 'enumerator' # Needed in 1.8.6 only
"1#3#a#".enum_for(:scan,/#/).map { Regexp.last_match.begin(0) }
#=> [1, 3, 5]

ETA: This works by creating an enumerator that uses scan(/#/) as its each method.

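scan yields each occurrence of the specified pattern (in this case /#/), and inside the block you can call Regexp.last_match to access the MatchData object for that match.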

MatchData#begin(0) returns the index where the match begins, and since we call map on the enumerator, we get back an array of those indices.
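The same mechanism can be sketched without an enumerator by passing a block to scan directly. The method name match_offsets is hypothetical and only here for illustration; the behaviour it relies on (Regexp.last_match being set for each match inside the block) is the one described above.

```ruby
# Hypothetical helper: collect the start offset of every match of pattern.
def match_offsets(str, pattern)
  offsets = []
  # scan calls the block once per match; Regexp.last_match is the MatchData
  # for the match currently being yielded.
  str.scan(pattern) { offsets << Regexp.last_match.begin(0) }
  offsets
end

match_offsets("a#asg#sdfg#d##", /#/)
```

This trades the one-liner for an explicit accumulator, which may be easier to read if the enum_for idiom is unfamiliar.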

Answer 3

Here's a less fancy way:

# x is the string to search
i = -1
all = []
while i = x.index('#',i+1)
  all << i
end
all

In a quick test, this was about 3.3x faster than FM's find_all method and about 2.5x faster than sepp2k's enum_for method.
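For anyone who wants to repeat that comparison, here is a rough sketch using the standard Benchmark module. The helper names, the sample string, and the iteration count are all assumptions made for illustration, not the setup used for the numbers quoted above, so the ratios you see may differ.

```ruby
require 'benchmark'

# Hypothetical wrappers, one per approach from the answers above.
def via_find_all(s)
  (0...s.length).find_all { |i| s[i, 1] == '#' }
end

def via_enum_for(s)
  s.enum_for(:scan, /#/).map { Regexp.last_match.begin(0) }
end

def via_index_loop(s)
  i = -1
  all = []
  # String#index returns nil once there are no more matches, ending the loop.
  while (i = s.index('#', i + 1))
    all << i
  end
  all
end

SAMPLE = 'a#asg#sdfg#d##' * 1_000 # arbitrary input size, an assumption

Benchmark.bm(12) do |bm|
  bm.report('find_all')   { 20.times { via_find_all(SAMPLE) } }
  bm.report('enum_for')   { 20.times { via_enum_for(SAMPLE) } }
  bm.report('index loop') { 20.times { via_index_loop(SAMPLE) } }
end
```

Timings depend heavily on the Ruby version and on how dense the matches are, so treat any single run as indicative only.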

Answer 4

Here's a long method chain:

"a#asg#sdfg#d##".
  each_char.
  each_with_index.
  inject([]) do |indices, (char, idx)|
    indices << idx if char == "#"
    indices
  end

# => [1, 5, 10, 12, 13]

Requires 1.8.7+
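A slightly shorter variation on the same chain, offered as a sketch rather than a recommendation: select keeps the [char, index] pairs that match, and map(&:last) extracts the index from each pair. The method name char_indices is hypothetical.

```ruby
# Hypothetical helper: indexes at which char occurs in str.
def char_indices(str, char)
  str.each_char
     .each_with_index                      # yields [character, index] pairs
     .select { |c, _idx| c == char }       # keep only the matching pairs
     .map(&:last)                          # the index is the second element
end

char_indices("a#asg#sdfg#d##", "#")
```

The leading-dot chaining style used here needs Ruby 1.9 or later; on 1.8.7 the dots would have to trail each line as in the chain above.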

Answer 5

Another solution, derived from FMc's answer:

s = "a#asg#sdfg#d##"
q = []
s.length.times {|i| q << i if s[i,1] == '#'}

I love that Ruby never has only one way of doing something!

Answer 6

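Here's a solution for massive strings. I'm doing text finds on 4.5MB text strings, and the other solutions ground to a halt. This takes advantage of the fact that Ruby's .split is very efficient compared to string comparisons.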

def indices_of_matches(str, target)
  cuts = (str + (target.hash.to_s.gsub(target,''))).split(target)[0..-2]
  indices = []
  loc = 0
  cuts.each do |cut|
    loc = loc + cut.size
    indices << loc
    loc = loc + target.size
  end
  return indices
end

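Basically, this uses the horsepower behind the .split method, then uses the separated pieces and the length of the search string to work out the locations. I've gone from 30 seconds with various other methods to instantaneous on extremely large strings.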

I'm sure there's a better way to do it, but:

(str + (target.hash.to_s.gsub(target,'')))

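appends something to the end of the string in case the target occurs at the very end (because of the way split works), while also making sure that the "random" addition does not contain the target itself.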

indices_of_matches("a#asg#sdfg#d##","#")
=> [1, 5, 10, 12, 13]
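One possible alternative to the appended suffix, sketched under assumptions: String#split accepts a negative limit, which keeps trailing empty pieces, so a match at the very end of the string still yields a piece. The method name indices_via_split is hypothetical, and the sketch assumes target is a non-empty string other than a single space (which split treats specially). It has not been timed against the version above.

```ruby
# Hypothetical variant of indices_of_matches without the appended suffix.
# Assumes target is non-empty and is not a single space.
def indices_via_split(str, target)
  raise ArgumentError, 'target must not be empty' if target.empty?

  # A negative limit keeps trailing empty pieces. The final piece is whatever
  # follows the last match, so it is dropped.
  pieces = str.split(target, -1)[0..-2]
  loc = 0
  pieces.map do |piece|
    loc += piece.size   # advance to the start of the next match
    index = loc
    loc += target.size  # skip past the match itself
    index
  end
end

indices_via_split("a#asg#sdfg#d##", "#")
```

Because nothing is appended, there is no need to worry about the suffix accidentally containing the target.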
