Extract repeated elements from an array (bad value for range error)

Date: 2017-09-03 17:19:47

Tags: arrays ruby

I want to extract all the elements between the tags ":desc:" and ":/desc" from this array:

array = ["hello", ":desc:", "claire", "et", "concise", ":/desc:",
         ":desc:", "claire", "caca", "concise", "test", ":/desc:"]

so that I end up with

new_array = [[":desc:", "claire", "et", "concise", ":/desc:"],
             [":desc:", "claire", "caca", "concise", "test", ":/desc:"]]

I tried

final_array = []

start_element = ':desc:'
end_element = ':/desc:'

while array.any?
  final_array << array.slice!(array.find_index(start_element)..array.find_index(end_element))
end

But it obviously doesn't work, because I get a bad value for range error.

3 answers:

Answer 0 (score: 4)

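There are a couple of problems here. Judging by your example array, the end element looks like ':/desc' rather than ':/desc:' (that is, without the trailing colon). This may just be a typo in the question.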

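The main problem is that after the two slices are removed, the array is not empty (it still contains the "hello" that preceded the first start_element). This means the array.any? condition is still true even when find_index(start_element) can no longer find a matching element. In that case find_index returns nil, which leads to a "no implicit conversion from nil to integer" error when slice! is called.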
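The state that triggers the failure can be reproduced in isolation. This is a minimal sketch, assuming the array has already been reduced to ["hello"] as described above; the variable name leftover is only illustrative.

```ruby
# What remains of the array once both slices have been removed.
leftover = ["hello"]
start_element = ':desc:'

# The loop condition is still satisfied, because "hello" is truthy...
still_looping = leftover.any?

# ...but there is no start marker left, so find_index returns nil.
start_index = leftover.find_index(start_element)

puts still_looping.inspect
puts start_index.inspect
```

With start_index being nil, no valid integer range can be built for slice!, which is where the loop breaks down.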

If you know that your data always contains start_element and end_element in matching pairs, one approach is:

while start_index = array.find_index(start_element)
  end_index = array.find_index(end_element)
  final_array << array.slice!(start_index..end_index)
end 
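If matching pairs cannot be guaranteed, the same loop can be guarded. The following is only a sketch under that assumption: the helper name extract_slices is hypothetical, it works on a copy so the caller's array is left intact, and it simply stops at the first start marker that has no end marker after it.

```ruby
# Hypothetical helper: collects every start..end slice without mutating the input.
def extract_slices(array, start_element, end_element)
  remaining = array.dup
  slices = []
  while (start_index = remaining.find_index(start_element))
    # Search for the end marker only from the start marker onwards.
    offset = remaining[start_index..-1].find_index(end_element)
    break if offset.nil? # unmatched start marker: stop instead of raising
    slices << remaining.slice!(start_index..(start_index + offset))
  end
  slices
end

array = ["hello", ":desc:", "claire", "et", "concise", ":/desc",
         ":desc:", "claire", "caca", "concise", "test", ":/desc"]
p extract_slices(array, ':desc:', ':/desc')
```

Whether stopping silently is the right reaction to an unmatched marker depends on the data; raising an error there would be an equally reasonable choice.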

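When you run into this kind of error in the future, some good old puts debugging helps. In this case, inspect the two indices and the remaining contents of the array: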

while array.any?
  start_index = array.find_index(start_element)
  end_index = array.find_index(end_element)
  puts "#{start_index}..#{end_index}"
  final_array << array.slice!(start_index..end_index)
  puts array.inspect
end

1..5
["hello", ":desc:", "claire", "caca", "concise", "test", ":/desc"]
1..6
["hello"]
..
TypeError: no implicit conversion from nil to integer
from (pry):146:in `slice!'

Answer 1 (score: 2)

You can also use a combination of Enumerable#slice_after and Enumerable#drop_while:

array.slice_after(':/desc').map { |e| e.drop_while { |i| i != ':desc:' } }
#=> [[":desc:", "claire", "et", "concise", ":/desc"],
#    [":desc:", "claire", "caca", "concise", "test", ":/desc"]]
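The one-liner above relies on the array ending with an end marker. As a hedged sketch of how the same slice_after and drop_while combination might cope with trailing or unterminated elements, the hypothetical helper slices_between below keeps only chunks that actually end with the end marker and discards chunks that never contained a start marker.

```ruby
# Hypothetical helper built on Enumerable#slice_after and Array#drop_while.
def slices_between(array, start_tag, end_tag)
  array
    .slice_after(end_tag)                                      # cut after each end marker
    .select { |chunk| chunk.last == end_tag }                  # drop an unterminated tail
    .map { |chunk| chunk.drop_while { |item| item != start_tag } }
    .reject(&:empty?)                                          # chunk had no start marker
end

array = ["hello", ":desc:", "a", ":/desc", "wanda", ":desc:", "b", ":/desc", "herb"]
p slices_between(array, ':desc:', ':/desc')
#=> [[":desc:", "a", ":/desc"], [":desc:", "b", ":/desc"]]
```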

Answer 2 (score: 1)

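I assume the goal is to extract subarrays that begin with ":desc:", end with ":/desc", and contain no other instance of ":/desc". Note that if arr = [":desc:", ":desc:", ":/desc"], then [arr] is returned. I make no assumptions about the structure of the array (though I have not tested every possibility). If certain assumptions were made (for example, that the pairs are matched and non-overlapping), simplifications would be possible.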

Code

def extract(arr, target_start, target_end)
  arr.select { |s| (s == target_start)..(s == target_end) ? true : false }.
      slice_when { |s,t| [s, t] == [target_end, target_start] }.
      to_a.
      tap { |a| a.pop unless a.last.last == target_end }
end

Examples

target_start = ":desc:"
target_end = ":/desc"

arr = ["hello", ":desc:", "claire", "et", "concise", ":/desc",
       ":desc:", "claire", "caca", "concise", "test", ":/desc"]
extract(arr, target_start, target_end)
  #=> [[":desc:", "claire", "et", "concise", ":/desc"],
  #    [":desc:", "claire", "caca", "concise", "test", ":/desc"]]

arr = ["hello", ":desc:", "claire", "et", "concise", ":/desc", "wanda",
       ":desc:", "claire", "caca", "concise", "test", ":/desc", "herb"]
extract(arr, target_start, target_end)
  # => [[":desc:", "claire", "et", "concise", ":/desc"],
  #     [":desc:", "claire", "caca", "concise", "test", ":/desc"]]

arr = ["hello", ":desc:", "claire", "et", "concise", ":/desc",
       ":desc:", "claire", "caca", "concise", "test"]
extract(arr, target_start, target_end)
  #=> [[":desc:", "claire", "et", "concise", ":/desc"]]

arr = ["hello", ":desc:", "claire", "et", "concise", ":desc:", "claire",
        "caca", "concise", "test"]
extract(arr, target_start, target_end)
  #=> []
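One input the examples above do not cover is an array with no target_start at all. In that case the select step yields an empty array, a.last is nil inside the tap block, and calling last on nil raises NoMethodError. The following is a minimal sketch of a guarded variant, assuming Ruby 2.3 or later for the safe navigation operator (&.); the name extract_safe is hypothetical and not part of the original answer.

```ruby
# Hypothetical variant of extract; only the guard in the tap block differs.
def extract_safe(arr, target_start, target_end)
  arr.select { |s| (s == target_start)..(s == target_end) ? true : false }
     .slice_when { |s, t| [s, t] == [target_end, target_start] }
     .to_a
     .tap { |a| a.pop unless a.last&.last == target_end } # a.last is nil when a is empty
end

p extract_safe(["hello", "world"], ":desc:", ":/desc")
#=> []
```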

Explanation

Consider

arr = ["hello", ":desc:", "claire", "et", "concise", ":/desc",
       ":desc:", "claire", "caca", "concise", "test"]

together with target_start and target_end as given in the examples. The steps are as follows.

b = arr.select { |s| (s == target_start)..(s == target_end) ? true : false }
  #=> [":desc:", "claire", "et", "concise", ":/desc", ":desc:", "claire",
  #    "caca", "concise", "test"]

The first step uses Ruby's flip-flop operator to return an array containing all elements of arr except those before the first ":desc:" and those between each ":/desc" and the next ":desc:".
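For readers unfamiliar with the flip-flop operator, the same selection can be written with the on/off state spelled out. This is only an illustrative sketch; the method name select_between and the variable inside are hypothetical.

```ruby
# Same selection as the flip-flop step, with the state kept in a plain boolean.
def select_between(arr, target_start, target_end)
  inside = false
  arr.select do |s|
    inside = true if s == target_start # switch on at a start marker
    keep = inside                      # start and end markers are both kept
    inside = false if s == target_end  # switch off once an end marker is seen
    keep
  end
end

p select_between(["x", ":desc:", "a", ":/desc", "y", ":desc:", "b"], ":desc:", ":/desc")
#=> [":desc:", "a", ":/desc", ":desc:", "b"]
```

The flip-flop version is more compact; the explicit version makes it easier to see that an unterminated trailing run is still selected at this stage.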

Next we use Enumerable#slice_when (new in Ruby v2.2) to produce an enumerator that slices b as required, and then convert that enumerator to an array.

c = b.slice_when { |s,t| [s, t] == [target_end, target_start] }
   #=> #<Enumerator: #<Enumerator::Generator:0x00000001dd4f18>:each>
d = c.to_a
   #=> [[":desc:", "claire", "et", "concise", ":/desc"],
   #    [":desc:", "claire", "caca", "concise", "test"]]
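As a standalone illustration of how Enumerable#slice_when decides where to cut, here is a small sketch on made-up numeric data (not from the question): the block receives each adjacent pair, and a new slice starts wherever the block returns true.

```ruby
numbers = [1, 2, 4, 9, 10, 11]

# Start a new slice whenever the next number is not the successor of the previous one.
runs = numbers.slice_when { |a, b| b != a + 1 }.to_a

p runs
#=> [[1, 2], [4], [9, 10, 11]]
```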

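The last step is to remove the last array of d if it does not end with ":/desc", which is the case here. We could use Array#pop, but not on its own, because pop returns the popped element, and the method would then return that value. If we use it inside an Object#tap block, however, all is well.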

d.tap { |a| a.pop unless a.last.last == target_end }
  #=> [[":desc:", "claire", "et", "concise", ":/desc"]]
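The difference between the two return values can be seen side by side. This is a small sketch on shortened, made-up data; the names popped and tapped are only illustrative.

```ruby
d = [[":desc:", "a", ":/desc"], [":desc:", "b"]]

popped = d.dup.pop               # pop returns the element it removed
tapped = d.dup.tap { |a| a.pop } # tap returns the receiver after the block has run

p popped #=> [":desc:", "b"]
p tapped #=> [[":desc:", "a", ":/desc"]]
```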