Question

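I have some text and want to extract all substrings between @begin = 'Some Text1' and @end = 'Some Text2'. Regular expressions make this task more complicated than it needs to be. Is there a simple function for this in Ruby?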

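def substrings(text, begin_str, end_str)
  # returns an array of the substrings found between begin_str and end_str
  # (begin and end are reserved words in Ruby, so they cannot be parameter names)
end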

Answer 1

It pains me to do this when a regular expression makes it so easy, but if you must, here is a non-regex solution:

str = "Now is the time @begin to see @end where @begin things @end stand."

append = nil
str.split.each_with_object([]) do |word, arr|
  case word
  when "@begin"
    append = [] unless append
  when "@end"
    arr << append unless append.nil? || append.empty?
    append = nil
  else
    append << word if append
  end
end.map { |arr| arr.join(' ') }
  #=> ["to see", "things"]

The steps:

append = nil
b = str.split
  #=> ["Now", "is", "the", "time", "@begin", "to", "see", "@end", "where",
  #    "@begin", "things", "@end", "stand."] 
c = b.each_with_object([]) do |word, arr|
  puts "word=#{word}, arr=#{arr}, append=#{append ? append : 'nil'}"
  case word
  when "@begin"
    unless append
      append = []
      puts "  append set to []"
    end
  when "@end"
    puts "  #{arr} << #{append}" unless append.nil? || append.empty?
    arr << append unless append.nil? || append.empty?
    append = nil
    puts "  append set to nil"
  else
    append << word if append
    puts "  '#{ word }' #{ append ? "added to append: append=#{append}" : "skipped" }"
  end
end
  #=> [["to", "see"], ["things"]]
c.map { |arr| arr.join(' ') }
  #=> ["to see", "things"]

Printed messages:

word=Now, arr=[], append=nil
  'Now' skipped
word=is, arr=[], append=nil
  'is' skipped
word=the, arr=[], append=nil
  'the' skipped
word=time, arr=[], append=nil
  'time' skipped
word=@begin, arr=[], append=nil
  append set to []
word=to, arr=[], append=[]
  'to' added to append: append=["to"]
word=see, arr=[], append=["to"]
  'see' added to append: append=["to", "see"]
word=@end, arr=[], append=["to", "see"]
  [] << ["to", "see"]
  append set to nil
word=where, arr=[["to", "see"]], append=nil
  'where' skipped
word=@begin, arr=[["to", "see"]], append=nil
  append set to []
word=things, arr=[["to", "see"]], append=[]
  'things' added to append: append=["things"]
word=@end, arr=[["to", "see"]], append=["things"]
  [["to", "see"]] << ["things"]
  append set to nil
word=stand., arr=[["to", "see"], ["things"]], append=nil
  'stand.' skipped

Note how unmatched and nested markers are handled:

str = "I @begin to see @end where @begin things @end stand @begin to reason."
  #=> ["to see", "things"]
str = "I @begin to see @end where @end and @begin things @end stand to reason."
  #=> ["to see", "things"]
str = "I @begin to see @begin where @end and things @end stand to reason."
  #=> ["to see where"]
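The word-by-word approach above can be wrapped in a method shaped like the one the question asks for. This is only a minimal sketch: the name substrings_by_word and its parameter names are hypothetical, and it assumes the markers appear in the text as whitespace-separated words.

```ruby
# Hypothetical wrapper around the word-by-word approach above.
# Assumes the markers appear as whitespace-separated words in the text.
def substrings_by_word(text, begin_word, end_word)
  current = nil   # words collected since the opening marker; nil when outside a pair
  results = []
  text.split.each do |word|
    if word == begin_word
      current ||= []            # a nested opening marker is ignored
    elsif word == end_word
      results << current.join(' ') if current && !current.empty?
      current = nil             # an unmatched closing marker is ignored
    elsif current
      current << word
    end
  end
  results                       # an unclosed trailing pair is dropped
end

str = "Now is the time @begin to see @end where @begin things @end stand."
p substrings_by_word(str, "@begin", "@end")  # prints ["to see", "things"]
```

Because the text is split on whitespace and rejoined with single spaces, the original spacing between the words is not preserved.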

Answer 2

You can do this with String#index and a loop:

def substrings(text, begin_string, end_string)
  offset = 0
  strings = []
  while start_offset = text.index(begin_string, offset)
    contents_offset = start_offset + begin_string.size
    end_offset = text.index(end_string, contents_offset)
    break unless end_offset  # stop if there is no closing delimiter
    strings << text[contents_offset...end_offset]
    offset = end_offset + end_string.size
  end
  strings
end

str = "1(2)34(5)()"
p substrings(str, "(", ")")  # => ["2", "5", ""]

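As you can see, Cary Swoveland and I came up with different answers. His answer treats whitespace specially and splits on it. Since your question provides no sample input and output, it is hard to tell which answer is better.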
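To make the difference concrete, here is a sketch that runs the String#index approach on the sample string from the other answer. The method name substrings_by_index is hypothetical, and the guard for a missing closing delimiter is my own assumption about how that case should behave.

```ruby
# Self-contained copy of the String#index approach (hypothetical name),
# with an assumed guard for a missing closing delimiter.
def substrings_by_index(text, begin_string, end_string)
  offset = 0
  strings = []
  while (start_offset = text.index(begin_string, offset))
    contents_offset = start_offset + begin_string.size
    end_offset = text.index(end_string, contents_offset)
    break unless end_offset     # no closing delimiter: stop searching
    strings << text[contents_offset...end_offset]
    offset = end_offset + end_string.size
  end
  strings
end

str = "Now is the time @begin to see @end where @begin things @end stand."
raw = substrings_by_index(str, "@begin", "@end")
p raw               # prints [" to see ", " things "]
p raw.map(&:strip)  # prints ["to see", "things"]
```

The index-based version returns everything between the delimiters verbatim, including the surrounding spaces, so on this input the two answers agree only after stripping whitespace.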

How to extract substrings without regular expressions
