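Regular expression with a repeating pattern

Date: 2015-06-12 23:16:38

Tags: ruby regex

I'm having trouble with a regular expression in Ruby 1.8.7.

The string looks like this (a conversation whose message pattern repeats over and over):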

recipients john smith (12345) albert martin (78348734) author john smith (12345) sent 2014-02-04 07:35:32 utc body hi what is your name recipients john smith (12345) albert martin (78348734) author albert martin (78348734) sent 2014-02-04 07:35:53 utc body my name is albert recipients john smith (12345) albert martin (78348734) author john smith (12345) sent 2014-02-04 07:36:57 utc body my name is john

I need to split the string into the following matches (keep in mind that the conversation can go on; the word "recipients" is the key):

recipients john smith (12345) albert martin (78348734) author john smith (12345) sent 2014-02-04 07:35:32 utc body hi what is your name 

recipients john smith (12345) albert martin (78348734) author albert martin (78348734) sent 2014-02-04 07:35:53 utc body my name is albert 

recipients john smith (12345) albert martin (78348734) author john smith (12345) sent 2014-02-04 07:36:57 utc body my name is john

2 answers:

Answer 0: (score: 1)

Use String#split, which splits the string on the word "recipients". For more examples, see http://www.dotnetperls.com/split-ruby

input = "recipients john smith (12345) albert martin (78348734) author john smith (12345) sent 2014-02-04 07:35:32 utc body hi what is your name recipients john smith (12345) albert martin (78348734) author albert martin (78348734) sent 2014-02-04 07:35:53 utc body my name is albert recipients john smith (12345) albert martin (78348734) author john smith (12345) sent 2014-02-04 07:36:57 utc body my name is john"
values = input.split("recipients")
values.shift # drop the empty string that precedes the first "recipients"

When you use this array later, remember to add "recipients" back to each element, because split removes the delimiter.
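A minimal sketch of that last step, assuming a shortened sample string in the same shape as the one in the question. The variable name messages is only illustrative and is not part of the original answer:

```ruby
# Shortened sample; the real input has the same "recipients ..." structure.
input = "recipients john smith (12345) author john smith (12345) body hi " \
        "recipients john smith (12345) author albert martin (78348734) body my name is albert"

# split drops the delimiter and leaves an empty string in front of the
# first "recipients", so discard blank pieces and put the word back.
messages = input.split("recipients").
                 reject { |part| part.strip.empty? }.
                 map { |part| "recipients" + part }

messages.each { |message| puts message }
```

Joining the pieces back together (messages.join) reproduces the original string, which is a quick way to check that nothing was lost.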

Answer 1: (score: 0)

You can do this with a regular expression and String#scan, as follows (where str is your string):

    r = /
    recipients\s        # match text
    .+?                 # match one or more of any character, lazily (?)
    (?=recipients\s|\z) # positive look ahead for string or end of string
    /x                  # extended mode

    str.scan r
      #=> ["recipients john smith (12345) albert martin (78348734) author \
john smith (12345) sent 2014-02-04 07:35:32 utc body hi what is your name ",
      #    "recipients john smith (12345) albert martin (78348734) author \
albert martin (78348734) sent 2014-02-04 07:35:53 utc body my name is albert ",
      #    "recipients john smith (12345) albert martin (78348734) author \
john smith (12345) sent 2014-02-04 07:36:57 utc body my name is john"]
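As a rough cross-check, the same pieces can also be produced by splitting on a zero-width lookahead instead of scanning. This is a variation on the answer above, not part of it. The sketch assumes a shortened sample string, and the /m flag is an added assumption so that "." also matches newlines in case a body spans several lines:

```ruby
# Shortened sample in the same shape as the question's string.
str = "recipients john smith (12345) author john smith (12345) body hi " \
      "recipients john smith (12345) author albert martin (78348734) body my name is albert"

# Same idea as the pattern above, written on one line, with /m added.
r = /recipients\s.+?(?=recipients\s|\z)/m

by_scan = str.scan(r)

# Splitting on a lookahead consumes no characters, so every piece
# keeps its leading "recipients".
by_split = str.split(/(?=recipients\s)/)

puts by_scan == by_split
```

Both approaches treat every occurrence of "recipients" followed by whitespace as the start of a new message, so a message body that happens to contain that word would be cut in two.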