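Go regex to find a word with an apostrophe

Date: 2014-11-16 14:16:59

Tags: regex go

I am trying to find a substring between two words, but my starting word contains an apostrophe and I can't seem to match it.

For example, in the following sentence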

bus driver drove steady although the bus's steering was going nuts.

the correct result of my search should be:

steering was going nuts

and not:

driver ... nuts

I tried this:

re := regexp.MustCompile("(?s)bus[\\\'].*?nuts")

I also tried this:

re := regexp.MustCompile("(?s)bus'.*?nuts")

I can't seem to make it work.

2 answers:

Answer 0 (score: 2)

  

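The correct result of my search should be "steering was going nuts" ...

If you want that substring as the match result, adjust the regular expression accordingly: put the part you want in a capturing group and read it from the submatch slice.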

str := "bus driver drove steady although the bus's steering was going nuts."
re := regexp.MustCompile("(?s)bus's (.*?nuts)")
rm := re.FindStringSubmatch(str)
if len(rm) != 0 {
  fmt.Printf("%q\n", rm[0]) // "bus's steering was going nuts"
  fmt.Printf("%q",   rm[1]) // "steering was going nuts"
}

GoPlay
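
For completeness, a self-contained version of the snippet above might look like the following. It is a sketch rather than the answerer's exact program: the `extract` helper and the variable names are assumptions added for illustration. It also suggests that the apostrophe needs no escaping in a Go regular expression; the second pattern from the question already matches, it simply returns the whole match, including "bus's", because it has no capturing group.

```go
package main

import (
	"fmt"
	"regexp"
)

// re captures everything after "bus's " up to the first "nuts".
var re = regexp.MustCompile("(?s)bus's (.*?nuts)")

// extract is a hypothetical helper: it returns the first capture group,
// or ok == false when the pattern does not match.
func extract(s string) (string, bool) {
	m := re.FindStringSubmatch(s)
	if m == nil {
		return "", false
	}
	return m[1], true
}

func main() {
	str := "bus driver drove steady although the bus's steering was going nuts."

	// The second pattern from the question matches, but keeps the prefix.
	whole := regexp.MustCompile("(?s)bus'.*?nuts").FindString(str)
	fmt.Printf("%q\n", whole)

	// With a capture group, only the wanted substring is returned.
	if sub, ok := extract(str); ok {
		fmt.Printf("%q\n", sub)
	}
}
```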

Answer 1 (score: 0)

You can use a raw string literal (delimited by backquotes) to include the single quote, together with a capturing group:

re := regexp.MustCompile(`(?s)bus'.\s+(.*?nuts)`)
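
As a side note, the apostrophe itself does not require a raw string literal; raw literals mainly save you from doubling backslashes such as the one in `\s`. The sketch below compares the two spellings of the same pattern. The helper name `samePattern` is made up for this illustration.

```go
package main

import (
	"fmt"
	"regexp"
)

// samePattern reports whether two pattern strings compile to regular
// expressions with identical source text. It is an illustrative helper.
func samePattern(a, b string) bool {
	return regexp.MustCompile(a).String() == regexp.MustCompile(b).String()
}

func main() {
	raw := `(?s)bus'.\s+(.*?nuts)`          // raw literal: backslash kept as-is
	interpreted := "(?s)bus'.\\s+(.*?nuts)" // interpreted literal: backslash doubled

	fmt.Println(raw == interpreted)            // both literals hold the same string
	fmt.Println(samePattern(raw, interpreted)) // so they compile to the same regex
}
```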

See this example:

package main

import (
    "fmt"
    "regexp"
)

var sourceTxt = `bus driver drove steady although the bus's steering was going nuts.`

func main() {
    fmt.Printf("Experiment with regular expressions.\n")
    fmt.Printf("source text:\n")
    fmt.Println("--------------------------------")
    fmt.Printf("%s\n", sourceTxt)
    fmt.Println("--------------------------------")

    // a regular expression: "bus'", any one character (the "s"),
    // whitespace, then a capture group ending at the first "nuts"
    regex := regexp.MustCompile(`(?s)bus'.\s+(.*?nuts)`)
    fmt.Printf("regex: '%v'\n", regex)
    matches := regex.FindStringSubmatch(sourceTxt)
    // matches[0] is the whole match; the labels printed are i+1
    for i, v := range matches {
        fmt.Printf("match %2d: '%s'\n", i+1, v)
    }
}

Output:

Experiment with regular expressions.
source text:
--------------------------------
bus driver drove steady although the bus's steering was going nuts.
--------------------------------
regex: '(?s)bus'.\s+(.*?nuts)'
match  1: 'bus's steering was going nuts'
match  2: 'steering was going nuts'

FindStringSubmatch()

  

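returns a slice of strings holding the text of the leftmost match of the regular expression in s and the matches, if any, of its subexpressions

In the returned slice, index 0 is the whole match and index 1 is the first capturing group. The example above prints i+1, so the first capturing group appears as "match  2".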
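
To generalize the idea to arbitrary start and end words, one possible sketch is shown below. The `between` function is hypothetical (it is not part of the `regexp` package); it relies on `regexp.QuoteMeta` to escape any metacharacters in the delimiters and returns index 1 of the `FindStringSubmatch` result.

```go
package main

import (
	"fmt"
	"regexp"
)

// between returns the text that follows start, up to and including the
// nearest end. The name and signature are assumptions for this example.
func between(s, start, end string) (string, bool) {
	// QuoteMeta escapes regex metacharacters so the delimiters match literally.
	pattern := `(?s)` + regexp.QuoteMeta(start) + `\s*(.*?` + regexp.QuoteMeta(end) + `)`
	re, err := regexp.Compile(pattern)
	if err != nil {
		return "", false
	}
	m := re.FindStringSubmatch(s)
	if m == nil {
		return "", false
	}
	return m[1], true // m[0] is the whole match, m[1] the first capture group
}

func main() {
	src := "bus driver drove steady although the bus's steering was going nuts."
	if sub, ok := between(src, "bus's", "nuts"); ok {
		fmt.Printf("%q\n", sub)
	}
}
```

Compiling the pattern on every call keeps the sketch short; for repeated use with fixed delimiters it would be reasonable to compile once and reuse the `*regexp.Regexp`.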