Regex captures characters on the same line

Date: 2015-02-02 00:55:36

Tags: regex

I have this regex. When there is no space after the link, it also captures the content that follows on the same line.

The regex is:

(?:http\:\/\/)?(?:www\.)?ama?zo?n\.(?:com|ca|co\.uk|co\.jp|de|fr)/(?:exec/obidos/ASIN/|o/|gp/product/|(?:(?:[^"\'/]*)/)?dp/|)(B[A-Z0-9]{9})(?:(?:/|\?|\#)(?:[^"\'\s]*))?

My expected input is

[link](http://www.amazon.com/dp/B00CTUER1M)

Here is[a cool toy](http://www.amazon.com/dp/B00CTUER1M/ref=gb1h_img_e-4_8722_fb086345?smid=ATVPDKIKX0DER)!dddd fdsfsdfds

I want the output to be

[link](http://www.amazon.com/dp/B00CTUER1M?tag=affcode-20)

Here is[a cool toy](http://www.amazon.com/dp/B00CTUER1M?tag=affcode-20)!dddd fdsfsdfds

However, for the second one I get

Here is[a cool toy](http://amazon.com/dp/B00CTUER1M/?tag=affcode-20 fdsfsdfds

1 answer:

Answer 0 (score: 2)

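From the looks of it, you left the close paren ) out of the final negated character class. Without it, the class [^"'\s]* keeps matching past the link's closing ) and stops only at the next whitespace, so everything up to that point is replaced along with the URL.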

 # (?:http://)?(?:www\.)?ama?zo?n\.(?:com|ca|co\.uk|co\.jp|de|fr)/(?:exec/obidos/ASIN/|o/|gp/product/|(?:(?:[^"'/]*)/)?dp/|)(B[A-Z0-9]{9})(?:(?:/|\?|\#)(?:[^"'\s)]*))?

 (?: http:// )?
 (?: www \. )?
 ama? zo?n \.
 (?:
      com
   |  ca
   |  co \. uk
   |  co \. jp
   |  de
   |  fr 
 )
 /
 (?:
      exec/obidos/ASIN/
   |  o/
   |  gp/product/
   |  (?:
           (?: [^"'/]* )
           /
      )?
      dp/
   |  
 )
 ( B [A-Z0-9]{9} )             # (1)
 (?:
      (?: / | \? | \# )
      (?: [^"'\s)]* )               # <- Add ')' to negative class
 )?
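
To see the difference, here is a minimal Python sketch that runs both versions of the pattern over the second input line. The question does not show the replacement code, so the `add_tag` helper, the replacement template, and the `PREFIX`/`ORIGINAL`/`FIXED` names are assumptions made for illustration; only the two patterns come from the thread.

```python
import re

# Shared part of the pattern, up to and including the captured ASIN (group 1).
PREFIX = (
    r"""(?:http://)?(?:www\.)?ama?zo?n\.(?:com|ca|co\.uk|co\.jp|de|fr)/"""
    r"""(?:exec/obidos/ASIN/|o/|gp/product/|(?:(?:[^"'/]*)/)?dp/|)"""
    r"""(B[A-Z0-9]{9})"""
)

# Tail from the question: the negated class does not exclude ')'.
ORIGINAL = re.compile(PREFIX + r"""(?:(?:/|\?|\#)(?:[^"'\s]*))?""")
# Tail from the answer: ')' added to the negated class.
FIXED = re.compile(PREFIX + r"""(?:(?:/|\?|\#)(?:[^"'\s)]*))?""")


def add_tag(text, pattern=FIXED, tag="affcode-20"):
    """Rewrite each matched URL as a /dp/ link with an affiliate tag.

    The replacement template is an assumption, chosen to produce the
    output the question asks for.
    """
    return pattern.sub(
        lambda m: f"http://www.amazon.com/dp/{m.group(1)}?tag={tag}", text
    )


line = (
    "Here is[a cool toy](http://www.amazon.com/dp/B00CTUER1M/"
    "ref=gb1h_img_e-4_8722_fb086345?smid=ATVPDKIKX0DER)!dddd fdsfsdfds"
)

print(add_tag(line, ORIGINAL))  # swallows ")!dddd" along with the URL
print(add_tag(line, FIXED))     # stops at the closing ")"
```

With the original tail, the match runs through `)!dddd` and ends at the space, which is why the closing paren and the following word disappear. With `)` in the class, the match ends just before the paren and the rest of the line is left alone.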