How do I do a multiline search using regular expressions?

Time: 2009-10-22 10:43:12

Tags: .net regex multiline

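I am new to regular expressions.

I want to do a multiline search. Here is an example of what I am trying to do:

Suppose I have the following text: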

*Project #1:
CVC – Customer Value Creation (Sep 2007 – till now)
Time Warner Cable is the world's leading media and entertainment company, Time Warner Cable (TWC) makes coaxial quiver.
Client   : Time Warner Cable, US.
ETL Tool  : Informatica 7.1.4
Database  : Oracle 9i.
Role   : ETL Developer/Team Lead.
O/S   : UNIX.
Responsibilities:
Created Test Plan and Test Case Book.
Peer reviewed team members Mappings.
Documented Mappings.
Leading the Development Team.
Sending Reports to onsite.
Bug fixing for Defects, Data and Performance related.                                                                                                     
Project #2:
MYER – Sales Analysis system (Nov 2005 – till now)
            Coles Myer is one of Australia's largest retailers with more than 2,000 stores throughout Australia,
Client   : Coles Myer Retail, Australia.
ETL Tool  : Informatica 7.1.3
Database  : Oracle 8i.
Role   : ETL Developer.
O/S   : UNIX.
Responsibilities:
Extraction, Transformation and Loading of the data using Informatica.
Understanding the entire source system.                                                                                     
Created and Run Sessions and Workflows.
Created Sort files using Syncsort Application.*

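I want to write a regex that first tries to match the word "Project", which may be in lowercase or uppercase.

If "Project" matches, the regex should then try to match Client, Role, and Environment. If it matches any one of them, the match is complete. (Client, Role, and Environment may be on the same line as the word "Project" or on a different line.)

I wrote the following regex for this task: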

^((P|p)roject.*\s*.*((((E|e)nviornment)|((P|p)latform)|((R|r)ole(s)?)|((R|r)esponsibilit(y|ies))|((C|c)lient)|((C|c)ustomer)|((P|p)eriod)))

This regex matches Project #1 but does not match Project #2.

Can someone tell me what is wrong with this regex, or how to write a regex for this kind of text?

3 Answers:

Answer 0 (score: 2)

Try this:

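// [\s\S]*? lazily spans any characters, newlines included, up to the first keyword.
// IgnoreCase replaces the (P|p) alternations; Multiline lets ^ match at each line start.
Regex project = new Regex(
   @"^(Project [\s\S]*?" + 
   @"(Environment|Platform|Roles?|Responsibilit(y|ies)|Client|Customer|Period))",
   RegexOptions.ECMAScript | RegexOptions.IgnoreCase | RegexOptions.Multiline);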
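
A minimal sketch of how this pattern could be exercised, assuming a shortened copy of the sample text from the question (the `ProjectSearchDemo` class and the `sample` string are made up for illustration). In the question's pattern, `.*\s*.*` can only reach from one line into the next, which is presumably why Project #2 was missed: its first keyword, Client, sits a few lines below the heading. The lazy `[\s\S]*?` spans any number of lines and stops at the first keyword it meets.

```csharp
using System;
using System.Text.RegularExpressions;

class ProjectSearchDemo
{
    static void Main()
    {
        // Shortened, hypothetical version of the sample text from the question.
        string sample =
            "Project #1:\n" +
            "CVC - Customer Value Creation (Sep 2007 - till now)\n" +
            "Client   : Time Warner Cable, US.\n" +
            "Role   : ETL Developer/Team Lead.\n" +
            "Project #2:\n" +
            "MYER - Sales Analysis system (Nov 2005 - till now)\n" +
            "Coles Myer is one of Australia's largest retailers\n" +
            "Client   : Coles Myer Retail, Australia.\n" +
            "Role   : ETL Developer.\n";

        Regex project = new Regex(
            @"^(Project [\s\S]*?" +
            @"(Environment|Platform|Roles?|Responsibilit(y|ies)|Client|Customer|Period))",
            RegexOptions.ECMAScript | RegexOptions.IgnoreCase | RegexOptions.Multiline);

        foreach (Match m in project.Matches(sample))
        {
            // Groups[2] is the keyword alternation that ended the lazy scan.
            Console.WriteLine("Matched at index {0}, keyword: {1}",
                m.Index, m.Groups[2].Value);
        }
    }
}
```

With this sample it should report two matches, one per project heading, ending at "Customer" for Project #1 and at "Client" for Project #2.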

Answer 1 (score: 1)

In C#, you can pass the Multiline option as an argument to the Regex constructor:

Regex r = new Regex("(var matches = new Array\\([^\\)]*\\);)",  
          RegexOptions.IgnoreCase | RegexOptions.Compiled 
          | RegexOptions.Multiline);

For more code details, see the link: C# and Regex: How to extract strings between quotation marks
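
One detail worth keeping in mind: `RegexOptions.Multiline` only changes `^` and `$` so that they match at every line boundary. It does not let `.` cross a newline; that is what `RegexOptions.Singleline` does. A small sketch of the difference (the `text` string and the class name are made up for illustration):

```csharp
using System;
using System.Text.RegularExpressions;

class MultilineVsSingleline
{
    static void Main()
    {
        // Hypothetical three-line input.
        string text = "Project #2:\nMYER\nClient : Coles Myer";

        // Multiline: ^ matches at the start of every line, so "Client" is found.
        Console.WriteLine(Regex.IsMatch(text, @"^Client", RegexOptions.Multiline)); // True

        // Without it, ^ only matches at the very start of the input.
        Console.WriteLine(Regex.IsMatch(text, @"^Client")); // False

        // "." stops at "\n" unless Singleline is set.
        Console.WriteLine(Regex.IsMatch(text, @"Project.*Client")); // False
        Console.WriteLine(Regex.IsMatch(text, @"Project.*Client",
            RegexOptions.Singleline)); // True
    }
}
```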

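Answer 2 (score: 0)

Since you didn't specify a programming language, here is a common way of writing this:

/yourRegexpattern/m  <-- the m stands for multiline

You can also use

/yourRegexpattern/im <-- the i stands for case insensitivity

which removes the need for those (P|p) alternations.

In C#, you have to specify these flags in the Regex constructor; autocomplete will list them for you.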
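
As a rough illustration of how the `/im` flags could be written in C#, the sketch below passes them through `RegexOptions` and, alternatively, as the inline group `(?im)` at the start of the pattern. The `text` string and the `FlagsDemo` class are made up for the example.

```csharp
using System;
using System.Text.RegularExpressions;

class FlagsDemo
{
    static void Main()
    {
        // Hypothetical input with the keyword in two different casings.
        string text = "intro line\nPROJECT #1:\nproject #2:";

        // Flags passed via RegexOptions, the C# counterpart of /^project/im.
        Regex viaOptions = new Regex(@"^project",
            RegexOptions.IgnoreCase | RegexOptions.Multiline);

        // The same flags written inline at the start of the pattern.
        Regex viaInline = new Regex(@"(?im)^project");

        Console.WriteLine(viaOptions.Matches(text).Count); // 2
        Console.WriteLine(viaInline.Matches(text).Count);  // 2
    }
}
```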