Regex: match all X400 addresses on a line of other semicolon-delimited entries and replace the semicolons inside them

Time: 2016-10-10 18:10:39

Tags: regex powershell

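I am trying to parse an export of our company directory, but I am running into problems because of how the export handles semicolons. Each line of the export contains a user's distinguishedName, followed by one or more email addresses associated with that user (sip, smtp, x400). I have been trying to work out a regex that matches every x400 address on a line, so that I can replace the semicolons inside those addresses with commas.

An x400 address appears in this format: x400:c=us\;a= \;p=company\;o=Exchange\;s=lastName\;g=firstName\;

Replacing the semicolons only inside the X400 addresses would give me a line that can be split correctly on the remaining semicolons, so that I can parse the data further with a script. Here is my export data: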

CN=Doe\\, Jane,OU=Employee,OU=Production,OU=Users,DC=COMPANY,DC=LOC;sip:jdoe@company.com;smtp:jdoe@company-b.com;smtp:Jane.Doe@company.com;SMTP:JDoe@company.com;X400:c=us\;a= \;p=Company\;o=Exchange\;s=Doe\;g=Jane\;
CN=Smith\\, Mike,OU=Employee,OU=Production,OU=Users,DC=COMPANY,DC=LOC;sip:msmith@company.com;x400:c=us\;a= \;p=COMPANY\;o=Exchange\;s=Smith\;g=Mike\;;smtp:MSmith@company-b.com;smtp:Mike.Smith@company.com;X400:c=us\;a= \;p=COMPANY\;o=Exchange\;s=Smith\;g=Mike\;;SMTP:msmith@compnay.com;smtp:MmSmith@company.com;smtp:Mike.Smith@company.com;smtp:MSmith@company-b.com;smtp:Mike.Smith@company.com
CN=Jones\\, Barbara,OU=Employee,OU=Production,OU=Users,DC=COMPANY,DC=LOC;BJones@company.com;SMTP:BRJoenes@company.com;sip:BrJoes@company.com
CN=Bay\\, Matt,OU=Employee,OU=Production,OU=Users,DC=COMPANY,DC=LOC MBay@company.com;sip:MBay@company.com
CN=O'Connor\\, Sam,OU=Visitor,OU=Production,OU=Users,DC=COMPANY,DC=LOC;sip:SO'Connor@company.com;x400:c=us\;a= \;p=COMPANY\;o=Exchange\;s=O'Connor\;g=Sam\;;so'connor@company-b.com

I am looking for a regex replacement that makes the export data look like this:

CN=Doe\\, Jane,OU=Employee,OU=Production,OU=Users,DC=COMPANY,DC=LOC;sip:jdoe@company.com;smtp:jdoe@company-b.com;smtp:Jane.Doe@company.com;SMTP:JDoe@company.com;X400:c=us\,a= \,p=Company\,o=Exchange\,s=Doe\,g=Jane\,;
CN=Smith\\, Mike,OU=Employee,OU=Production,OU=Users,DC=COMPANY,DC=LOC;sip:msmith@company.com;x400:c=us\,a= \,p=COMPANY\,o=Exchange\,s=Smith\,g=Mike\,;smtp:MSmith@company-b.com;smtp:Mike.Smith@company.com;X400:c=us\,a= \,p=COMPANY\,o=Exchange\,s=Smith\,g=Mike\,;SMTP:msmith@compnay.com;smtp:MmSmith@company.com;smtp:Mike.Smith@company.com;smtp:MSmith@company-b.com;smtp:Mike.Smith@company.com
CN=Jones\\, Barbara,OU=Employee,OU=Production,OU=Users,DC=COMPANY,DC=LOC;BJones@company.com;SMTP:BRJoenes@company.com;sip:BrJoes@company.com
CN=Bay\\, Matt,OU=Employee,OU=Production,OU=Users,DC=COMPANY,DC=LOC MBay@company.com;sip:MBay@company.com
CN=O'Connor\\, Sam,OU=Visitor,OU=Production,OU=Users,DC=COMPANY,DC=LOC;sip:SO'Connor@company.com;x400:c=us\,a= \,p=COMPANY\,o=Exchange\,s=O'Connor\,g=Sam\,;so'connor@company-b.com

I am using PowerShell regular expressions.

3 answers:

Answer 0: (score: 1)

Use something like this:

... -replace 'x400:([a-z]*=.*?\\;)*(;|$)'
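The pattern above matches a whole X400 address, but the answer does not show a replacement string. One way to complete it, as a sketch rather than the answerer's own method, is to hand each match to a `MatchEvaluator` that rewrites only the semicolons inside it. The helper name `Convert-X400Separator` is hypothetical, and dropping the trailing `(;|$)` from the pattern is an assumption made so that the separator after the address stays in place.

```powershell
# Pattern adapted from the answer above. The trailing (;|$) is dropped (an
# assumption) so the separator that follows the address is not consumed.
$pattern = 'x400:(?:[a-z]*=.*?\\;)*'

# Hypothetical helper, for illustration only.
function Convert-X400Separator {
    param([string]$Line)

    $options   = [System.Text.RegularExpressions.RegexOptions]::IgnoreCase
    $evaluator = [System.Text.RegularExpressions.MatchEvaluator]{
        param($match)
        # Only the matched X400 address is rewritten; the rest of the line is untouched.
        $match.Value -replace ';', ','
    }

    [regex]::Replace($Line, $pattern, $evaluator, $options)
}

$sample = 'CN=Smith\\, Mike,OU=Users,DC=COMPANY,DC=LOC;sip:msmith@company.com;x400:c=us\;a= \;p=COMPANY\;o=Exchange\;s=Smith\;g=Mike\;;smtp:MSmith@company-b.com'
Convert-X400Separator -Line $sample
```

Calling `[regex]::Replace` with a script block cast to `MatchEvaluator` also works in Windows PowerShell 5.1, where `-replace` does not accept a script block as the replacement.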

Answer 1: (score: 1)

  

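"Replacing the semicolons only inside the X400 addresses would give me a line that can be split correctly, so that I can parse the data further with a script."

You could also take the X400 format into account when parsing the data: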

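Get-Content data.txt | ForEach-Object {
    # Split once on the first ';' to separate the DN from the address list
    $DN,$AddressString = $_ -split ';',2

    New-Object psobject -Property @{
        DistinguishedName = $DN
        # Split only on a ';' that is followed by a "type:" prefix (sip:, smtp:, x400:),
        # so the escaped '\;' inside an X400 address is left alone
        Addresses = $AddressString -split ';(?=\w+:)'
    }
}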
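To make the behaviour of the lookahead split concrete, here is a small sketch applied to a single line shaped like the sample data. The `Type`/`Value` breakdown is an illustrative addition, not part of the answer. An address that carries no `type:` prefix is not preceded by a split point, so it stays attached to the entry before it.

```powershell
$line = 'CN=Smith\\, Mike,OU=Users,DC=COMPANY,DC=LOC;sip:msmith@company.com;x400:c=us\;a= \;p=COMPANY\;o=Exchange\;s=Smith\;g=Mike\;;smtp:MSmith@company-b.com'

# Same two splits as in the answer above
$DN, $AddressString = $line -split ';', 2
$Addresses = $AddressString -split ';(?=\w+:)'

# Illustrative extra step: separate the address type from its value
$Typed = foreach ($Address in $Addresses) {
    $Type, $Value = $Address -split ':', 2
    [pscustomobject]@{
        Type  = $Type.ToLower()
        Value = $Value
    }
}

$Typed | Format-Table -AutoSize
```

With this approach the X400 address keeps its original `\;` separators, so no replacement step is needed before parsing.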

Answer 2: (score: 0)

I would use a regex to replace the substring:

# Callback invoked once per match: swaps ';' for ',' inside the matched X400 address only
$callback = {
    param($match)

    $match.Groups[1].Value -replace ';', ','
}

$txt = 'CN=Doe\\, Jane,OU=Employee,OU=Production,OU=Users,DC=COMPANY,DC=LOC;sip:jdoe@company.com;smtp:jdoe@company-b.com;smtp:Jane.Doe@company.com;SMTP:JDoe@company.com;X400:c=us\;a= \;p=Company\;o=Exchange\;s=Doe\;g=Jane\;'

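# (?i) also matches lowercase "x400:" entries; the trailing \\; keeps the final escaped
# semicolon inside the match. Assumes g= is the last field, as in the sample data.
$rex = [regex]'(?i)(X400:.*?g=.+?\\;)'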
$rex.Replace($txt, $callback)