Pattern matching with capture groups

Time: 2014-07-08 09:52:49

Tags: perl

I have a file with lines like the ones below (only two lines are pasted here). It is a huge file, about 1 GB.

6901:2014-06-30 12:24:58,584 INFO                                       BAS_Connector-thread-3 [com.orga.oslee.ra.bas.BASservice-R2.5.0.19.4] API=[BAS_SEARCH_ACCOUNTS] TID=[6c3be5be-2188-11e4-e004-6e5c369d1712] < received from Bas: [Accounts=[], MaxRecordsReached=false, ReferenceId=-9MXF14E0ST6R, IST=2014-06-30T12:24:58.578+05:30]
BAS_EXECUTIONTIME=[00:00:00:19] TID=[6c3be5be-2188-11e4-e004-6e5c369d1712] < received from Bas: {BAS_TID=[0:[6c3be5be-2188-11e4-e004-6e5c369d1712]], BAS_SERVICE=[0:[BAS_ADD_CUSTOMER]], BAS_Customer=[0:[{BAS_CustomerAttributes=[0:[{}]], BAS_NwOperator=[0:[Entel]], BAS_LastModDateTime=[0:[Mon Jun 30 12:24:58 IST 2014]], BAS_ComProfileCodeId=[0:[{BAS_Value=[0:[COM_PROF]], BAS_Domain=[0:[SEGMENT]]}]], BAS_CustomerCategory=[0:[Default]], BAS_AccountCategoryCodeId=[0:[{BAS_Value=[0:[Default]], BAS_Domain=[0:[CUSTOMER]]}]], BAS_OperatorCodeId=[0:[{BAS_Value=[0:[Entel]], BAS_Domain=[0:[CUSTOMER]]}]], BAS_CustomerName=[0:[CU_132539]], BAS_CustomerNumber=[0:[132502]], BAS_OwningCostCenterCodeId=[0:[{BAS_Value=[0:[CST_OWN]], BAS_Domain=[0:[SEGMENT]]}]], BAS_SegmentCodeId=[0:[{BAS_Value=[0:[Account]], BAS_Domain=[0:[CUSTOMER]]}]], BAS_DealerCodeId=[0:[{BAS_Value=[0:[Entel]], BAS_Domain=[0:[SEGMENT]]}]], BAS_BillingProfile=[0:[{BAS_BillFormatCodeId=[0:[{BAS_Value=[0:[w_it_bill]], BAS_Domain=[0:[SEGMENT]]}]], BAS_BillDispatchCodeId=[0:[{BAS_Value=[0:[OIT]], BAS_Domain=[0:[SEGMENT]]}]], BAS_InvoiceCurrencyCodeId=[0:[{BAS_Value=[0:[CLP]], BAS_Domain=[0:[SEGMENT]]}]], BAS_BillPeriodCodeId=[0:[{BAS_Value=[0:[BR01]], BAS_Domain=[0:[SEGMENT]]}]], BAS_Domain=[0:[SEGMENT]]}]], BAS_ScProfileCodeId=[0:[{BAS_Value=[0:[SER_C_PROF]], BAS_Domain=[0:[SEGMENT]]}]], BAS_CustomerId=[0:[132502]], BAS_ReceivingCostCenterCodeId=[0:[{BAS_Value=[0:[CST_RECV]], BAS_Domain=[0:[SEGMENT]]}]]}]], BAS_ReferenceId=[0:[-9MXF14E0ST6R]]}

I want to extract the value -9MXF14E0ST6R from both of these forms:

BAS_ReferenceId=[0:[-9MXF14E0ST6R]]
ReferenceId=-9MXF14E0ST6R,

My current code:

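my $receivedFrom = qr/^(\d{4}-\d{2}-\d{2}\s+\d{2}:\d{2}:\d{2},\d{3}).*API=\[(.*?)\].*?TID=\[(.*?)\].*?received from Bas.*?(ReferenceId=.*?,|BAS_ReferenceId=\[.*?:\[(-.*?)\]\])/;

# Read line by line so the 1 GB file is never held in memory at once.
open(my $in, '<', $file) or die "Cannot open $file: $!";
while (my $inputString = <$in>) {
    # Match once and capture in the same step; the fifth group is unused here.
    if (my ($starttime, $api, $receivingfromtid, $ReferenceId) = $inputString =~ /$receivedFrom/) {
        # Strip the text surrounding the ID (the part I want to get rid of).
        $ReferenceId =~ s/ReferenceId=//;
        $ReferenceId =~ s/BAS_\[0:\[//;
        $ReferenceId =~ s/\]//g;
        print "$ReferenceId\n";
    }
}
close($in);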

I would like to avoid doing the substitutions.

1 answer:

Answer 0: (score: 0)

Simplify your regular expression.

In this case you are trying to match two different patterns, so they should be two separate regular expressions:

use strict;
use warnings;

while (<DATA>) {
    if (/\bReferenceId=([^,]*)/) {
        print "RefID = '$1'\n";

    } elsif (/\bBAS_ReferenceId=\[ [^\[\]]* \[ ([^\[\]]*) \] \]/x) {
        print "BasID = '$1'\n";
    }
}

__DATA__
6901:2014-06-30 12:24:58,584 INFO                                       BAS_Connector-thread-3 [com.orga.oslee.ra.bas.BASservice-R2.5.0.19.4] API=[BAS_SEARCH_ACCOUNTS] TID=[6c3be5be-2188-11e4-e004-6e5c369d1712] < received from Bas: [Accounts=[], MaxRecordsReached=false, ReferenceId=-9MXF14E0ST6R, IST=2014-06-30T12:24:58.578+05:30]
BAS_EXECUTIONTIME=[00:00:00:19] TID=[6c3be5be-2188-11e4-e004-6e5c369d1712] < received from Bas: {BAS_TID=[0:[6c3be5be-2188-11e4-e004-6e5c369d1712]], BAS_SERVICE=[0:[BAS_ADD_CUSTOMER]], BAS_Customer=[0:[{BAS_CustomerAttributes=[0:[{}]], BAS_NwOperator=[0:[Entel]], BAS_LastModDateTime=[0:[Mon Jun 30 12:24:58 IST 2014]], BAS_ComProfileCodeId=[0:[{BAS_Value=[0:[COM_PROF]], BAS_Domain=[0:[SEGMENT]]}]], BAS_CustomerCategory=[0:[Default]], BAS_AccountCategoryCodeId=[0:[{BAS_Value=[0:[Default]], BAS_Domain=[0:[CUSTOMER]]}]], BAS_OperatorCodeId=[0:[{BAS_Value=[0:[Entel]], BAS_Domain=[0:[CUSTOMER]]}]], BAS_CustomerName=[0:[CU_132539]], BAS_CustomerNumber=[0:[132502]], BAS_OwningCostCenterCodeId=[0:[{BAS_Value=[0:[CST_OWN]], BAS_Domain=[0:[SEGMENT]]}]], BAS_SegmentCodeId=[0:[{BAS_Value=[0:[Account]], BAS_Domain=[0:[CUSTOMER]]}]], BAS_DealerCodeId=[0:[{BAS_Value=[0:[Entel]], BAS_Domain=[0:[SEGMENT]]}]], BAS_BillingProfile=[0:[{BAS_BillFormatCodeId=[0:[{BAS_Value=[0:[w_it_bill]], BAS_Domain=[0:[SEGMENT]]}]], BAS_BillDispatchCodeId=[0:[{BAS_Value=[0:[OIT]], BAS_Domain=[0:[SEGMENT]]}]], BAS_InvoiceCurrencyCodeId=[0:[{BAS_Value=[0:[CLP]], BAS_Domain=[0:[SEGMENT]]}]], BAS_BillPeriodCodeId=[0:[{BAS_Value=[0:[BR01]], BAS_Domain=[0:[SEGMENT]]}]], BAS_Domain=[0:[SEGMENT]]}]], BAS_ScProfileCodeId=[0:[{BAS_Value=[0:[SER_C_PROF]], BAS_Domain=[0:[SEGMENT]]}]], BAS_CustomerId=[0:[132502]], BAS_ReceivingCostCenterCodeId=[0:[{BAS_Value=[0:[CST_RECV]], BAS_Domain=[0:[SEGMENT]]}]]}]], BAS_ReferenceId=[0:[-9MXF14E0ST6R]]}

Output:

RefID = '-9MXF14E0ST6R'
BasID = '-9MXF14E0ST6R'
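
If a single pattern is still preferred, one possible alternative (not part of the answer above) is a branch-reset group, `(?|...)`, available from Perl 5.10. Inside such a group every alternative numbers its captures from the same starting point, so the ID lands in `$1` whichever form matched and no substitution is needed afterwards. The sketch below is only an illustration under those assumptions: the helper name `extract_reference_id` and the shortened sample lines are hypothetical, and the `ReferenceId=` branch additionally stops at `]` in case the field is the last one in the list.

```perl
use strict;
use warnings;
use 5.010;    # the branch-reset group (?|...) needs Perl 5.10 or later

# Either alternative stores the ID in capture group 1.
my $ref_id_re = qr/
    (?|
        \bReferenceId=([^,\]]*)                            # ReferenceId=VALUE,
      |
        \bBAS_ReferenceId=\[ [^\[\]]* \[ ([^\[\]]*) \] \]  # BAS_ReferenceId=[0:[VALUE]]
    )
/x;

# Hypothetical helper: returns the ID, or undef if the line contains none.
sub extract_reference_id {
    my ($line) = @_;
    return $line =~ $ref_id_re ? $1 : undef;
}

# Shortened versions of the two sample lines from the question.
my @samples = (
    'received from Bas: [Accounts=[], MaxRecordsReached=false, ReferenceId=-9MXF14E0ST6R, IST=2014-06-30T12:24:58.578+05:30]',
    'BAS_Domain=[0:[SEGMENT]]}]]}]], BAS_ReferenceId=[0:[-9MXF14E0ST6R]]}',
);

for my $line (@samples) {
    my $id = extract_reference_id($line);
    print defined $id ? "ID = '$id'\n" : "no ID found\n";
}
```

The `\b` before `ReferenceId` keeps the first alternative from matching inside `BAS_ReferenceId`, because `_` and `R` are both word characters. For the real file, the same helper could be called on each line inside a `while (my $line = <$fh>)` loop, which keeps memory use flat regardless of file size.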