Using RegEx to extract multiple pieces of information with an optional field

Time: 2015-02-23 21:45:42

Tags: php regex

I would like help figuring out how to extract this information from a text file. One of the fields is optional (it is labeled "S" here).

The text looks like this:

  NAME     Case No. Duration PLAN       ACCT DATE ST DATE A MODE   AMOUNT CRATE S AccountTotal                     
  PETER             AB02651341 RN BUILDER IUL CTAT 02/05/15 02/05/15 01             380.00   0.0050            1.90               
  JOHNSON, DON A BF06010672 FY AGGVANT 15 NT1      02/02/15 02/01/15 01            83.04   0.0500            4.15             
  SARA             ZZ02659940 RN CUST GUAR          01/31/15 01/30/15 12        18,450.00- 0.0025            46.13-            
  MIKE              KH02979366 RN CUST GUAR        02/02/15 02/01/15 01             109.83   0.0025 .50         .14             

Is it possible to output it like this (in an array or some other structure):

NAME    Case No.    Duration    PLAN    DATE ST DATE A  MODE    AMOUNT  CRATE   S   AccountTotal
PETER   AB02651341  RN  BUILDER IUL CTAT    02/02/15    02/05/2015  01  380.00  0.0050      1.90
JOHNSON, DON A  BF06010672  FY  AGGVANT 15 NT1  02/2/2015   02/01/15    01  83.04   0.0500      4.15
SARA    ZZ02659940  RN  CUST GUAR   01/31/2015  01/30/2015  12  -18,450.00  0.0025      -46.13
MIKE    KH02979366  RN  CUST GUAR   02/02/15    02/01/2015  01  109.83  0.0025  .50 .14

The final output would look like this:

Array ( [0] => Array ( [NAME] => PETER [Case No.] => AB02651341 [Duration] => RN [PLAN] => BUILDER IUL CTAT [DATE ST] => 02/02/15 [DATE A] => 02/05/2015 [MODE] => 01 [AMOUNT] => 380.00 [CRATE] => 0.0050 [S] => [AccountTotal] => 1.90 ) 
        [1] => Array ( [NAME] => JOHNSON, DON A [Case No.] => BF06010672 [Duration] => FY [PLAN] => AGGVANT 15 NT1 [DATE ST] => 02/2/2015 [DATE A] => 02/01/15 [MODE] => 01 [AMOUNT] => 83.04 [CRATE] => 0.0500 [S] => [AccountTotal] => 4.15 ) 
        [2] => Array ( [NAME] => SARA [Case No.] => ZZ02659940 [Duration] => RN [PLAN] => CUST GUAR [DATE ST] => 01/31/2015 [DATE A] => 01/30/2015 [MODE] => 12 [AMOUNT] => -18,450.00 [CRATE] => 0.0025 [S] => [AccountTotal] => -46.13 ) 
        [3] => Array ( [NAME] => MIKE [Case No.] => KH02979366 [Duration] => RN [PLAN] => CUST GUAR [DATE ST] => 02/02/15 [DATE A] => 02/01/2015 [MODE] => 01 [AMOUNT] => 109.83 [CRATE] => 0.0025 [S] => .50 [AccountTotal] => .14 ) )

2 answers:

Answer 0: (score: 0)

You can put ? after part of a regex to make that part optional. So if XXX is the regex for the earlier part of the line, you can write:

preg_match('/^XXX(?:\s+([\d.]+))?\s+([\d.]+)$/', $line, $match);

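When the field is not present, the capture group for the S field will be an empty string. Note that this pattern accepts only digits and dots at the end of the line, so totals written with a trailing minus sign (such as 46.13-) or followed by trailing spaces need a slightly wider pattern.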
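To make the optional group concrete, here is a minimal sketch that applies it to just the tail of a row (CRATE, the optional S, and AccountTotal). The function name `parseTail` and the CRATE sub-pattern standing in for XXX are assumptions for illustration, not part of the answer above; the `-?` and `\s*` are added because the sample data has trailing minus signs and trailing spaces. It assumes PHP 7.1 or later for the type declarations.

```php
<?php
// Hypothetical helper: parse the last columns of one row.
// (\d+\.\d+) plays the role of "XXX" here and is an assumption.
function parseTail(string $tail): ?array
{
    // (?:\s+([\d.]+))? is the optional S field; it is non-capturing on the
    // outside so that only the value itself lands in a capture group.
    $pattern = '/(\d+\.\d+)(?:\s+([\d.]+))?\s+([\d.,]+-?)\s*$/';
    if (!preg_match($pattern, $tail, $m)) {
        return null; // the text does not end in the expected columns
    }
    // $m[2] is an empty string when S is missing, because a later
    // group (AccountTotal) still matched.
    return array('CRATE' => $m[1], 'S' => $m[2], 'AccountTotal' => $m[3]);
}

print_r(parseTail('0.0050            1.90'));   // S absent
print_r(parseTail('0.0025 .50         .14'));   // S present
```

The regex engine first tries to fill the optional group, fails because no second number follows, and then backtracks to the version without S, which is why the single remaining number ends up in AccountTotal rather than in S.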

Answer 1: (score: 0)

Maybe this will work?

$a = <<<EOT
  NAME     Case No. Duration PLAN       ACCT DATE ST DATE A MODE   AMOUNT CRATE S AccountTotal
  PETER             AB02651341 RN BUILDER IUL CTAT 02/05/15 02/05/15 01             380.00   0.0050            1.90
  JOHNSON, DON A BF06010672 FY AGGVANT 15 NT1      02/02/15 02/01/15 01            83.04   0.0500            4.15
  SARA             ZZ02659940 RN CUST GUAR          01/31/15 01/30/15 12        18,450.00- 0.0025            46.13-
  MIKE              KH02979366 RN CUST GUAR        02/02/15 02/01/15 01             109.83   0.0025 .50         .14
EOT;

$cols = array(
    'NAME'         => '\s+(.*?)',
    'Case No.'     => '\s+(\w\w\d{8})',
    'Duration'     => '\s(\w\w)',
    'PLAN'         => '\s+(.*?)',
    'DATE ST'      => '\s+(\d\d/\d\d/\d\d)',
    'DATE A'       => '\s+(\d\d/\d\d/\d\d)',
    'MODE'         => '\s+(\d\d)',
    'AMOUNT'       => '\s+(\-?.*?)',
    'CRATE'        => '\s+(\d+\.\d+)',
    'S'            => '\s+([\.\d]*)',
    'AccountTotal' => '\s+(\-?.*?)$',
);

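// Build the full row pattern once by concatenating the column patterns.
$pattern = '#' . implode('', $cols) . '#';

$result = array();
// Split on any line ending; PHP_EOL alone would miss "\n" on Windows.
foreach (preg_split('/\R/', $a) as $row) {
    // Trim trailing blanks first: otherwise the S group can swallow the
    // last number and leave AccountTotal empty.
    if (preg_match($pattern, rtrim($row), $matches)) {

        // Move any trailing dash to the front of AMOUNT and
        // AccountTotal (a bit hackish - could be improved :)
        $matches[8]  = preg_replace('/(.*)-$/', '-$1', $matches[8]);
        $matches[11] = preg_replace('/(.*)-$/', '-$1', $matches[11]);

        // $matches[0] is the whole match, so drop it before pairing
        // the captures with the column names.
        $result[] = array_combine(array_keys($cols), array_slice($matches, 1));
    }
}
print_r($result);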

Output:

Array
(
    [0] => Array
        (
            [NAME] => PETER
            [Case No.] => AB02651341
            [Duration] => RN
            [PLAN] => BUILDER IUL CTAT
            [DATE ST] => 02/05/15
            [DATE A] => 02/05/15
            [MODE] => 01
            [AMOUNT] => 380.00
            [CRATE] => 0.0050
            [S] => 
            [AccountTotal] => 1.90
        )

    [1] => Array
        (
            [NAME] => JOHNSON, DON A
            [Case No.] => BF06010672
            [Duration] => FY
            [PLAN] => AGGVANT 15 NT1
            [DATE ST] => 02/02/15
            [DATE A] => 02/01/15
            [MODE] => 01
            [AMOUNT] => 83.04
            [CRATE] => 0.0500
            [S] => 
            [AccountTotal] => 4.15
        )

    [2] => Array
        (
            [NAME] => SARA
            [Case No.] => ZZ02659940
            [Duration] => RN
            [PLAN] => CUST GUAR
            [DATE ST] => 01/31/15
            [DATE A] => 01/30/15
            [MODE] => 12
            [AMOUNT] => -18,450.00
            [CRATE] => 0.0025
            [S] => 
            [AccountTotal] => -46.13
        )

    [3] => Array
        (
            [NAME] => MIKE
            [Case No.] => KH02979366
            [Duration] => RN
            [PLAN] => CUST GUAR
            [DATE ST] => 02/02/15
            [DATE A] => 02/01/15
            [MODE] => 01
            [AMOUNT] => 109.83
            [CRATE] => 0.0025
            [S] => .50
            [AccountTotal] => .14
        )
)
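
As a variation on the same idea, the row pattern can be written with named capture groups and the `x` flag, which avoids counting group numbers by hand. This is only a sketch: the column patterns are assumptions inferred from the four sample rows, `ROW_PATTERN`, `toNumber` and `parseRow` are hypothetical names, and it assumes PHP 7.1 or later. Unlike the code above, it converts the numeric columns to floats and returns null for a missing S.

```php
<?php
// Assumed column formats, inferred from the sample rows only.
const ROW_PATTERN = '~^\s*
    (?<name>.+?)\s+
    (?<case>[A-Z]{2}\d{8})\s+
    (?<duration>[A-Z]{2})\s+
    (?<plan>.+?)\s+
    (?<date_st>\d\d/\d\d/\d\d)\s+
    (?<date_a>\d\d/\d\d/\d\d)\s+
    (?<mode>\d\d)\s+
    (?<amount>[\d,]*\.\d+-?)\s+
    (?<crate>\d+\.\d+)
    (?:\s+(?<s>\d*\.\d+))?\s+
    (?<total>[\d,]*\.\d+-?)
    \s*$~x';

// Turn "18,450.00-" into -18450.0: strip commas, honor a trailing minus.
function toNumber(string $raw): float
{
    $negative = substr($raw, -1) === '-';
    $digits   = str_replace(',', '', rtrim($raw, '-'));
    return $negative ? -(float) $digits : (float) $digits;
}

// Returns null for the header line or any line that does not fit.
function parseRow(string $row): ?array
{
    if (!preg_match(ROW_PATTERN, $row, $m)) {
        return null;
    }
    return array(
        'NAME'         => $m['name'],
        'Case No.'     => $m['case'],
        'Duration'     => $m['duration'],
        'PLAN'         => $m['plan'],
        'DATE ST'      => $m['date_st'],
        'DATE A'       => $m['date_a'],
        'MODE'         => $m['mode'],
        'AMOUNT'       => toNumber($m['amount']),
        'CRATE'        => (float) $m['crate'],
        // The optional group yields an empty string when S is absent.
        'S'            => $m['s'] === '' ? null : toNumber($m['s']),
        'AccountTotal' => toNumber($m['total']),
    );
}

$rows = array(
    '  NAME     Case No. Duration PLAN       ACCT DATE ST DATE A MODE   AMOUNT CRATE S AccountTotal',
    '  SARA             ZZ02659940 RN CUST GUAR          01/31/15 01/30/15 12        18,450.00- 0.0025            46.13-   ',
    '  MIKE              KH02979366 RN CUST GUAR        02/02/15 02/01/15 01             109.83   0.0025 .50         .14',
);
// array_filter drops the null produced by the header line.
print_r(array_values(array_filter(array_map('parseRow', $rows))));
```

Anchoring the pattern with `^` and `\s*$` makes trailing spaces harmless, and requiring a decimal point in S and AccountTotal keeps the optional group from matching an empty string.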