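Storing strings in an array

Time: 2012-05-31 16:43:47

Tags: regex perl

I currently have the following lines stored in a file named google.txt. I want to split these lines and store the separated strings in arrays.

For example, for the first record: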

@qf_file= q33AgCEv006441  
@date =    Tue Apr  3 16:12
@junk_message = User unknown
@rf_number = ngandotra@nkn.in

Each record ends at @rf_number, the last e-mail address. The contents of google.txt:
   q33AgCEv006441     1038 Tue Apr  3 16:12 <test10-list-bounces@lsmgr.nic.in>
                     (User unknown)
                     <ngandotra@nkn.in>
    q33BDrP9007220    50153 Tue Apr  3 16:43 <karuvoolam-list-bounces@lsmgr.nic.in>
                     (Deferred: 451 4.2.1 mailbox temporarily disabled: paond.tndt)
                      <paond.tndta@nic.in>
    q33BDrPB007220    50153 Tue Apr  3 16:43 <karuvoolam-list-bounces@lsmgr.nic.in>
                     (User unknown)
                     paocorp.tndta@nic.in>
                                             <dtocbe@tn.nic.in>
                                             <dtodgl@nic.in>
    q33BDrPA007220    50153 Tue Apr  3 16:43 <karuvoolam-list-bounces@lsmgr.nic.in>
                     (User unknown)
                     <dtokar@nic.in>
                     <dtocbe@nic.in>
    q2VDWKkY010407  2221878 Sat Mar 31 19:37 <dhc-list-bounces@lsmgr.nic.in>
                     (host map: lookup (now-india.net.in): deferred)
                     <arjunpan@now-india.net.in>
    q2VDWKkR010407  2221878 Sat Mar 31 19:31 <dhc-list-bounces@lsmgr.nic.in>
                     (host map: lookup (aaplawoffices.in): deferred)
                      <amit.bhagat@aaplawoffices.in>
    q2U8qZM7026999   360205 Fri Mar 30 14:38 <dhc-list-bounces@lsmgr.nic.in>
                     (host map: lookup (now-india.net.in): deferred)
                      <arjunpan@now-india.net.in>
                       <amit.bhagat@aaplawoffices.in>
    q2TEWWE4013920  2175270 Thu Mar 29 20:30 <dhc-list-bounces@lsmgr.nic.in>
                     (host map: lookup (now-india.net.in): deferred)
                               <arjunpan@now-india.net.in>
                               <amit.bhagat@aaplawoffices.in>

1 answer:

Answer 0 (score: 1)

An untested Perl script:

Let's call this script parser.pl

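use strict;
use warnings;

my $file = shift or die "Usage: $0 FILE\n";
open(my $in, '<', $file) or die "Cannot open file: $file for reading ($!)\n";
my (@qf_file, @date, @junk_message, @rf_number);
while (<$in>) {
    if (/^\s*(\w+)\s+\d+\s+((?:Sat|Sun|Mon|Tue|Wed|Thu|Fri)\s+\w+\s+\d+\s+[\d:]+)/) {
        push @qf_file, $1;    # queue ID starts a new record
        push @date, $2;       # e.g. "Tue Apr  3 16:12"
    } elsif (/^\s*\((.*)\)\s*$/) {
        $junk_message[$#qf_file] = $1 if @qf_file;   # status message in parentheses
    } elsif (/^\s*<?([^<>\s]+)>\s*$/) {
        $rf_number[$#qf_file] = $1 if @qf_file;      # last address of the record wins
    }
}
close($in);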

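This assumes that the last e-mail address of each entry should be that entry's "rf_number". Note that e-mail addresses can be tricky to print because they contain the @ character: in a double-quoted string literal, Perl treats @name as an array and will happily interpolate a list that does not exist :-) Addresses held in variables are not affected, only literals written directly in double quotes.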
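The interpolation pitfall can be illustrated with a minimal sketch. The address is taken from the first record of the question; the variable names $single and $escaped are made up for this example.

```perl
use strict;
use warnings;

# Hypothetical sample value: the recipient of the question's first record.
my $single  = '<ngandotra@nkn.in>';     # single quotes: no interpolation at all
my $escaped = "<ngandotra\@nkn.in>";    # double quotes: the @ must be escaped
# my $broken = "<ngandotra@nkn.in>";    # would try to interpolate an array @nkn;
#                                       # under "use strict" this does not compile

# A value that arrives in a variable (for example from a regex capture)
# is not interpolated a second time, so printing it is safe.
my @rf_number = ($single);
print "rf_number: $rf_number[0]\n";
print "both spellings are equal\n" if $single eq $escaped;
```

Single quotes are probably the simplest choice for literal addresses; escaping is only needed when the string also has to interpolate other variables.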

To invoke it from the command line:

perl parser.pl google.txt

See this working here
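To try the parsing idea without a file, here is a minimal self-contained sketch that runs on two records copied from the question. It is a variation, not the script above: parse_queue is a hypothetical helper, and it returns one hash per record (with every recipient kept in a list) instead of four parallel arrays. The regular expressions assume the layout shown in the question: a header line with queue ID, size and date, then a parenthesised message, then one or more recipient lines.

```perl
use strict;
use warnings;

# Sample input copied from the question (single-quoted heredoc: no interpolation).
my $sample = <<'END';
   q33AgCEv006441     1038 Tue Apr  3 16:12 <test10-list-bounces@lsmgr.nic.in>
                     (User unknown)
                     <ngandotra@nkn.in>
    q33BDrPA007220    50153 Tue Apr  3 16:43 <karuvoolam-list-bounces@lsmgr.nic.in>
                     (User unknown)
                     <dtokar@nic.in>
                     <dtocbe@nic.in>
END

# parse_queue is a hypothetical helper: it returns one hash reference per record.
sub parse_queue {
    my ($text) = @_;
    my @records;
    for my $line (split /\n/, $text) {
        if ($line =~ /^\s*(\w+)\s+\d+\s+((?:Sat|Sun|Mon|Tue|Wed|Thu|Fri)\s+\w+\s+\d+\s+[\d:]+)/) {
            # Header line: queue ID, size, date (the sender is ignored here).
            push @records, { qf_file => $1, date => $2,
                             junk_message => '', recipients => [] };
        }
        elsif (@records && $line =~ /^\s*\((.*)\)\s*$/) {
            # Status message in parentheses.
            $records[-1]{junk_message} = $1;
        }
        elsif (@records && $line =~ /^\s*<?([^<>\s]+)>\s*$/) {
            # Recipient address; a record may list several.
            push @{ $records[-1]{recipients} }, $1;
        }
    }
    return @records;
}

# Print one line per record; the last recipient plays the role of "rf_number".
for my $r (parse_queue($sample)) {
    print join(' | ', $r->{qf_file}, $r->{date}, $r->{junk_message},
               $r->{recipients}[-1] // ''), "\n";
}
```

Keeping each record in a single hash means the fields cannot drift out of step when a record has no message or has several recipients, which is a risk with separate arrays.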