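I currently have the following lines stored in a file named google.txt.
I want to split these lines and store the resulting strings in arrays.
For example, the first record should give: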
@qf_file= q33AgCEv006441
@date = Tue Apr 3 16:12
@junk_message = User unknown
@rf_number = ngandotra@nkn.in
Each record ends at @rf_number, which is the last email address.
q33AgCEv006441 1038 Tue Apr 3 16:12 <test10-list-bounces@lsmgr.nic.in>
(User unknown)
<ngandotra@nkn.in>
q33BDrP9007220 50153 Tue Apr 3 16:43 <karuvoolam-list-bounces@lsmgr.nic.in>
(Deferred: 451 4.2.1 mailbox temporarily disabled: paond.tndt)
<paond.tndta@nic.in>
q33BDrPB007220 50153 Tue Apr 3 16:43 <karuvoolam-list-bounces@lsmgr.nic.in>
(User unknown)
<paocorp.tndta@nic.in>
<dtocbe@tn.nic.in>
<dtodgl@nic.in>
q33BDrPA007220 50153 Tue Apr 3 16:43 <karuvoolam-list-bounces@lsmgr.nic.in>
(User unknown)
<dtokar@nic.in>
<dtocbe@nic.in>
q2VDWKkY010407 2221878 Sat Mar 31 19:37 <dhc-list-bounces@lsmgr.nic.in>
(host map: lookup (now-india.net.in): deferred)
<arjunpan@now-india.net.in>
q2VDWKkR010407 2221878 Sat Mar 31 19:31 <dhc-list-bounces@lsmgr.nic.in>
(host map: lookup (aaplawoffices.in): deferred)
<amit.bhagat@aaplawoffices.in>
q2U8qZM7026999 360205 Fri Mar 30 14:38 <dhc-list-bounces@lsmgr.nic.in>
(host map: lookup (now-india.net.in): deferred)
<arjunpan@now-india.net.in>
<amit.bhagat@aaplawoffices.in>
q2TEWWE4013920 2175270 Thu Mar 29 20:30 <dhc-list-bounces@lsmgr.nic.in>
(host map: lookup (now-india.net.in): deferred)
<arjunpan@now-india.net.in>
<amit.bhagat@aaplawoffices.in>
Answer 0 (score: 1)
An untested Perl script. Let's call it parser.pl:
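use strict;
use warnings;

my $file = shift or die "Usage: $0 FILE\n";
open(my $in, '<', $file) or die "Cannot open file: $file for reading ($!)\n";
my (@qf_file, @date, @junk_message, @rf_number);
while (my $line = <$in>) {
    push @qf_file,      $line =~ /^(\w+)/;          # queue ID at the start of the line
    push @date,         $line =~ /((?:Sat|Sun|Mon|Tue|Wed|Thu|Fri)\s+\w{3}\s+\d+\s+\d+:\d+)/;  # e.g. "Tue Apr 3 16:12"
    push @junk_message, $line =~ /\((.+)\)\s*</;    # text in parentheses before the recipients
    push @rf_number,    $line =~ /<([^>]+)>\s*$/;   # last address on the line
}
close($in);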
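This assumes that each record sits on a single line and that the last email address on that line is the line's "rf_number". Note that email addresses can be awkward to print because they contain the @ character: inside a double-quoted string literal Perl treats @name as an array, and it will happily interpolate a nonexistent one for you :-)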
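To illustrate the point about the @ character, here is a minimal sketch of the quoting options. The variable names are made up for the example, and the address is the one from the question. Values read from the file are not affected, since interpolation only applies to string literals in the source code.

```perl
use strict;
use warnings;

# Single quotes: nothing is interpolated, so the @ stays literal.
my $single = 'ngandotra@nkn.in';

# Double quotes: escape the @ so Perl does not look for an array named @nkn.
my $escaped = "ngandotra\@nkn.in";

# Without "use strict", an unescaped "ngandotra@nkn.in" would quietly become
# "ngandotra.in", because the empty array @nkn interpolates to nothing.
# With "use strict", the same literal is rejected at compile time instead.

print "$single\n";
print "$escaped\n";
print "identical\n" if $single eq $escaped;
```

Printing an element that was parsed from the file, such as `print "$rf_number[0]\n";`, is safe, because only the array name in the literal is interpolated and not the @ inside the stored value.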
To invoke it from the command line:
perl parser.pl google.txt
See it working here.
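If the file is laid out exactly as shown in the question, with the status message and each recipient on its own continuation line, a per-record variant might look like the following. This is only a sketch under that assumption, not the script above: the `@records` container and the idea of keeping every recipient (rather than only the last one) are illustrative choices, and the hash keys simply reuse the names from the question.

```perl
use strict;
use warnings;

# Sample input copied from the question; the layout (continuation lines for
# the message and the recipients) is assumed to match the real file.
my $text = <<'END';
q33AgCEv006441 1038 Tue Apr 3 16:12 <test10-list-bounces@lsmgr.nic.in>
(User unknown)
<ngandotra@nkn.in>
q33BDrPA007220 50153 Tue Apr 3 16:43 <karuvoolam-list-bounces@lsmgr.nic.in>
(User unknown)
<dtokar@nic.in>
<dtocbe@nic.in>
END

my @records;    # hypothetical container: one hash per queue entry
for my $line (split /\n/, $text) {
    if ($line =~ /^(\w+)\s+\d+\s+(\w{3}\s+\w{3}\s+\d+\s+\d+:\d+)\s+</) {
        # A header line starts a new record: queue ID, size, date, sender.
        push @records, { qf_file => $1, date => $2, junk_message => '', rf_number => [] };
    }
    elsif (@records && $line =~ /^\s*\((.*)\)\s*$/) {
        $records[-1]{junk_message} = $1;          # status message in parentheses
    }
    elsif (@records && $line =~ /^\s*<([^>]+)>\s*$/) {
        push @{ $records[-1]{rf_number} }, $1;    # keep every recipient
    }
}

for my $r (@records) {
    print join(' | ', $r->{qf_file}, $r->{date}, $r->{junk_message},
               join(',', @{ $r->{rf_number} })), "\n";
}
```

Keeping one hash per record avoids the four parallel arrays drifting out of step when a line fails to match one of the patterns. To read from a real file instead of the embedded sample, the loop could iterate over a filehandle opened on google.txt.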