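Extracting data from one file into another file

Time: 2014-07-22 10:18:30

Tags: regex perl

I want to extract data from a file that contains information about people, such as name, address, and phone number. The file might look like this: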

Name:John
FirstName:Smith
Address:Main Street
Phone:32674632
Name:Alice
FirstName:Meyers
Address:Forth Av.
Phone:273267462

But sometimes the address is written on two lines:

Name:John
FirstName:Smith
Address:Main Street
Phone:32674632
Name:Alice
FirstName:Meyers
Address:Forth Av.
street 54
Phone:273267462

I wrote this:

while (<INPUT>) {
        chomp;
        if (/^Name/) {
            ($match) = /Name:(.*)/;
            $string = $string.$match." ";
            next;
        } 
        if (/^FirstName/) {
            ($match) = /FirstName:(.*)/;
            $string = $string.$match." ";
            next;
        } 
        if (/^Address/) {
            ($match) = /Address:(.*)/;
            $string = $string.$match." ";
            next;
        } 
        if (/Address:(.*)$Phone/) {
            ($match) = /Address:(.*)$phone/;
            $string = $string.$match." ";
            next;
        }
        if (/^Phone/) {
            ($match) = /Phone:(.*)/;
            $string = $string.$match." ";
            print OUTPUT "$string\n";
            $string = "";
            next;
        } 
}

Can anyone help me find a way to handle the two-line address?

3 answers:

Answer 0: (score: 0)

Try this:

if (/^Address/) {
    ($match) = /Address:(.*)$Phone/;
    $string = $string.$match." ";
    next;
}

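If your file always has this format, with the Address record followed directly by the Phone record, write an expression that captures the complete string between Address and Phone: /^Address:.*$Phone/

Then search with /^Address.* [\s.*:]$/

That matches a line that starts with Address and ends with [space, any character, colon]. It should work.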
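
The idea of capturing everything between Address and Phone only works if the regex sees more than one line at a time. Here is a minimal sketch, assuming the whole file has already been read into a single string. The names `extract_addresses` and `$sample` are hypothetical, and the pattern uses the literal text `Phone:` because `$Phone` inside a regex would be interpolated as a Perl variable.

```perl
use strict;
use warnings;

# Hypothetical helper: return every Address value found in $text,
# with a wrapped second line folded into the first.
sub extract_addresses {
    my ($text) = @_;
    my @addresses;
    # /m: ^ matches at the start of every line; /s: . also matches newlines.
    # (.*?) stops at the first following line that begins with "Phone:".
    while ($text =~ /^Address:(.*?)^Phone:/msg) {
        my $address = $1;
        $address =~ s/\s*\n\s*/ /g;   # join continuation lines with one space
        $address =~ s/\s+$//;         # drop the space left by the final newline
        push @addresses, $address;
    }
    return @addresses;
}

# Sample record taken from the question, with the address on two lines.
my $sample = "Name:Alice\nFirstName:Meyers\nAddress:Forth Av.\nstreet 54\nPhone:273267462\n";
print "$_\n" for extract_addresses($sample);
```

Reading the file in one go (for example with `local $/;`) is what makes this possible; inside a line-by-line `while (<INPUT>)` loop the same pattern can never match, because Address and Phone are never in `$_` together.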

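Answer 1: (score: 0)

# Slurp the whole file; <INPUT> in scalar context would read only one line.
my $data = do { local $/; <INPUT> };
# One chunk per person; the split removes "Name" from all but the first chunk.
my @array = split /\nName/, $data;
foreach my $vals (@array) {
    # /s lets . match newlines, so a two-line address is captured whole.
    next unless $vals =~ /:(.*?)FirstName:(.*?)Address:(.*?)Phone:(.*)/s;
    my @spls = ($1, $2, $3, $4);
    for (@spls) {
        s/\s*\n\s*/ /g;       # fold embedded newlines into single spaces
        s/^\s+|\s+$//g;       # trim leading and trailing whitespace
    }
    my $string = join(" ", @spls);
    print OUTPUT "$string\n";
}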

Answer 2: (score: 0)

#!/usr/bin/perl -w

use warnings;
use strict;

use Data::Dumper;

my $key = '';
my %person = ();

while (<>) {                            # for each line
    chomp;
    if (/^(.+?):(.*)/) {                # Do we have a key followed by a colon?
        $key = $1;                      # the key is the part before the colon
        $person{$key} = $2;             # The value is the part after the colon
        if ($key eq 'Phone') {          # A line with a 'Phone' key is the last one for a person
            print Dumper(\%person);     # Dump our person
            $key = '';
            %person = ();               # Start a new person
        }
    } else {
        $person{$key} .= " $_";         # No key means just append this line to the value for the previous key
    }
}
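
The script above dumps each person with Data::Dumper. To get the one-line-per-person output that the question asks for, the same key/continuation logic can collect the fields and join them. This is a sketch under assumptions, not part of the original answer: `format_records`, `@fields`, and `@sample` are hypothetical names, and only the four keys shown in the question are treated as field starts, so a continuation line that happens to contain a colon is still appended to the previous field.

```perl
use strict;
use warnings;

# Hypothetical helper: turn the raw lines of the input file into
# one "Name FirstName Address Phone" string per person.
sub format_records {
    my @lines  = @_;
    my @fields = qw(Name FirstName Address Phone);   # known keys, in output order
    my $known  = join '|', @fields;
    my ($key, %person, @out);
    for my $line (@lines) {
        chomp $line;
        if ($line =~ /^($known):(.*)/) {             # line starts a known field
            $key = $1;
            $person{$key} = $2;
            if ($key eq 'Phone') {                   # Phone closes the record
                push @out, join ' ',
                    map { defined $person{$_} ? $person{$_} : '' } @fields;
                undef $key;
                %person = ();
            }
        } elsif (defined $key) {                     # continuation of previous field
            $person{$key} .= " $line";
        }
    }
    return @out;
}

# Sample data from the question, including the two-line address.
my @sample = (
    "Name:John\n", "FirstName:Smith\n", "Address:Main Street\n", "Phone:32674632\n",
    "Name:Alice\n", "FirstName:Meyers\n", "Address:Forth Av.\n", "street 54\n",
    "Phone:273267462\n",
);
print "$_\n" for format_records(@sample);
```

With a real file, the call could be `print OUTPUT "$_\n" for format_records(<INPUT>);`, using the filehandles from the question.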