How do I parse multi-line records in Perl?

Time: 2018-10-03 10:05:17

Tags: perl parsing

I am trying to parse a string that uses '#' as its delimiter. Each record in the string spans 3 lines:

101#Introduction to the Professor#SG_FEEL#QUE_NOIMAGE#
head up to the Great Hall and speak to the professor to check in for class.#
#

102#Looking for Instructors#SG_FEEL#QUE_NOIMAGE#
Look for the Battle Instructor.#
Talk to Battle Instructor#

103#Battle Instructor#SG_FEEL#QUE_NOIMAGE#
You have spoken to the Battle Instructor#
#

How can I get each value that precedes a '#' delimiter, so that I can produce a new format like the following?

[101] = {
    Title = "Introduction to the Professor",
    Description = {
        "head up to the Great Hall and speak to the professor to check in for class."
    },
    Summary = ""
},  
[102] = {
    Title = "Looking for Instructors",
    Description = {
        "Look for the Battle Instructor."
    },
    Summary = "Talk to Battle Instructor"
},
[103] = {
    Title = "Battle Instructor",
    Description = {
        "You have spoken to the Battle Instructor"
    },
    Summary = ""
},

There will be more records like these, numbered from 101 up to n.

I am trying to use split with the following code:

#!/usr/bin/perl

use strict;
use warnings;

my $data = '101#Introduction to the Professor#SG_FEEL#QUE_NOIMAGE#';

my @values = split('#', $data);

foreach my $val (@values) {
    print "$val\n";
}

exit 0;

And the output:

101
Introduction to the Professor
SG_FEEL
QUE_NOIMAGE

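How do I read the multi-line data? Also, how do I exclude some of the data? For example, to match the new format I don't need the SG_FEEL and QUE_NOIMAGE values.

2 Answers:

Answer 0: (score: 5)

The Perl special variable $/ sets the "input record separator", the string Perl uses to decide where a line ends. You can set it to something else.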

use v5.26;
use utf8;
use strict;
use warnings;

$/ = "\n\n";  # set the input record separator

while( <DATA> ) {
    chomp;
    say "$. ------\n", $_;
    }

__END__
101#Introduction to the Professor#SG_FEEL#QUE_NOIMAGE#
head up to the Great Hall and speak to the professor to check in for class.#
#

102#Looking for Instructors#SG_FEEL#QUE_NOIMAGE#
Look for the Battle Instructor.#
Talk to Battle Instructor#

103#Battle Instructor#SG_FEEL#QUE_NOIMAGE#
You have spoken to the Battle Instructor#
#

The output shows that each call to <DATA> reads a whole record:

1 ------
101#Introduction to the Professor#SG_FEEL#QUE_NOIMAGE#
head up to the Great Hall and speak to the professor to check in for class.#
#
2 ------
102#Looking for Instructors#SG_FEEL#QUE_NOIMAGE#
Look for the Battle Instructor.#
Talk to Battle Instructor#
3 ------
103#Battle Instructor#SG_FEEL#QUE_NOIMAGE#
You have spoken to the Battle Instructor#
#

From there you can parse the record however you need.
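One way to take that last step is sketched below. It reads records with the same $/ setting, splits each one on '#', and skips the two fields the question does not need. The helper names parse_record, format_record and convert are made up for this example, and the sketch assumes that every record has the field order shown in the question, that no field contains a '#' or a double quote, and that Perl 5.10 or later is available (for the // operator).

```perl
use strict;
use warnings;

# Hypothetical helper: turn one multi-line record into a hash reference.
sub parse_record {
    my ($record) = @_;
    $record =~ s/\n//g;    # join the record's lines into one string
    # A negative limit keeps trailing empty fields, so an empty
    # summary comes back as "" instead of being dropped.
    my @fields = split /#/, $record, -1;
    return {
        id          => $fields[0],
        title       => $fields[1],
        # $fields[2] and $fields[3] (SG_FEEL, QUE_NOIMAGE) are skipped
        description => $fields[4] // '',
        summary     => $fields[5] // '',
    };
}

# Hypothetical helper: render one parsed record in the target format.
sub format_record {
    my ($r) = @_;
    return join "\n",
        "[$r->{id}] = {",
        qq(    Title = "$r->{title}",),
        "    Description = {",
        qq(        "$r->{description}"),
        "    },",
        qq(    Summary = "$r->{summary}"),
        "},\n";
}

# Read blank-line-separated records from a filehandle and convert them all.
sub convert {
    my ($fh) = @_;
    local $/ = "\n\n";     # same record separator as in the answer
    my $out = '';
    while ( my $record = <$fh> ) {
        next unless $record =~ /\S/;    # ignore empty chunks
        $out .= format_record( parse_record($record) );
    }
    return $out;
}

my $sample = <<'END';
101#Introduction to the Professor#SG_FEEL#QUE_NOIMAGE#
head up to the Great Hall and speak to the professor to check in for class.#
#

102#Looking for Instructors#SG_FEEL#QUE_NOIMAGE#
Look for the Battle Instructor.#
Talk to Battle Instructor#
END

# An in-memory filehandle stands in for the real input file.
open my $fh, '<', \$sample or die "Cannot open in-memory handle: $!";
print convert($fh);
```

Using local for $/ keeps the changed separator from leaking into other code that reads lines. If a title or description can contain a double quote, it would need escaping before it goes into the output.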

Answer 1: (score: 0)

Reading multiple lines is easy; see readline:

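open my $fh, '<', $filename
    or die "Couldn't read '$filename': $!";
my @input = <$fh>;

Now you want to go through all the lines and decide what to do with each one:

my $linenumber = 0; # start at the first line
my %info; # We want to collect information
while ($linenumber < $#input) {

Every line that starts with nnn# begins a new item:

    if( $input[ $linenumber ] =~ /^(\d+)#/ ) {
        my @data = split /#/, $input[ $linenumber ];
        $info{ number } = $data[0];
        $info{ Title } = $data[1];
        $linenumber++;
    };

Now read lines into the description until we reach the end of the record (the loop below stops at a line that contains only #):

    while ($input[$linenumber] !~ /^#$/) {
        $info{ Description } .= $input[$linenumber];
        $linenumber++;
    };
    $linenumber++; # skip the last "#" line

Now output the contents of %info; the formatting is left as an exercise. I've used qq{} for demonstration. You will want to change that to qq()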
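The code that prints the result is not shown in this answer, so what follows is only a guess at how the line-by-line approach could be finished, not the answerer's own code. The names collect_items and format_item are hypothetical. The sketch departs from the fragments in three ways: it treats a blank line or the end of input as the end of a record, it stores the second continuation line as Summary, and it starts a fresh %info for every item. It assumes Perl 5.10 or later and records separated by blank lines.

```perl
use strict;
use warnings;

# Hypothetical helper: walk the lines by index and return one hash
# reference per item.
sub collect_items {
    my @input = @_;
    chomp @input;
    my @items;
    my $linenumber = 0;
    while ( $linenumber < @input ) {
        # A line that starts with nnn# begins a new item
        if ( $input[$linenumber] =~ /^(\d+)#/ ) {
            my @data = split /#/, $input[$linenumber];
            $linenumber++;
            # Gather the following lines until a blank line or the end of input
            my @rest;
            while ( $linenumber < @input and $input[$linenumber] =~ /\S/ ) {
                ( my $line = $input[$linenumber] ) =~ s/#$//;  # drop trailing delimiter
                push @rest, $line;
                $linenumber++;
            }
            push @items, {
                number      => $data[0],
                Title       => $data[1],
                Description => $rest[0] // '',
                Summary     => $rest[1] // '',
            };
        }
        $linenumber++;    # step past the blank separator or an unrecognised line
    }
    return @items;
}

# qq() rather than qq{}: several of these strings contain an unmatched
# brace, which would confuse the parser inside qq{...}.
sub format_item {
    my ($info) = @_;
    return qq([$info->{number}] = {\n)
         . qq(    Title = "$info->{Title}",\n)
         . qq(    Description = {\n)
         . qq(        "$info->{Description}"\n)
         . qq(    },\n)
         . qq(    Summary = "$info->{Summary}"\n)
         . qq(},\n);
}

my $sample = <<'END';
102#Looking for Instructors#SG_FEEL#QUE_NOIMAGE#
Look for the Battle Instructor.#
Talk to Battle Instructor#

103#Battle Instructor#SG_FEEL#QUE_NOIMAGE#
You have spoken to the Battle Instructor#
#
END

# An in-memory filehandle stands in for the real input file.
open my $fh, '<', \$sample or die "Cannot open in-memory handle: $!";
my @input = <$fh>;
print format_item($_) for collect_items(@input);
```

Checking the index against the array size in both loops keeps the walk from running past the last line when the input does not end with a "#" line.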