Regex to parse multiple texts depending on a condition

Asked: 2019-02-27 05:37:49

Tags: regex perl

Given the strings below:

$string1 = "Job Description : Advt.No.01/2019-FCI Category IIIJunior Engineer /Assistant recruitment in Food Corporation of India (FCI) FCI recruiting 4103 Vacancies f...";
$string2 = "Job Description : Assistant Professor (Management) job recruitment in Saurashtra University on contract basis No. of Post : 01 Eligibility :  As per ...";

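Only one of the strings above is given as input at a time.

  1. Case: I do not need the numbers for both Post and Vacancies. If the keyword Vacancies is found, I need the number next to it.
  2. Case: If Vacancies is not found in the string, I need to search for Post instead and take the number next to it.

So how should I write the regex? I currently have two regular expressions: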

$string2 =~ m/(?:pos.*?)\s+(\d+)\s+/ig;
print $1;

$string1 =~ m/(\d+)\s+(?:vac.*?)/ig;
print $1;  # single capture group, so the number is in $1

I need a single regex that covers both cases. How can I do that?

3 Answers:

Answer 0: (score: 1)

This regex should work:

/(\d+ vacancies|no\. of post \: \d+)/ig
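
As written, the capture group holds the whole phrase (for example "4103 Vacancies") rather than only the number. One possible variation, sketched here under the assumption of Perl 5.10 or later (for the `//` defined-or operator), gives each alternative its own capture group and returns whichever one is set. The sub name `count_from_phrase` and the shortened sample strings are made up for illustration.

```perl
use strict;
use warnings;

# Hypothetical helper: returns the number next to "vacancies" or after
# "no. of post :", or undef when neither phrase occurs in the text.
sub count_from_phrase {
    my ($text) = @_;
    # One capture group per alternative; only the one that matched is defined.
    return unless $text =~ /(\d+) vacancies|no\. of post : (\d+)/i;
    return $1 // $2;
}

print count_from_phrase("FCI recruiting 4103 Vacancies f..."), "\n";          # 4103
print count_from_phrase("contract basis No. of Post : 01 Eligibility"), "\n"; # 01
```

Like the regex above, this expects exactly one space between the words, so the pattern may need loosening (for example with `\s+`) for real input.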

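Answer 1: (score: 0)

While it is probably possible to squeeze this into a single unreadable regex, divide and conquer is likely the better strategy. My solution offers two possible alternatives.

You can of course loosen or tighten the regexes to fit your actual input.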

#!/usr/bin/perl
use warnings;
use strict;

my $string1 = "Job Description : Advt.No.01/2019-FCI Category IIIJunior Engineer /Assistant recruitment in Food Corporation of India (FCI) FCI recruiting 4103 Vacancies f...";
my $string2 = "Job Description : Assistant Professor (Management) job recruitment in Saurashtra University on contract basis No. of Post : 01 Eligibility :  As per ...";

# alternative 1
foreach my $string ($string1, $string2) {
    $string =~ /(\d+)\s+vacancies/i    ||
    $string =~ /no\. of post : (\d+)/i
        or die "no match for '${string}'\n";
    print "MATCH: $1\n";
}

# alternative 2
foreach my $string ($string1, $string2) {
    my($submatch) = $string =~ /(\d+\s+vacancies|no\. of post : \d+)/i
        or die "no match for '${string}'\n";
    my($number) = $submatch =~ /(\d+)/;
    print "MATCH: ${number}\n";
}


exit 0;

Test output:

$ perl dummy.pl
MATCH: 4103
MATCH: 01
MATCH: 4103
MATCH: 01

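Answer 2: (score: 0)

You can do it like this: I joined your two regexes with | and wrapped them in a branch-reset group (?| ) so that the number ends up in $1 whichever alternative matches.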

$string=~m/(?|(?:pos.*?)\s+(\d+)\s+|(\d+)\s+(?:vac.*?))/ig;
print $1,"\n";
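
The combined pattern returns the number for whichever keyword appears first in the string, while the question asks to prefer Vacancies whenever it is present. The sketch below is one way to get that priority in a single pattern, assuming Perl 5.10 or later for the branch reset. Because the match is anchored at the start of the string, the second alternative is only tried after the first has failed for the whole string. The sub name `job_count` and the third sample string are made up for illustration.

```perl
use strict;
use warnings;

# Hypothetical helper, not part of the answer: prefers the number before
# "vac..." and falls back to the number after "pos..." only when no
# vacancy count exists anywhere in the string.
sub job_count {
    my ($text) = @_;
    # \A anchors the match, so alternation order decides the priority;
    # (?| ) makes both alternatives capture into $1.
    my ($n) = $text =~ /\A(?|.*?(\d+)\s+vac|.*?pos.*?\s+(\d+)\s)/is;
    return $n;
}

print job_count("Advt.No.01/2019-FCI FCI recruiting 4103 Vacancies f..."), "\n"; # 4103
print job_count("on contract basis No. of Post : 01 Eligibility : ..."), "\n";   # 01
print job_count("No. of Post : 02, total 15 Vacancies announced"), "\n";         # 15
```

On the third string the unanchored pattern from the answer would yield 02, because "Post" occurs before "Vacancies".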