Perl: text processing on a variable before using it

Time: 2012-07-20 07:25:48

Tags: perl

I wrote a Perl script that outputs a list containing entries like the following:

$var = ' whatever'

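$var contains: a single quote, a space, the word, and a single quote.

This is actually a hash key, and I want to fetch the corresponding value. But because of the single quotes and the space after the opening quote, I cannot look up the value with this key.

So I want to strip $var down to the following: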

$var = whatever

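That is, remove the leading single quote, the space, and the trailing single quote.

That way I can use $var as a hash key to fetch the corresponding value.

Could you point me to a Perl one-liner for this?

Thanks.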

3 Answers:

Answer 0: (score: 3)

Here are a few approaches, but beware: modifying the keys of a hash can produce unwanted results. For example:

use strict;
use warnings;
use Data::Dumper;

my $src = {
    "a a"       => 1,
    " a a "     => 2,
    "' a a '"   => 3,
};
print "src: ", Dumper($src);
my $trg;

@$trg{ map { s/^[\s']*(.*?)[\s']*$/$1/; $_ } keys %$src } = values %$src;
print "copy: ", Dumper($trg); 

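will produce the following (the three keys collapse into one; which value survives depends on the hash's iteration order):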

src: $VAR1 = {
          ' a a ' => 2,
          '\' a a \'' => 3,
          'a a' => 1
        };
copy: $VAR1 = {
          'a a' => 1
        };
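
The collision shown above can be detected instead of silently overwriting values. The following is only a minimal sketch, not part of the original answer: the helper name normalize_key and the "first key in sorted order wins" policy are assumptions made for illustration.

```perl
use strict;
use warnings;

# Hypothetical helper: strip leading/trailing whitespace and single quotes.
sub normalize_key {
    my ($key) = @_;
    $key =~ s/^[\s']*(.*?)[\s']*$/$1/;
    return $key;
}

my %src = (
    "a a"     => 1,
    " a a "   => 2,
    "' a a '" => 3,
);

my (%trg, %collisions);
for my $key (sort keys %src) {          # sorted, so the result is repeatable
    my $clean = normalize_key($key);
    if (exists $trg{$clean}) {
        # Assumed policy: keep the first value, remember the skipped key.
        push @{ $collisions{$clean} }, $key;
        next;
    }
    $trg{$clean} = $src{$key};
}

printf "kept %d key(s), skipped %d colliding key(s)\n",
    scalar(keys %trg), scalar(map { @$_ } values %collisions);
```

Sorting the keys is only there to make the outcome deterministic; any other conflict policy (last wins, die on conflict) could be substituted in the same place.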

Any regular expression can be explained with the YAPE::Regex::Explain module (from CPAN). For the regex above:

use YAPE::Regex::Explain;
print YAPE::Regex::Explain->new( qr(^[\s']*(.*?)[\s']*$) )->explain;

will produce:

The regular expression:

(?-imsx:^[\s']*(.*?)[\s']*$)

matches as follows:

NODE                     EXPLANATION
----------------------------------------------------------------------
(?-imsx:                 group, but do not capture (case-sensitive)
                         (with ^ and $ matching normally) (with . not
                         matching \n) (matching whitespace and #
                         normally):
----------------------------------------------------------------------
  ^                        the beginning of the string
----------------------------------------------------------------------
  [\s']*                   any character of: whitespace (\n, \r, \t,
                           \f, and " "), ''' (0 or more times
                           (matching the most amount possible))
----------------------------------------------------------------------
  (                        group and capture to \1:
----------------------------------------------------------------------
    .*?                      any character except \n (0 or more times
                             (matching the least amount possible))
----------------------------------------------------------------------
  )                        end of \1
----------------------------------------------------------------------
  [\s']*                   any character of: whitespace (\n, \r, \t,
                           \f, and " "), ''' (0 or more times
                           (matching the most amount possible))
----------------------------------------------------------------------
  $                        before an optional \n, and the end of the
                           string
----------------------------------------------------------------------
)                        end of grouping
----------------------------------------------------------------------

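In short, s/^[\s']*(.*?)[\s']*$/$1/; means:

  • at the start of the string, match whitespace or apostrophes as many times as possible,
  • then match anything (as little as possible),
  • at the end of the string, match whitespace or apostrophes as many times as possible,
  • and keep only the "anything" part.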
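
Applied to the situation in the question, the substitution could be used roughly as follows. This is a sketch under assumptions: the hash %value_for and its contents are made up for illustration and do not appear in the question.

```perl
use strict;
use warnings;

# Hypothetical data: the hash name and its contents are illustrative only.
my %value_for = ( whatever => 42 );

my $var = "' whatever'";

# Copy $var into $key, then clean the copy so $var itself stays untouched.
(my $key = $var) =~ s/^[\s']*(.*?)[\s']*$/$1/;

my $message = exists $value_for{$key}
    ? "$key => $value_for{$key}\n"
    : "no entry for [$key]\n";
print $message;
```

Cleaning a copy rather than $var itself keeps the original string available, for example for error messages.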

Answer 1: (score: 2)

#!/usr/bin/perl
use strict;
use warnings;

my $string = "' my string'";
print $string . "\n";
$string =~ s/'//g;    # remove every single quote
$string =~ s/^ //;    # remove the one leading space
print $string;

Output

' my string'
my string

Answer 2: (score: 1)

$var =~ tr/ '//d;

See: tr operator

Or, using a regular expression:

$var =~ s/(?:^['\s]+)|'//g;

The latter keeps spaces in the middle of the text; the former deletes every space and every single quote.
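
The difference between the two can be seen side by side. This is a small sketch, not part of the original answer; the variable names are chosen for illustration.

```perl
use strict;
use warnings;

my $original = "' what ever'";

# tr with /d deletes every listed character wherever it occurs.
(my $with_tr = $original) =~ tr/ '//d;

# The regex only strips the leading run of quotes/whitespace, plus any quote.
(my $with_re = $original) =~ s/(?:^['\s]+)|'//g;

print "tr: |$with_tr|\n";    # tr: |whatever|
print "s:  |$with_re|\n";    # s:  |what ever|
```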

A short test:

...
$var = q{' what ever'};
$var =~ s/
         (?:     # find the following group
           ^        # at string begin, followed by      
           ['\s]+   # space or single quote, one or more
         )       # close group
         |       # OR
         '       # single quotes in the whole string
         //gx ;  # replace by nothing, use formatted regex (x)
print "|$var|\n";
...

Prints:

|what ever|

As expected.