Question

Given a test file settings.py that looks like this:

# Django settings for x project.
DEBUG = True
TEMPLATE_DEBUG = DEBUG
ADMINS = (
    # ('Your Name', 'your_email@example.com'),
)
MANAGERS = ADMINS
DATABASES = {
    'default': {
        'ENGINE': 'django.db.backends.', # Add 'postgresql_psycopg2', 'postgresql', 'mysql', 'sqlite3' or 'oracle'.
        'NAME': '',                      # Or path to database file if using sqlite3.
        'USER': '',                      # Not used with sqlite3.
        'PASSWORD': '',                  # Not used with sqlite3.
        'HOST': '',                      # Set to empty string for localhost. Not used with sqlite3.
        'PORT': '',                      # Set to empty string for default. Not used with sqlite3.
    }
}
# Hosts/domain names that are valid for this site; required if DEBUG is False
# See https://docs.djangoproject.com/en/1.3/ref/settings/#allowed-hosts
ALLOWED_HOSTS = []

I want to programmatically (from a shell script) replace the part between the line

DATABASES = {

and the line

}

with some text held in the variable k:

declare -r k='foo bar baz'

I'm a Perl beginner, but I cobbled this together:

perl -ne 'if(!$f && /DATABASES/){$f=1} if(!$f){print} if($f && /^}$/){$f=0}' < settings.py

This is a bit different from my usual little sed/awk hacks:

# e.g.
sed '/DATABASES/,/^}$/ d' < settings.py

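I'd like to improve my Perl one-liners!

How do I do in Perl what the almighty sed does so elegantly?

What is the absolute best way to:

watch stdin go by and copy it to stdout
detect a "stop printing" sentinel line and stop copying
re-enable the stdin -> stdout pass-through when a second sentinel line is encountered

I have left out the replacement part of the task and would appreciate some help with that too.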

Answer 1

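I can't imagine why you'd want to use perl for simple text manipulation, since that is what awk was designed for and, like all good UNIX tools, awk does one thing and does it well.

With GNU awk: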

$ k="<<<< foo >>>>"
$ gawk -v k="$k" -v RS='\0' '{sub(/DATABASES = {.*\n}/,k)}1' file
# Django settings for x project.
DEBUG = True
TEMPLATE_DEBUG = DEBUG
ADMINS = (
    # ('Your Name', 'your_email@example.com'),
)
MANAGERS = ADMINS
<<<< foo >>>>
# Hosts/domain names that are valid for this site; required if DEBUG is False
# See https://docs.djangoproject.com/en/1.3/ref/settings/#allowed-hosts
ALLOWED_HOSTS = []

Explanation:

gawk
-v k="$k"     = set the awk variable k to the value of the shell variable k
-v RS='\0'    = set the Record Separator to the NUL character, which should not occur in a text file, so gawk reads the whole file as a single record
'
{sub(/DATABASES = {.*\n}/,k)}     = replace the text between "DATABASES = {" and "}" at the start of a line inclusive with the contents of the awk variable k.
1     = set a true condition which invokes the default action of printing the current record (the whole file in this case)
' file

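If you can't read the whole file at once because of memory constraints, or you simply prefer this style or don't have GNU awk, change the script to (untested):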

$ awk -v k="$k" '
    /DATABASES = {/ { skip=1 }
    skip && /^}/    { skip=0; $0=k }
    !skip
  ' file

Hopefully it's obvious what it does. Note that dropping the RS='\0' setting means the script is no longer gawk-specific.
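
Since the script above is marked untested, here is a minimal sketch that runs the same skip-flag logic on a shortened sample. The sample text and the function names sample and replace_block are made up for this illustration, and the opening brace is written as [{] so that awk implementations which treat { as an interval operator still match it literally.

```shell
#!/bin/sh
# Hypothetical, shortened stand-in for settings.py.
sample() {
cat <<'EOF'
MANAGERS = ADMINS
DATABASES = {
    'default': {
        'NAME': '',
    }
}
ALLOWED_HOSTS = []
EOF
}

# replace_block TEXT: copy stdin to stdout, replacing the DATABASES block
# (both sentinel lines included) with TEXT.
replace_block() {
    awk -v k="$1" '
        /^DATABASES = [{]/ { skip = 1 }           # opening sentinel: stop printing
        skip && /^}/       { skip = 0; $0 = k }   # closing sentinel: resume, swap in k
        !skip                                     # print while not skipping
    '
}

sample | replace_block 'foo bar baz'
```

The indented "    }" line does not end the block because /^}/ only matches a brace in the first column.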

If you need to keep the delimiter lines, that is just a small tweak:

$ awk -v k="$k" '
    skip && /^}/    { skip=0; print k }
    !skip
    /DATABASES = {/ { skip=1 }
  ' file
# Django settings for x project.
DEBUG = True
TEMPLATE_DEBUG = DEBUG
ADMINS = (
    # ('Your Name', 'your_email@example.com'),
)
MANAGERS = ADMINS
DATABASES = {
<<<< foo >>>>
}
# Hosts/domain names that are valid for this site; required if DEBUG is False
# See https://docs.djangoproject.com/en/1.3/ref/settings/#allowed-hosts
ALLOWED_HOSTS = []
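
One caveat worth knowing when the replacement text is arbitrary: awk processes escape sequences in values assigned with -v, and in the replacement argument of sub() an unescaped & stands for the matched text. The sketch below only illustrates the escape-sequence half of that. It assumes the text is exported as an environment variable and read through ENVIRON, which is one possible workaround rather than part of the answer above; the sample value is made up.

```shell
#!/bin/sh
# Replacement text containing a backslash sequence (hypothetical value).
k='C:\temp'
export k

# With -v, awk interprets escape sequences in the value, so "\t" is no longer literal.
via_v=$(printf 'X\n' | awk -v k="$k" '{ $0 = k } 1')

# With ENVIRON, the value is used exactly as the shell passed it.
via_env=$(printf 'X\n' | awk '{ $0 = ENVIRON["k"] } 1')

printf '%s\n' "$via_v" "$via_env"
```

If the replacement never contains backslashes or ampersands, the -v form in the answer is simpler and works just as well.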

Answer 2

To delete the part between DATABASES and }, you can use:

perl -ne 'print unless (/DATABASES/../^}$/)' settings.py

For the replacement, something like this:

$ export VAR="foo bar baz"
$ perl -ne 'print $ENV{VAR},"\n" if /DATABASES/; print unless /DATABASES/../^}$/' settings.py

Answer 3

I thought I'd show you how to convert the awk script into a Perl script.

First, I started with Ed Morton's awk version and ran it through a2p.

$ a2p
/DATABASES = {/ { skip=1 }
skip && /^}/    { skip=0; $0=k }
!skip
^d

Note that ^d stands for pressing Ctrl+D.

#!/opt/perl-5.14.1/bin/perl
eval 'exec /opt/perl-5.14.1/bin/perl -S $0 ${1+"$@"}'
    if $running_under_some_shell;
            # this emulates #! processing on NIH machines.
            # (remove #! line above if indigestible)

eval '$'.$1.'$2;' while $ARGV[0] =~ /^([A-Za-z_0-9]+=)(.*)/ && shift;
            # process any FOO=bar switches

while (<>) {
    chomp;  # strip record separator
    if (/DATABASES = {/) {
    $skip = 1;
    }
    if ($skip && /^}/) {
    $skip = 0;
    $_ = $k;
    }
    print $_ if !$skip;
}

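We can throw away the eval 'exec ... line. I doubt you would ever need it.

Since we only need to handle k="$k", the eval '$'.$1.'$2;' ... line can be thrown away as well. We just set $k to $ENV{k}, or replace the former with the latter. (Note that for this to work you have to call export k. You could also invoke it as env k="$k" perl test.pl.)

Since the line gets chomped, we would need to replace print $_ if !$skip; with print $_, "\n" if !$skip; or set $\ to "\n". I think we can get away without calling chomp at all.

Also, to guard against hard-to-find bugs, I'll add use strict; and use warnings; at the top.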

#!/usr/bin/env perl
use strict;
use warnings;

my $skip; # prevents printing when true
while (<>) {
  if (/DATABASES = {/) {
    $skip = 1;
  }
  if ($skip && /^}/) {
    $skip = 0;
    $_ = $ENV{k}."\n";
  }
  print $_ if !$skip;
}

I think we can mix in a sed-ism here. (...)

#!/usr/bin/env perl
use strict;
use warnings;

while (<>) {
  if( my $r = /DATABASES = {/ ... /^}/ ){

    if( $r == 1 ){ # first time it matches
      print $ENV{k}, "\n";
    }

    next; # don't print
  }

  print;
}

The only problem is that I think the OP wants to replace the text between DATABASES = { and }. So we have to add code that lets those two lines be printed.

#!/usr/bin/env perl
use strict;
use warnings;

while (<>) {
  if( my $r = /DATABASES = {/ ... /^}/ ){
    if( $r == 1 ){
      # append the replacement to the first line
      $_ .= $ENV{k}."\n";
    }elsif( $r !~ /E/ ){
      # rest of the matches, except the last one
      next;
    }
  }
  print;
}

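You know, I don't really like putting the replacement text in an environment variable. How about putting it in the __DATA__ section instead?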

use strict;
use warnings;

my $replacement = do{ local $/; <DATA> }; # slurp
close DATA;

while (<>) {
  if( my $r = /DATABASES = {/ .. /^}/ ){
    if( $r == 1 ){
      $_ .= $replacement;
    }elsif( $r !~ /E/ ){
      next
    }
  }
  print;
}
__DATA__
<<< FOO >>>

Omit a part of a text file delimited by sentinel lines
