Question

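I have a file with the following pattern (note that this file was itself generated by processing with sed, awk, grep, etc.). Part of the input file is shown below.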

filename1,
BASE=a/b/c
CONFIG=$BASE/d
propertiesfile1=$CONFIG/e.properties
EndOfFilefilename1

filename2,
BASE=f/g/h
CONFIG=$BASE/i
propertiesfile1=$CONFIG/j.properties
EndOfFilefilename2

filename3,
BASE=k/l/m
CONFIG=$BASE/n
propertiesfile1=$CONFIG/o.properties
EndOfFilefilename3

I would like the output to look like this:

filename1,a/b/c/d/e.properties,
filename2,f/g/h/i/j.properties,
filename3,k/l/m/n/o.properties,

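I could not find a solution with sed, awk, or grep, so I am stumped. If you know of a solution using these unix utilities or any other language platform, please let me know.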

Regards,

Suhaas

Answer 1

Assuming you generated the original file yourself, so that it is safe to execute it as a script:

sed -e 's/^.*,/FILE=&/'                   \
    -e 's/^.*=\$CONFIG/PROPFILE=$CONFIG/' \
    -e 's/^EndOfFile.*/echo $FILE $PROPFILE/' < yourInputFile | sh

This converts each section of the file into the following form:

FILE=filename1,
BASE=a/b/c
CONFIG=$BASE/d
PROPFILE=$CONFIG/e.properties
echo $FILE $PROPFILE

...and then sends it to the shell for processing.

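Line-by-line explanation:

Line 1: Searches for lines ending with a comma (the file names) and sets FILE to the name.
Line 2: Searches for the line that sets the properties file and renames the variable to PROPFILE.
Line 3: Replaces the EndOfFile line with a command that echoes the file name and the properties file. The whole result is then piped to the shell.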
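
If piping the generated file into a shell is not acceptable, the same variable expansion can be sketched without executing anything. This is only a minimal illustration in Python, not part of the sed answer; the function name expand_sections is made up for the example, and it assumes every section sets propertiesfile1 before its EndOfFile line.

```python
from string import Template


def expand_sections(text):
    """Return one 'name,path,' line per section, expanding $VAR references."""
    results = []
    name, env = None, {}
    for line in text.splitlines():
        line = line.strip()
        if line.endswith(","):
            # Section header such as "filename1,": start a fresh variable table.
            name, env = line[:-1], {}
        elif line.startswith("EndOfFile"):
            # Assumption: propertiesfile1 was set earlier in this section.
            results.append(f"{name},{env['propertiesfile1']},")
        elif "=" in line:
            key, value = line.split("=", 1)
            # safe_substitute leaves unknown $NAMES untouched instead of raising.
            env[key] = Template(value).safe_substitute(env)
    return results
```

Each assignment is expanded against the variables seen so far in the same section, which is what the shell does when it runs the converted file.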

Answer 2

This is a good use case for structural regular expressions, which have been implemented elsewhere as a Python library. Here is an article describing how to emulate SREs in Perl.
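
The idea behind structural regular expressions (first select a region with one expression, then apply further expressions only inside it) can be approximated with the standard re module. The sketch below does not use the Python library mentioned above and makes no claim about its API; the names SECTION, ASSIGNMENT, VARIABLE and extract are invented for illustration.

```python
import re

# Outer expression: one section, from "name," to its matching "EndOfFile<name>".
SECTION = re.compile(
    r"^(?P<name>[^,\n]+),\n(?P<body>.*?)^EndOfFile(?P=name)$",
    re.MULTILINE | re.DOTALL,
)
# Inner expressions: applied only inside a section body.
ASSIGNMENT = re.compile(r"^(\w+)=(.*)$", re.MULTILINE)
VARIABLE = re.compile(r"\$(\w+)")


def extract(text):
    lines = []
    for section in SECTION.finditer(text):
        env = {}
        for key, value in ASSIGNMENT.findall(section["body"]):
            # Replace each $NAME with its value if known, else leave it as is.
            env[key] = VARIABLE.sub(lambda m: env.get(m[1], m[0]), value)
        lines.append(f"{section['name']},{env.get('propertiesfile1', '')},")
    return lines
```

Because the outer expression requires the EndOfFile marker to repeat the section name, a section with a mismatched marker is simply not selected.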

Answer 3

Here is an awk script that processes the input and produces the desired output:

BEGIN {
FS="="
state = 0;
base = "";
config = "";
prop = "";
filename = "";
dbg = 0;
}
/^BASE=/ {
if (dbg) {
    print "BASE";
    print $0;
}
if (state != 1) {
    print "Error base!";
    exit 1;
}
state++;
base = $2;
if (dbg > 1) printf ("BASE = %s\n", base);
}
/^CONFIG=/ {
if (dbg) {
    print "CONFIG";
    print $0;
}
if (state != 2) {
    print "Error config!";
    exit 1;
}
state++;
config = $2;
sub (/\$BASE/, base, config);
if (dbg > 1) printf ("CONFIG = %s\n", config);
}
/^propertiesfile1=/ {
if (dbg) {
    print "PROP";
    print $0;
}
if (state != 3) {
    print "Error pF!";
    exit 1;
}
state++;
prop = $2;
sub (/\$CONFIG/, config, prop);
}
/^EndOfFile/ {
if (dbg) {
    print "EOF";
    print $0;
}
if (state != 4) {
    print "Error EOF!";
    print state;
    exit 1;
}
state = 0;
printf ("%s%s,\n", filename, prop);
}
/,$/{
if (dbg) {
    print "FILENAME";
    print $0;
}
if (state != 0) {
    print "Error filename!";
    print state;
    exit 1;
}
state++;
filename = $1;
}
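
The awk script is a small state machine: each kind of line is only accepted in one state, and anything out of order aborts. The same idea can be written in table-driven form. This is a sketch in Python, not a translation endorsed by the answer's author; ORDER, classify and process are names chosen for the example.

```python
# Expected order of line kinds within one section (states 0..4 in the awk script).
ORDER = ["filename", "BASE", "CONFIG", "propertiesfile1", "EndOfFile"]


def classify(line):
    """Map a line to one of the kinds in ORDER, or None for lines to ignore."""
    if line.startswith("EndOfFile"):
        return "EndOfFile"
    for key in ("BASE", "CONFIG", "propertiesfile1"):
        if line.startswith(key + "="):
            return key
    return "filename" if line.endswith(",") else None


def process(lines):
    state, values, out = 0, {}, []
    for line in lines:
        line = line.rstrip("\n")
        kind = classify(line)
        if kind is None:
            continue
        if kind != ORDER[state]:
            raise ValueError(f"expected {ORDER[state]}, got {kind}: {line!r}")
        if kind == "filename":
            values = {"filename": line}  # keeps the trailing comma
        elif kind == "EndOfFile":
            out.append(values["filename"] + values["propertiesfile1"] + ",")
        else:
            value = line.split("=", 1)[1]
            if state > 1:
                # Substitute only the variable set in the previous state,
                # first occurrence only, as the awk sub() calls do.
                prev = ORDER[state - 1]
                value = value.replace("$" + prev, values[prev], 1)
            values[kind] = value
        state = (state + 1) % len(ORDER)
    return out
```

Calling process(open("yourInputFile")) would return the output lines as a list; the file name here is the placeholder used in Answer 1.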

Answer 4

With gawk:

gawk -vRS= 'BEGIN{FS="BASE[=]?|CONFIG|\n"}
{
 s=$1 
 for(i=1;i<=NF;i++){
    if($i~/\// ){ s=s $i }
 }
 print s
 s="" 
}' file

Output:

$ more file
filename1,
BASE=a/b/c
CONFIG=$BASE/d
propertiesfile1=$CONFIG/e.properties
EndOfFilefilename1

filename2,
BASE=f/g/h
CONFIG=$BASE/i
propertiesfile1=$CONFIG/j.properties
EndOfFilefilename2

filename3,
BASE=k/l/m
CONFIG=$BASE/n
propertiesfile1=$CONFIG/o.properties
EndOfFilefilename3

$ ./shell.sh
filename1,a/b/c/d/e.properties
filename2,f/g/h/i/j.properties
filename3,k/l/m/n/o.properties
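
The gawk command relies on two things: an empty RS makes each blank-line-separated paragraph one record, and the field separator cuts out the variable names so that only the path fragments remain. A rough Python equivalent, offered as a sketch only (join_paragraphs is a made-up name), looks like this:

```python
import re


def join_paragraphs(text):
    out = []
    # Blank-line-separated paragraphs, similar to gawk with RS="".
    for record in re.split(r"\n{2,}", text.strip()):
        # Same separators as the FS in the gawk command; "=?" stands in for "[=]?".
        fields = re.split(r"BASE=?|CONFIG|\n", record)
        # fields[0] is the "name," prefix; append every later field containing "/".
        out.append(fields[0] + "".join(f for f in fields[1:] if "/" in f))
    return out
```

Like the gawk version, this does no real variable substitution. It works because the input always references the variable defined on the line just above.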

Answer 5

A Perl script that does what you want (note that this is untested):

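while (<>) {

  $base = $1 if (m/BASE=(.+)/);
  $config = $1 if (m/CONFIG=(.+)/);

  if (m/propertiesfile1=(.+)/) {

    $props = $1;
    # Use s/// to substitute; m/pattern/replacement/ is not valid Perl.
    $props =~ s/\$CONFIG/$config/;
    $props =~ s/\$BASE/$base/;

    # $ARGV holds the name of the input file currently being read.
    print $ARGV . ", " . $props . "\n";
  }
}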

You pass the name of the input file to the script as an argument.

Answer 6

Multi-step, but it works!

    cat yourInputFile | egrep ',|\/' | \
    sed -e "s/^.*=//g" -e "s/\$.*\(\/.*\)/\1/g" | \
    awk '{if($0 ~  "properties") print $0; else printf $0}'

egrep grabs the lines that contain "," or "/", which eliminates the last line of each section:

filename1,
BASE=a/b/c
CONFIG=$BASE/d
propertiesfile1=$CONFIG/e.properties

sed reduces the output to:

filename1,
a/b/c
/d
/e.properties

The awk part reassembles the lines into:

filename1,a/b/c/d/e.properties
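
For readers who find the quoting in the pipeline hard to follow, the three stages can be mirrored one by one. This is a sketch in Python under the same assumptions as the pipeline (each section ends with a line containing "properties"); the function name pipeline is illustrative.

```python
import re


def pipeline(lines):
    # Stage 1 (egrep): keep lines containing "," or "/".
    kept = (l.rstrip("\n") for l in lines if "," in l or "/" in l)
    # Stage 2 (sed): drop everything up to "=", then reduce "$VAR/rest" to "/rest".
    reduced = (
        re.sub(r"\$.*(/.*)", r"\1", re.sub(r"^.*=", "", l)) for l in kept
    )
    # Stage 3 (awk): end the output line only after the piece containing "properties".
    out, current = [], ""
    for piece in reduced:
        current += piece
        if "properties" in piece:
            out.append(current)
            current = ""
    return out
```

Running it over the lines of the sample input gives one joined line per section, for example filename1,a/b/c/d/e.properties for the first one.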

Text pattern processing in paragraphs using unix/linux utilities
