Complex awk loop over a CSV

Time: 2013-04-03 14:34:30

Tags: linux bash sed awk

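I have several CSV files, each containing a kind of rulebase, that I need to parse into commands, and I'm running into several problems.

Before I start, here is how the file is laid out and what it looks like: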

$1 = Rule number
$3 = Source
$4 = Destination
$5 = Service
$6 = Action
$7 = Track
$10 = Comments

Security Policy: Blahblahblah,,,,,,,,,
12,,host_A,net-B,https,drop,Log,Any,Any,comments
13,,host_A,net-B,smtp,drop,Log,Any,Any,comments
14,,host_A,net-B,http,accept,Log,Any,Any,comments 
,,net-C,,,,,,,
,,net-D,,,,,,,
15,,host_A,net-B,http,accept,Log,Any,Any,comments
,,host_B,net-C,service_X,,,,,
,,host_C,net-D,service_y,,,,,
,,host_D,,,,,,,
,,host_E,,,,,,,

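Problem #1: Column 1 (the rule number) needs to be adjusted inside the loop. I need to subtract a variable from it so that it comes out as the correct, shifted number. For example, the first rule, #12, needs to become #1 inside the loop.

I use this to produce the value I need to subtract from the original number on each subsequent line (take the first rule line and subtract one):

awk -F, 'NR==2 {print $1 - 1}'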
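The offset idea can be sketched as follows. This is a minimal sketch, not a definitive solution: `renumber_rules` is a hypothetical helper name, and it assumes that rule lines are exactly those whose first field consists only of digits. Instead of relying on `NR==2`, it takes the offset from the first numeric rule it sees, so it does not depend on the header sitting on line 1.

```shell
# Sketch: shift rule numbers so that the first numeric rule becomes 1.
# renumber_rules is a hypothetical helper; it reads the CSV on stdin.
renumber_rules() {
  awk -F, '
    # Only lines whose first field is a non-empty run of digits are rules.
    $1 ~ /^[0-9]+$/ {
      # The first rule seen fixes the offset (12 - 1 = 11 for the sample).
      if (!seen) { offset = $1 - 1; seen = 1 }
      print $1 - offset
    }
  '
}

# Demo input: rule 14 is deliberately left out.
printf '%s\n' \
  'Security Policy: Blahblahblah,,,,,,,,,' \
  '12,,host_A,net-B,https,drop,Log,Any,Any,comments' \
  '13,,host_A,net-B,smtp,drop,Log,Any,Any,comments' \
  ',,net-C,,,,,,,' \
  '15,,host_A,net-B,http,accept,Log,Any,Any,comments' |
  renumber_rules
# prints 1, 2 and 4 on separate lines
```

Note that subtracting a fixed offset preserves gaps in the original numbering (15 becomes 4 here because 14 is absent from the demo input). A running counter would number the rules consecutively instead.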

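Problem #2: I need to run this loop for every instance of a rule number. That is, each rule can have multiple sources/destinations/services, and I need to be able to attach each new object to the correct rule.

$1 also needs to be validated, because some fields/rules need to be skipped: they start with "disabled" or similar. This seems to do the trick:

awk -F, '$1 ~ "^[0-9]*$" {print $1}'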
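One caveat with the pattern above: `^[0-9]*$` also matches an empty first field, so continuation lines pass the check too. A possible way to skip a disabled rule together with its continuation lines is sketched below. The helper name `filter_rules` and the exact shape of a disabled row (a non-numeric first field such as `disabled`) are assumptions, since the sample data does not show one.

```shell
# Sketch: keep numbered rules and their continuation lines, drop everything else.
# filter_rules is a hypothetical helper; it reads the CSV on stdin.
filter_rules() {
  awk -F, '
    $1 ~ /^[0-9]+$/              { keep = 1 }  # a numbered rule opens a kept block
    $1 != "" && $1 !~ /^[0-9]+$/ { keep = 0 }  # header or disabled rule: drop the block
    keep                                       # print lines while inside a kept block
  '
}

# Demo: the "disabled" row layout is assumed, not taken from the real export.
printf '%s\n' \
  'Security Policy: Blahblahblah,,,,,,,,,' \
  '12,,host_A,net-B,https,drop,Log,Any,Any,comments' \
  ',,net-C,,,,,,,' \
  'disabled,,host_Z,net-Z,ftp,drop,Log,Any,Any,comments' \
  ',,net-Y,,,,,,,' \
  '13,,host_A,net-B,smtp,drop,Log,Any,Any,comments' |
  filter_rules
# prints the rule 12 line, its net-C continuation, and the rule 13 line
```

Because the `keep` flag persists across lines, a continuation row inherits the decision made for the rule it belongs to.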

Overall, I'd like the final output to look like this:

(all of it echo'd / printed by awk, etc.):

if new rule # is found in $1:
create rule security_rule
create action $rule_number $action
create comment $rule_number $comment
create source $rule_number $source <--- iterate as many times as required
create destination $rule_number $destination <--- iterate as many times as required
create service $rule_number $service <--- iterate as many times as required
create track $rule_number $track

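etc...

Any help or suggestions you can offer would be appreciated.

Thanks,

EDIT: A better example (rule 1 = rule 12 in the CSV; these are still rough print statements, and I can fill in the correct print values later):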

if new rule # is found in $1:
create rule security_rule
create action rule 1 drop
create comment rule 1 "This is a comment"
create source rule 1 host_A
create destination rule 1 net-B
create service rule 1 https
create track rule 1 Log

Rules with multiple sources/destinations/services just get additional "create source rule x" lines, like this:

if new rule # is found in $1:
create rule security_rule
create action rule 3 accept
create comment rule 3 "This is a comment"
create source rule 3 host_A
create source rule 3 net-C
create source rule 3 net-D
create destination rule 3 net-B
create service rule 3 http
create track rule 3 Log

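2 Answers:

Answer 0: (score: 1)

Awk can do this, but it's a bit clunky. You basically collect the information for each rule in one big string, then print it once that rule is complete. (Just remember to print the last one.)

I left out the "if new rule # is found in $1:" bit... because I don't fully understand how it is supposed to work. If you absolutely need the "track" line to appear last... just do for $7 what I did for $3, $4, and $5.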

BEGIN{
    FS=",";recNum=0;curLine=""
}

# Skip the "Security Policy" header line.
$1 ~ /^Security Policy/ {next}

# A non-empty $1 starts a new rule: print the previous rule and renumber from 1.
$1!="" {
    print curLine,"\n"
    recNum++;
    $1=recNum;
    curLine=sprintf("create rule security_rule\ncreate action rule %d %s\n",$1,$6);
    curLine=curLine sprintf("create comment rule %d \"%s\"\n",$1,$10);
    curLine=curLine sprintf("create track rule %d %s\n",$1,$7);
}
# Continuation line: it belongs to the current rule.
$1=="" {
    $1=recNum;
}

# Append one line for each source, destination, and service present.
$3!=""{
    curLine=curLine sprintf("create source rule %d %s\n",$1,$3);
}
$4!=""{
    curLine=curLine sprintf("create destination rule %d %s\n",$1,$4);
}
$5!=""{
    curLine=curLine sprintf("create service rule %d %s\n",$1,$5);
}
# Print the last rule.
END {print curLine}

For your input above, this gives me:

create rule security_rule
create action rule 1 drop
create comment rule 1 "comments"
create track rule 1 Log
create source rule 1 host_A
create destination rule 1 net-B
create service rule 1 https


create rule security_rule
create action rule 2 drop
create comment rule 2 "comments"
create track rule 2 Log
create source rule 2 host_A
create destination rule 2 net-B
create service rule 2 smtp


create rule security_rule
create action rule 3 accept
create comment rule 3 "comments"
create track rule 3 Log
create source rule 3 host_A
create destination rule 3 net-B
create service rule 3 http
create source rule 3 net-C
create source rule 3 net-D


create rule security_rule
create action rule 4 accept
create comment rule 4 "comments"
create track rule 4 Log
create source rule 4 host_A
create destination rule 4 net-B
create service rule 4 http
create source rule 4 host_B
create destination rule 4 net-C
create service rule 4 service_X
create source rule 4 host_C
create destination rule 4 net-D
create service rule 4 service_y
create source rule 4 host_D
create source rule 4 host_E
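
The output above lists sources, destinations, and services in input order and puts the track line before them, whereas the question asks for each kind to be grouped, with the track line last. A possible variation, offered as a sketch only: collect each kind in its own array and emit everything when the next rule starts. The names `emit_rules` and `emit_rule` are hypothetical, rules are renumbered with a running counter as in the script above, and disabled rules are not handled.

```shell
# Sketch: group sources, destinations and services per rule, track line last.
# emit_rules / emit_rule are hypothetical names; the CSV is read on stdin.
emit_rules() {
  awk -F, '
    # Print everything collected for the current rule, grouped by kind.
    function emit_rule(    i) {
      if (n == 0) return
      print "create rule security_rule"
      printf "create action rule %d %s\n", n, action
      printf "create comment rule %d \"%s\"\n", n, comment
      for (i = 1; i <= ns; i++) printf "create source rule %d %s\n", n, src[i]
      for (i = 1; i <= nd; i++) printf "create destination rule %d %s\n", n, dst[i]
      for (i = 1; i <= nv; i++) printf "create service rule %d %s\n", n, svc[i]
      printf "create track rule %d %s\n", n, track
      print ""
    }
    $1 ~ /^Security Policy/ { next }   # skip the header line
    $1 != "" {                         # a new rule starts: emit the previous one
      emit_rule()
      n++; ns = nd = nv = 0
      action = $6; track = $7; comment = $10
    }
    $3 != "" { src[++ns] = $3 }        # collect objects from rule and continuation lines
    $4 != "" { dst[++nd] = $4 }
    $5 != "" { svc[++nv] = $5 }
    END { emit_rule() }                # emit the last rule
  '
}
```

Assuming the CSV is saved as `rules.csv` (a placeholder name), this could be run as `emit_rules < rules.csv`. For rule 14 in the sample it prints the three source lines together, followed by the destination, service, and track lines.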

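Answer 1: (score: 0)

I'm not entirely clear on the question, but as @Charles Duffy mentioned, why not use native bash? Could you give a sample file and its expected output? I tried to break down the requirements in your question but got lost. Anyway, below is a small example that you can try to modify to fit your requirements. The same can probably be done more elegantly with awk (I'm afraid I'm not that good at awk). I force column one to a running index, keep the third printed value (array index 3) fixed as "hello", and keep the old value whenever a field is empty.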

[bash]$ cat example;echo "##################################################"; ./tmp.sh < example ;echo "##################################################"; cat tmp.sh
12,,host_A,net-B,https,drop,Log,Any,Any,comments
13,,host_A,net-B,smtp,drop,Log,Any,Any,comments
14,,host_A,net-B,http,accept,Log,Any,Any,comments
,,net-C,,,,,,,
,,net-D,,,,,,,
15,,host_A,net-B,http,accept,Log,Any,Any,comments
,,host_B,net-C,service_X,,,,,
,,host_C,net-D,service_y,,,,,
,,host_D,,,,,,,
,,host_E,,,,,,,
##################################################
0 host_A hello https drop Log Any Any comments
1 host_A hello smtp drop Log Any Any comments
2 host_A hello http accept Log Any Any comments
3 net-C hello http accept Log Any Any comments
4 net-D hello http accept Log Any Any comments
5 host_A hello http accept Log Any Any comments
6 host_B hello service_X accept Log Any Any comments
7 host_C hello service_y accept Log Any Any comments
8 host_D hello service_y accept Log Any Any comments
9 host_E hello service_y accept Log Any Any comments
##################################################
#!/bin/bash
oldarr=();
oldarr[3]="hello"
index=0
while IFS=',' read -ra newarray
#do any rule which is iteration over data
do
  for (( i = 0; i < ${#newarray[@]}; i++))
  do
    if [ "${newarray[$i]}" ]
    then
#put any exceptional case
      if [ "$i" != "3" ]
      then
      oldarr[$i]=${newarray[$i]}
      fi
    fi
  done
#put anything which is independent of iteration
  oldarr[0]=$index
  ((index++))
  echo "${oldarr[*]}"
done
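
For comparison, the same fill-down idea might look like this in awk. This is a rough sketch that mirrors the bash script above (column one replaced by a zero-based row index, the fourth CSV column pinned to "hello", empty fields inheriting the previous value). `fill_down` is a hypothetical name.

```shell
# Sketch: awk version of the fill-down loop; reads the CSV on stdin.
# fill_down is a hypothetical helper name.
fill_down() {
  awk -F, '
    {
      if (NF > maxnf) maxnf = NF
      for (i = 1; i <= NF; i++) {
        if ($i != "") last[i] = $i     # remember the latest non-empty value per column
      }
      last[1] = NR - 1                 # column one becomes a zero-based row index
      last[4] = "hello"                # fixed placeholder, as in the bash script
      line = ""
      for (i = 1; i <= maxnf; i++) {   # join the columns that have a value
        if (last[i] != "") line = line (line == "" ? "" : " ") last[i]
      }
      print line
    }
  '
}
```

Run as `fill_down < example`, this should print the same ten lines as the listing above. awk fields are one-based, so bash index 3 corresponds to `last[4]` here.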