AWK: read file "x" and compare its values against columns 1 and 2 of file "y"

Date: 2019-08-05 04:42:53

Tags: shell awk scripting

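I'm trying to read a file and manipulate the values in its columns. For a given line in file X, if column 6 is set to 2, I replace it with "REVERSE-CHECK". I also want to check whether its second column (file X) matches column 2 (file Y) and its third column (file X) matches column 1 (file Y). If both match, column 7 of file X should be changed to "ACCEPTED"; otherwise it should be marked "NON ACCEPTABLE".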

File X:

2019-08-01 00:00:04,00000011111,0000002221,111111000000000,2,2,0
2019-08-01 00:00:08,00000011112,0000002222,211111000000000,2,12,0
2019-08-01 00:00:20,00000011113,0000002223,311111000000000,2,12,0
2019-08-01 00:00:04,00000011114,0000002224,411111000000000,2,2,0
2019-08-01 00:00:08,00000011115,0000002225,511111000000000,2,2,0
2019-08-01 00:00:20,00000011116,0000002226,611111000000000,2,8,0

File Y:

0000002221,00000011111
0000002226,00000011116
0000002223,00000011114

Expected output:

2019-08-01 00:00:04,00000011111,0000002221,111111000000000,INTERESTING,REVERSE-CHECK,ACCEPTABLE
2019-08-01 00:00:08,00000011112,0000002222,211111000000000,INTERESTING,SIMPLE-CHECK,NON-ACCEPTABLE
2019-08-01 00:00:20,00000011113,0000002223,311111000000000,INTERESTING,SIMPLE-CHECK,NON-ACCEPTABLE
2019-08-01 00:00:04,00000011114,0000002224,411111000000000,INTERESTING,REVERSE-CHECK,NON-ACCEPTABLE
2019-08-01 00:00:08,00000011115,0000002225,511111000000000,INTERESTING,REVERSE-CHECK,NON-ACCEPTABLE
2019-08-01 00:00:20,00000011116,0000002226,611111000000000,INTERESTING,BASIC-CHECK,ACCEPTABLE

Code block 1: this lets me easily manipulate the values in columns $5 and $6.

awk -F, '{
        if ( $5 == "1" )
                $5 = "INTERESTING";
        else if ( $5 == "2" )
                $5 = "IMPORTANT";
        else
                $5 = "UNKNOWN";

        if ( $6 == "2" )
                $6 = "REVERSE-CHECK";
        else if ( $6 == "12" )
                $6 = "SIMPLE-CHECK";
        else if ( $6 == "8" )
                $6 = "BASIC-CHECK";
        else
                $6 = "UNHANDLED";
        print
}' OFS=, "$exeDir/FileX.log" > /home/standardOutput.log

Code block 2: my attempt to set the value of the seventh column with nested checks. It does not work at all.

awk '
BEGIN { FS = OFS = ","
}
FNR == NR {
        i[$1]=$1
        j[$1]=$2
        next
}
{       if($3 in i){
              if ($2 in j){
                   $7 = "ACCEPTABLE";
              }
        }
        else{
                 $7 = "NOT ACCEPTABLE";
        }
}
1' FileY.log FileX.log

I'm having trouble combining these two pieces of code. Please help.

1 answer:

Answer 0 (score: 3)

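The expected output you posted doesn't match your description of what you want to do. I don't know which of the two is correct, but this does what I think the description says: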

$ cat tst.awk
BEGIN {
    FS = OFS = ","
}
NR==FNR {
    map[$2] = $1
    next
}
{
    $6 = ( $6 == 2 ? "REVERSE-CHECK" : $6 )
    $7 = ( ($2 in map) && ($3 == map[$2]) ? "ACCEPTED" : "NON ACCEPTABLE" )
    print
}

$ awk -f tst.awk fileY fileX
2019-08-01 00:00:04,00000011111,0000002221,111111000000000,2,REVERSE-CHECK,ACCEPTED
2019-08-01 00:00:08,00000011112,0000002222,211111000000000,2,12,NON ACCEPTABLE
2019-08-01 00:00:20,00000011113,0000002223,311111000000000,2,12,NON ACCEPTABLE
2019-08-01 00:00:04,00000011114,0000002224,411111000000000,2,REVERSE-CHECK,NON ACCEPTABLE
2019-08-01 00:00:08,00000011115,0000002225,511111000000000,2,REVERSE-CHECK,NON ACCEPTABLE
2019-08-01 00:00:20,00000011116,0000002226,611111000000000,2,8,ACCEPTED
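
For readers unfamiliar with the `NR==FNR` idiom used in the script above: `NR` counts records across all input files, while `FNR` restarts at 1 for each file, so the two are equal only while awk is reading the first file named on the command line. The following is a minimal sketch of that mechanism on its own; the file names `y.csv` and `x.csv` and their contents are made up for illustration and are not the files from the question.

```shell
#!/bin/sh
# Hypothetical sample data, not the files from the question.
tmp=$(mktemp -d)
printf 'A1,K1\nA2,K2\n' > "$tmp/y.csv"
printf 'r1,K1,A1\nr2,K2,A9\nr3,K3,A3\n' > "$tmp/x.csv"

result=$(awk '
BEGIN { FS = OFS = "," }
NR == FNR {            # true only while reading the first file (y.csv)
    map[$2] = $1       # key: column 2 of y, value: column 1 of y
    next               # skip the remaining rules for y.csv lines
}
{                      # reached only for x.csv lines
    print $1, ( ($2 in map) && ($3 == map[$2]) ? "match" : "no match" )
}' "$tmp/y.csv" "$tmp/x.csv")

printf '%s\n' "$result"
rm -rf "$tmp"
```

Only `r1` is reported as a match here, since it is the only line whose column 2 is a key in `map` and whose column 3 equals the stored value. One caveat worth knowing: if the first file is empty, `NR==FNR` is also true while reading the second file, so the lookup table would be filled from the wrong input.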

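Adding in my interpretation of the code you posted produces output along the lines of the expected output you posted (though the relationship with the 2 fields from fileX is ambiguous in your question, so I'm guessing at what you really want for map[]):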

$ cat tst.awk
BEGIN {
    FS = OFS = ","
}
NR==FNR {
    map[$2] = $1
    next
}
{
    if      ( $5 ==  1 ) $5 = "INTERESTING"
    else if ( $5 ==  2 ) $5 = "IMPORTANT"
    else                 $5 = "UNKNOWN"

    if      ( $6 ==  2 ) $6 = "REVERSE-CHECK"
    else if ( $6 == 12 ) $6 = "SIMPLE-CHECK"
    else if ( $6 ==  8 ) $6 = "BASIC-CHECK"
    else                 $6 = "UNHANDLED"

    $7 = ( ($2 in map) && ($3 == map[$2]) ? "ACCEPTED" : "NON ACCEPTABLE" )

    print
}

$ awk -f tst.awk fileY fileX
2019-08-01 00:00:04,00000011111,0000002221,111111000000000,IMPORTANT,REVERSE-CHECK,ACCEPTED
2019-08-01 00:00:08,00000011112,0000002222,211111000000000,IMPORTANT,SIMPLE-CHECK,NON ACCEPTABLE
2019-08-01 00:00:20,00000011113,0000002223,311111000000000,IMPORTANT,SIMPLE-CHECK,NON ACCEPTABLE
2019-08-01 00:00:04,00000011114,0000002224,411111000000000,IMPORTANT,REVERSE-CHECK,NON ACCEPTABLE
2019-08-01 00:00:08,00000011115,0000002225,511111000000000,IMPORTANT,REVERSE-CHECK,NON ACCEPTABLE
2019-08-01 00:00:20,00000011116,0000002226,611111000000000,IMPORTANT,BASIC-CHECK,ACCEPTED
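
One thing to keep in mind with `map[$2] = $1`: if fileY can list the same column-2 value on more than one line, only the last pairing is kept. The question does not say whether that can happen, so the following is only a sketch under the assumption that any pair listed in fileY should count. It stores each pair as a composite key and tests membership with `in`. The sample rows and the array name `pair` are hypothetical.

```shell
#!/bin/sh
tmp=$(mktemp -d)
# Hypothetical fileY in which 00000011111 is paired with two different values.
printf '0000002221,00000011111\n0000002229,00000011111\n' > "$tmp/fileY"
printf 't1,00000011111,0000002221,x,2,2,0\n'  > "$tmp/fileX"
printf 't2,00000011111,0000002229,x,2,2,0\n' >> "$tmp/fileX"
printf 't3,00000011111,0000002220,x,2,2,0\n' >> "$tmp/fileX"

result2=$(awk '
BEGIN { FS = OFS = "," }
NR == FNR {
    pair[$2, $1]       # merely referencing the element creates it
    next
}
{
    # (fileX col 2, fileX col 3) must appear as (fileY col 2, fileY col 1)
    $7 = ( ($2, $3) in pair ? "ACCEPTED" : "NON ACCEPTABLE" )
    print
}' "$tmp/fileY" "$tmp/fileX")

printf '%s\n' "$result2"
rm -rf "$tmp"
```

With this data the first two lines are marked ACCEPTED and the third NON ACCEPTABLE. Composite keys are compared as strings, so leading zeros are significant in this variant.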