How to move numbers and match columns and rows in Linux?

Time: 2011-11-22 10:05:47

Tags: linux awk

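Hi everyone, I have a question. I want to move columns into another file, match them by row and column, and put each value in the correct position.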

Input file1 is:

COURSE NAME: Java
CREDITS: 4
200345    88
300126    78
287136    68
200138    71
COURSE NAME: Operating System
CREDITS: 4
287136    86
200138    72
200345    77
300056    78

Input file2 is:

STUDENT ID        Java        Operating System      GPA
200138
200345
287136
300056
300126

The output I need looks like this:

STUDENT ID        Java        Operating System      GPA
200138             71                 72
200345             88                 77
287136             68                 86
300056             -                  78
300126             78                 -

I am trying this code:

awk 'NR==FNR{a[$1]=$2;next} {print $0 FS a [$1];next}' file1 file2

Its output looks like this:

STUDENT ID       JAVA       Operating Systems       GPA
200138 72
200345 77
287136 86
300056 78
300126 78

I have tried a lot :( Can you help me?

2 answers:

Answer 0: (score: 2)

Well, you can do this with awk, but it is not trivial. Something like this:

# Save every course name in an array for the report, after replacing the
# spaces in its name with underscores. This rule only fires on the COURSE NAME lines.
# Note: gensub() is a gawk extension.
/^COURSE NAME/ {cn=gensub(" ","_","g",gensub(".*: ","","g",$0)); crss[cn]+=1 }

# On lines that start with a digit (a student ID), record the ID in an array
# (stdnts) and store the score in a "semi" multidimensional array/hash whose
# indices have the form "student_id-course_name".
/^[0-9]/ {stdnts[$1]+=1 ; v[$1 "-" cn]=$2}

# After all input has been processed
END {
      # Print the header; it is dynamic and uses the course names saved earlier
      printf("%-20s","STUDENT ID");
      for (e in crss) {printf("%-20s",e)}
      printf("%-20s\n","GPA")

      # Print the report: one row per student ID
      for (s in stdnts) {
          printf("%-20s",s)
          for (cs in crss) {
              # Print the score if this student has one for the course, else "-"
              if ((s "-" cs) in v) {
                  printf("%-20s",v[s "-" cs])
              }
              else {
                  printf("%-20s","-")
              }
          }
          # The GPA column is left blank
          printf("%-20s\n"," ")
      }
}

See it in action here: https://ideone.com/AgaS8

Notes

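  1. The script is not optimized.
  2. The original course names are replaced with space-free versions (spaces become underscores).
  3. The student table in the output is not sorted by student ID.
  4. It needs only the first file as input. Put the code above in a file such as report.awk, then run awk -f report.awk INPUT_FILE

HTH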
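
On the unsorted output mentioned in the notes: one way to handle it is outside awk, by passing the header through untouched and sorting only the rows. The following is a minimal sketch, not part of the original answer; the helper name sort_body is hypothetical, and it assumes the first line of the report is the header and every other line starts with a student ID.

```shell
# sort_body: hypothetical helper. Reads a report on stdin, prints the first
# line (the header) unchanged, then prints the remaining lines sorted
# numerically by their leading student ID.
sort_body() {
    IFS= read -r header && printf '%s\n' "$header"
    sort -n
}
```

Assuming the script is saved as report.awk as described above, it could be used as: awk -f report.awk INPUT_FILE | sort_body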

Answer 1: (score: 1)

I wrote a quick and dirty solution that fits the sample input you gave.

Command:

 sed '/CREDIT/d' file1|awk 'FNR==NR{ if($0~/Java/){j++;o=0;next;}
        if($0~/Operating/){o++;j=0;next;}
        if(j){java[$1]=$2}
        if(o){os[$1]=$2}
}NR>FNR{OFS=" ";
        if(FNR==1){sub(/ /,"");print;}
        else{$2=" "
        $3=($1 in java)?java[$1]:"-";
        $4=($1 in os)?os[$1]:"-";
        print $0;
        }

}' - file2|column -t|sed -e 's/ID/ ID/' -e '2,${s/ /    /}'

Tested on my console:

kent$  sed '/CREDIT/d' file1|awk 'FNR==NR{ if($0~/Java/){j++;o=0;next;}
        if($0~/Operating/){o++;j=0;next;}
        if(j){java[$1]=$2}
        if(o){os[$1]=$2}
}NR>FNR{OFS=" ";
        if(FNR==1){sub(/ /,"");print;}
        else{$2=" "
        $3=($1 in java)?java[$1]:"-";
        $4=($1 in os)?os[$1]:"-";
        print $0;
        }

}' - file2|column -t|sed -e 's/ID/ ID/' -e '2,${s/ /    /}'
STUDENT ID  Java  Operating  System  GPA
200138        71    72
200345        88    77
287136        68    86
300056        -     78
300126        78    -

The logic is actually fairly simple; much of the code is only there to produce output in the same format as your example.
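
To show the logic on its own, here is a minimal sketch that drops the sed and column steps and formats with printf instead. It is not the answerer's code: the wrapper name merge_scores is hypothetical, the two course names are hard-coded from the sample input, and the column widths are chosen to line up with the sample header in file2.

```shell
# merge_scores: hypothetical wrapper. $1 = score file (file1), $2 = roster (file2).
merge_scores() {
    awk '
        # First file: remember the current course, store each score under (id, course).
        NR == FNR {
            if ($0 ~ /^COURSE NAME:/) { sub(/^COURSE NAME: */, ""); course = $0 }
            else if ($1 ~ /^[0-9]+$/) { score[$1, course] = $2 }
            next
        }
        # Second file: print the header line as it is.
        FNR == 1 { print; next }
        # Every non-empty roster line: look up both courses, "-" when there is no score.
        NF {
            java = (($1, "Java") in score) ? score[$1, "Java"] : "-"
            os   = (($1, "Operating System") in score) ? score[$1, "Operating System"] : "-"
            printf "%-18s%-12s%s\n", $1, java, os
        }
    ' "$1" "$2"
}
```

Called as merge_scores file1 file2, it prints the header from file2 followed by one row per student ID, in the order the IDs appear in file2. The GPA column is left empty, as in the requested output.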