Here is the input I generated; it shows the versions of Jany's and Marco's courses at different times.
on 10:00 the course of jany 1 is :
course:theory:nothing
course:applicaton:onehour
on 10:00 the course of jany 2 is :
course:theory:math
course:applicaton:twohour
on 10:00 the course of Marco 1 is :
course:theory:geo
course:applicaton:halfhour
on 10:00 the course of Marco 2 is :
course:theory:history
course:applicaton:nothing
on 14:00 the course of jany 1 is :
course:theory:nothing
course:applicaton:twohours
on 14:00 the course of jany 2 is :
course:theory:music
course:applicaton:twohours
on 14:00 the course of Marco 1 is :
course:theory:programmation
course:applicaton:onehours
on 14:00 the course of Marco 2 is :
course:theory:philosophy
course:applicaton:nothing
Using awk, I managed to sort it:
awk -F '[\ :]' '/the course of/{h=$2;m=$3} /theory/{print " "h":"m" theory:"$3}' f.txt
awk -F '[\ :]' '/the course of/{h=$2;m=$3} /application/{print " "h":"m" application :"$3}' f.txt
10:00 theory:nothing
14:00 theory:nothing
10:00 application:onehour
14:00 application:twohours
Now I would like to improve the filter by adding the name (jany, Marco) and the version (1 or 2), as shown below.
Jany 1,10:00,14:00
theory,nothing,nothing
application,onehour,twohour
Jany 2,10:00,14:00
theory,math,music
application,twohour,twohour
Marco 1,10:00,14:00
theory,geo,programmation
application,halfhour,onehour
Marco 2,10:00,14:00
theory,history,philosoohy
application,nothing,nothing
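I am stuck on how to extract the name and number, and on how to get the information about the corresponding courses into a sorted and filtered table.
Answer 0 (score: 2)
Using GNU awk for true multi-dimensional arrays and sorted_in: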
$ cat tst.awk
BEGIN{ RS=""; FS="[[:space:]:]+" }
{
    for (i=11; i<=NF; i+=3) {
        sched[$7" "$8][$2":"$3][$i] = $(i+1)
        courses[$i]
    }
}
END {
    PROCINFO["sorted_in"] = "@ind_str_asc"
    for (name in sched) {
        printf "%s", name
        for (time in sched[name]) {
            printf ",%s", time
        }
        print ""
        for (course in courses) {
            printf "%s", course
            for (time in sched[name]) {
                printf ",%s", sched[name][time][course]
            }
            print ""
        }
        print ""
    }
}
$ gawk -f tst.awk file
Marco 1,10:00,14:00
applicaton,halfhour,onehours
theory,geo,programmation
Marco 2,10:00,14:00
applicaton,nothing,nothing
theory,history,philosophy
jany 1,10:00,14:00
applicaton,onehour,twohours
theory,nothing,nothing
jany 2,10:00,14:00
applicaton,twohour,twohours
theory,math,music
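It does not produce exactly the expected output you posted, but I think that is because the expected output you posted is wrong (e.g. compare the output for jany 1 application at 14:00 with the input: the input is twohours, which is what my script produces, but you say the expected output is halfhour).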
Answer 1 (score: 1)
Try this:
BEGIN {
    # set records separated by empty lines
    RS=""
    # set fields separated by newline, each record has 3 fields
    FS="\n"
}
{
    # remove undesired parts of every first line of a record
    sub("the course of ", "", $1)
    sub(" is :", "", $1)
    sub("on ", "", $1)
    # now store the rest in time and course
    time=$1
    course=$1
    # remove time from string to extract the course title
    sub("^[^ ]* ", "", course)
    # remove everything after the first space to retrieve the time
    # (this avoids leaving a trailing space in time)
    sub(" .*", "", time)
    # get theory info from second line per record
    sub("course:theory:", "", $2)
    # get application info from third line
    sub("course:applicaton:", "", $3)
    # if new course
    if (! (course in header)) {
        # save header information (first words of each line in output)
        header[course] = course
        theory[course] = "theory"
        app[course] = "application"
    }
    # append the relevant info to the output strings
    header[course] = header[course] "," time
    theory[course] = theory[course] "," $2
    app[course] = app[course] "," $3
}
END {
    # now for each course found
    for (key in header) {
        # print the strings constructed
        print header[key]
        print theory[key]
        print app[key]
        print ""
    }
}
I hope the comments are self-explanatory; if you have any questions about the script, feel free to ask.