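I have the dataset below and need to transpose it. I've been struggling with the script, so any help would be appreciated. All columns and values are dynamic.
File format: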
ID FieldName FieldValue
1 Rooms Required? Yes
1 Country of Meeting US
2 Rooms Required?
2 Country of Meeting
3 Rooms Required? Yes
3 Country of Meeting US
4 Rooms Required? No
4 Country of Meeting BL
Required output:
ID Rooms Required? Country of Meeting
1 Yes US
2
3 Yes US
4 No BL
Please help.
Answer 0: (score: 0)
Here is the general idea using join (with bash as the shell, since it relies on process substitution):
$ echo ID Rooms Country; \
join -j1 -o '0 1.4 2.5' -a1 -a2 -e- <(grep -F Rooms data.txt) <(grep -F Country data.txt)
ID Rooms Country
1 Yes US
2 - -
3 Yes US
4 No BL
Adapt it to your needs.
Answer 1: (score: 0)
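Assuming your fields are separated by tabs ('\t'), a pure awk solution looks like this (it needs GNU awk 4.0 or later, because it uses arrays of arrays and PROCINFO["sorted_in"]):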
awk 'BEGIN { FS = "\t"; PROCINFO["sorted_in"] = "@ind_num_asc" } { if ( $1 !~ /^[0-9]+$/ ) next; A[$1][$2] = $3; H[$2] } END { printf "ID"; for (h in H) printf "\t%s", h; for (i in A) { printf "\n\n%s", i; for (j in A[i]) printf "\t%s", A[i][j] } print "\n" }' filename
And broken down:
awk 'BEGIN {
    FS = "\t"                                  # Set the field separator to a tab
    PROCINFO["sorted_in"] = "@ind_num_asc"     # Traverse arrays in ascending numeric index order
}
{
    if ( $1 !~ /^[0-9]+$/ )                    # Skip all rows without a numeric ID
        next
    A[$1][$2] = $3                             # Store the value in a multi-dimensional array
    H[$2]                                      # Store the header name
}
END {
    printf "ID"
    for (h in H)                               # Print all headers found
        printf "\t%s", h                       # Pass data as an argument, never as the format string
    for (i in A) {                             # Print each record with its corresponding values
        printf "\n\n%s", i
        for (j in A[i])
            printf "\t%s", A[i][j]
    }
    print "\n"
}' filename
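Let me know if you need further explanation. It works with any number of fields, in whatever order they appear. If the records don't all have the same fields, your output may look ragged.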
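If the ragged output is a problem, one possible variation is to loop over the collected header names for every ID instead of over the fields that record happens to have, so a missing field becomes an empty cell. The following is only a sketch under a few assumptions: the input is tab-separated as above, rows and columns should come out in first-seen order, and transpose is just a name I made up for the wrapper function. It sticks to POSIX awk features, so it should not need gawk.

```shell
# transpose: hypothetical helper name; reads tab-separated ID / FieldName / FieldValue rows
# from the named files (or from standard input when no file is given).
transpose() {
    awk '
    BEGIN { FS = OFS = "\t" }
    $1 !~ /^[0-9]+$/ { next }                  # skip the header and any row without a numeric ID
    {
        if (!($1 in seen_id))   { seen_id[$1] = 1;   ids[++n_ids] = $1 }      # IDs in first-seen order
        if (!($2 in seen_name)) { seen_name[$2] = 1; names[++n_names] = $2 }  # field names likewise
        value[$1, $2] = $3                     # SUBSEP-joined key works in any POSIX awk
    }
    END {
        line = "ID"
        for (c = 1; c <= n_names; c++)
            line = line OFS names[c]
        print line
        for (r = 1; r <= n_ids; r++) {
            line = ids[r]
            for (c = 1; c <= n_names; c++)
                line = line OFS value[ids[r], names[c]]   # missing pairs yield an empty cell
            print line
        }
    }' "$@"
}
```

Call it as transpose filename. Every row then has the same number of tab-separated columns, at the cost of holding the whole table in memory until the END block runs.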