I have a file with several lines:
name1 a1 b3 c6 a3 b4 c9
name2 a7 b8 c7 a9 b10 c13
name3 a12 b9 c8
name4 a4 b34 c19 a7 b2 c10 a3 b5 c67
I need to break these lines after the letters repeat (i.e. after each a, b, c group), while keeping the original name (field 1):
name1 a1 b3 c6
name1 a3 b4 c9
name2 a7 b8 c7
name2 a9 b10 c13
name3 a12 b9 c8
name4 a4 b34 c19
name4 a7 b2 c10
name4 a3 b5 c67
I tried the following:
awk -F"\t" '{ for (i=2;i<=NF;i++) print $1"\t"$i }' file
but that prints every field on a line of its own. Is there a way to group them?
Thanks.
Answer 0 (score: 0)
@starter5: Try:
awk 'BEGIN{V["a"];V["b"];V["c"]} /name/{R=$0;next} {Q=$0;gsub(/[[:digit:]]/,"",Q)} (Q in V){if(!W[Q]++){A++}} $0{if(A==1 && $0 && R){$0=R OFS $0};printf("%s%s",$0,(A==3?"\n":OFS));if(A==3){A="";delete W}}' RS='[ +|\n]' Input_file
Here is the same solution in non-one-liner form.
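awk 'BEGIN{
  V["a"];                      # the three letters that make up one group
  V["b"];
  V["c"]
}
/name/{                        # a name token: remember it and move on
  R=$0;
  next
}
{
  Q=$0;
  gsub(/[[:digit:]]/,"",Q)     # strip the digits, e.g. "b34" -> "b"
}
(Q in V){
  if(!W[Q]++){                 # first time this letter shows up in the current group
    A++
  }
}
$0 {
  if(A==1 && $0 && R){         # first token of a group: put the name in front
    $0=R OFS $0
  };
  printf("%s%s",$0,(A==3?"\n":OFS));   # end the line once a, b and c have all been seen
  if(A==3) {                   # reset the state for the next group
    A="";
    delete W
  }
}
' RS='[ +|\n]' Input_file      # regex RS (gawk/mawk): every token becomes its own record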
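So let's say we have the following Input_file (I changed the last line) to test the case where a, b, c do not come in sequence: the line will not be broken until all three of them have been found. Take a look and let me know.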
cat Input_file
name1 a1 b3 c6 a3 b4 c9
name2 a7 b8 c7 a9 b10 c13
name3 a12 b9 c8
name4 a4 b34 a19 a7 b2 c10 a3 b5 c67
The output will be as follows.
name1 a1 b3 c6
name1 a3 b4 c9
name2 a7 b8 c7
name2 a9 b10 c13
name3 a12 b9 c8
name4 a4 b34 a19 a7 b2 c10
name4 a3 b5 c67
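For comparison, here is a minimal line-oriented sketch of the same rule (close a group only once a, b and c have all been seen). It processes each input line as an ordinary record instead of redefining RS, so it does not rely on a regex RS. The function name split_after_all_seen is made up for illustration, and the sketch assumes whitespace-separated fields with the name in field 1.

```shell
# Hypothetical helper: split each line into groups that end once
# a, b and c have all appeared, repeating the name (field 1) per group.
split_after_all_seen() {
    awk '
    {
        line = $1; seen_count = 0; split("", seen)   # reset state for this record
        for (i = 2; i <= NF; i++) {
            letter = $i
            gsub(/[0-9]/, "", letter)                # strip digits: "b34" -> "b"
            if ((letter == "a" || letter == "b" || letter == "c") && !(letter in seen)) {
                seen[letter]                         # remember the letter
                seen_count++
            }
            line = line OFS $i
            if (seen_count == 3) {                   # group complete: print and restart
                print line
                line = $1; seen_count = 0; split("", seen)
            }
        }
        if (line != $1) print line                   # flush an incomplete trailing group
    }' "$@"
}
```

Called as split_after_all_seen Input_file on the test file above, it should reproduce the output just shown, including the long name4 a4 b34 a19 a7 b2 c10 line.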
Answer 1 (score: 0)
I need to break these lines after the letters repeat (i.e. after each a, b, c), while keeping the original name (field 1):
Input
$ cat file
name1 a1 b3 c6 a3 b4 c9
name2 a7 b8 c7 a9 b10 c13
name3 a12 b9 c8
name4 a4 b34 c19 a7 b2 c10 a3 b5 c67
Output
$ awk 'function _p(){print $1,s; s=""; split("",p)}{for(i=2; i<=NF; i++){ c=substr($i,1,1);if(c in p)_p(); s = (s?s OFS:"") $i; p[c] }_p()}' file
name1 a1 b3 c6
name1 a3 b4 c9
name2 a7 b8 c7
name2 a9 b10 c13
name3 a12 b9 c8
name4 a4 b34 c19
name4 a7 b2 c10
name4 a3 b5 c67
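More readable version
awk '
function _p()
{
    print $1,s;              # print the name followed by the collected group
    s="";
    split("",p)              # clear the set of letters seen in this group
}
{
    for(i=2; i<=NF; i++)
    {
        c=substr($i,1,1);    # leading letter of the field
        if(c in p)_p();      # letter already seen: the group is complete
        s = (s?s OFS:"") $i;
        p[c]                 # mark the letter as seen
    }
    _p()                     # print the last group of the line
}
' file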
OR
$ awk 'function _p(){print $1,s; s=p=""}{for(i=2; i<=NF; i++){ c=substr($i,1,1); if(c==p)_p(); s = (s?s OFS:"") $i; if(!p)p=c }_p()}' file
name1 a1 b3 c6
name1 a3 b4 c9
name2 a7 b8 c7
name2 a9 b10 c13
name3 a12 b9 c8
name4 a4 b34 c19
name4 a7 b2 c10
name4 a3 b5 c67
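More readable version
awk '
function _p()
{
    print $1,s;              # print the name followed by the collected group
    s=p=""                   # forget the group and its first letter
}
{
    for(i=2; i<=NF; i++)
    {
        c=substr($i,1,1);    # leading letter of the field
        if(c==p)_p();        # first letter of the group is back: the group is complete
        s = (s?s OFS:"") $i;
        if(!p)p=c            # remember the first letter of the group
    }
    _p()                     # print the last group of the line
}' file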
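Both variants give the same result on the sample input, but they differ when a letter other than the first one of a group repeats. The sketch below wraps the two scripts in shell functions to show this. The function names and the emit_group helper are made up for illustration; the awk logic is meant to match the two scripts above.

```shell
# Variant 1: start a new group whenever ANY letter of the current group repeats.
split_on_any_repeat() {
    awk '
    function emit_group() { print $1, s; s = ""; split("", seen) }
    {
        for (i = 2; i <= NF; i++) {
            c = substr($i, 1, 1)
            if (c in seen) emit_group()
            s = (s ? s OFS : "") $i
            seen[c]
        }
        emit_group()
    }' "$@"
}

# Variant 2: start a new group only when the FIRST letter of the group comes back.
split_on_first_letter() {
    awk '
    function emit_group() { print $1, s; s = first = "" }
    {
        for (i = 2; i <= NF; i++) {
            c = substr($i, 1, 1)
            if (c == first) emit_group()
            s = (s ? s OFS : "") $i
            if (first == "") first = c
        }
        emit_group()
    }' "$@"
}

# "b" repeats inside the first group:
printf 'x a1 b2 b3 c4 a5\n' | split_on_any_repeat     # x a1 b2 / x b3 c4 a5
printf 'x a1 b2 b3 c4 a5\n' | split_on_first_letter   # x a1 b2 b3 c4 / x a5
```

Which behaviour is wanted depends on the data: the first variant treats any repetition as a boundary, the second tolerates repeats as long as the leading letter does not come back.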
Answer 2 (score: 0)
{                                            # for any record
    printf "%s", $1                          # print name
    c = substr($2, 1, 1)                     # first letter of group
    printf "%s%s", OFS, $2                   # first part of first group
    for (i = 3; i <= NF; i++) {              # for all the rest fields
        if (index($i, c) != 1)               # if next group has not started
            printf "%s%s", OFS, $i           # print this part on same line
        else                                 # otherwise
            printf "%s%s%s%s", ORS, $1, OFS, $i   # print name and this part on next line
    }                                        # done for all fields
    printf "%s", ORS                         # move to next line
}                                            # done for this record
This does not work if a letter repeats within a group. For example, it does not work for a b a c groups, which could look like: a3 b5 a4 c6 a5 b6 a0 b9
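If every group has the same known length, one possible workaround is to split by position instead of by letter. This is a different technique from the script above, and the fixed group length is an assumption that the question does not state. In the rough sketch below, split_every_n and its first argument n (fields per group) are hypothetical names.

```shell
# Hypothetical helper: split each line into groups of n fields,
# repeating the name (field 1) in front of every group.
# Usage: split_every_n N [file...]
split_every_n() {
    n=$1; shift
    awk -v n="$n" '
    {
        for (i = 2; i <= NF; i++) {
            if ((i - 2) % n == 0) line = $1               # new group starts with the name
            line = line OFS $i
            if ((i - 1) % n == 0 || i == NF) print line   # group full, or end of record
        }
    }' "$@"
}

# Groups of four fields, with "a" repeating inside each group:
printf 'name a3 b5 a4 c6 a5 b6 a0 b9\n' | split_every_n 4
# name a3 b5 a4 c6
# name a5 b6 a0 b9
```

With n set to 3 this should also handle the a, b, c groups from the question, but it will silently mis-group lines whose groups are not all the same size.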