Here are the values I need to change:
If the first column is 2 => 1
If the first column is 8 => 2
If the first column is 16 => 3
CHR SNP BP A1 TEST NMISS BETA STAT P
2 rs10173732 31404 A ADD 2607 -0.02162 -1.552 0.1207
2 rs10173732 31404 A COV1 2607 0.2659 24.15 1.849e-116
2 rs11684864 2547285 G ADD 2596 -0.009581 -0.6387 0.5231
2 rs11684864 2547285 G COV1 2596 0.2672 24.18 1.212e-116
2 rs11684864 2547285 G COV2 2596 0.004941 9.564 2.548e-21
8 rs3826201 88651817 T COV3 2576 -0.0186 -15.7 4.335e-53
16 rs8047319 88684276 C ADD 2538 0.01115 1.271 0.204
16 rs8047319 88684276 C COV1 2538 0.2632 23.73 1.402e-112
16 rs8047319 88684276 C COV2 2538 0.005039 9.715 6.276e-22
16 rs8047319 88684276 C COV3 2538 -0.01891 -15.9 2.583e-54
However, this command is not suitable: it changes the indentation, and the rows with 8 are not converted:
awk '{ if ( $1 == 2 ) { $1 = 1 } else if ( $1 == 8 ) { $1 == 2 } else if ( $1 == 16 ) { $1 = 3 }; print}' TEST > TESTnew
Output:
CHR SNP BP A1 TEST NMISS BETA STAT P
1 rs10173732 31404 A ADD 2607 -0.02162 -1.552 0.1207
1 rs10173732 31404 A COV1 2607 0.2659 24.15 1.849e-116
1 rs11684864 2547285 G ADD 2596 -0.009581 -0.6387 0.5231
1 rs11684864 2547285 G COV1 2596 0.2672 24.18 1.212e-116
1 rs11684864 2547285 G COV2 2596 0.004941 9.564 2.548e-21
8 rs3826201 88651817 T COV3 2576 -0.0186 -15.7 4.335e-53
3 rs8047319 88684276 C ADD 2538 0.01115 1.271 0.204
3 rs8047319 88684276 C COV1 2538 0.2632 23.73 1.402e-112
3 rs8047319 88684276 C COV2 2538 0.005039 9.715 6.276e-22
3 rs8047319 88684276 C COV3 2538 -0.01891 -15.9 2.583e-54
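How can this be made more general (so that it works even if a particular row is indented differently, and without changing the indentation of the original file)?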
Answer 0 (score: 2)
With GNU awk for the 3rd arg to match():
$ awk 'BEGIN { m[2]=1; m[8]=2; m[16]=3 }
$1 in m { match($0,/(\s*\S+)(.*)/,a); $0=sprintf("%*s",length(a[1]),m[$1]) a[2] }
1' file
CHR SNP BP A1 TEST NMISS BETA STAT P
1 rs10173732 31404 A ADD 2607 -0.02162 -1.552 0.1207
1 rs10173732 31404 A COV1 2607 0.2659 24.15 1.849e-116
1 rs11684864 2547285 G ADD 2596 -0.009581 -0.6387 0.5231
1 rs11684864 2547285 G COV1 2596 0.2672 24.18 1.212e-116
1 rs11684864 2547285 G COV2 2596 0.004941 9.564 2.548e-21
2 rs3826201 88651817 T COV3 2576 -0.0186 -15.7 4.335e-53
3 rs8047319 88684276 C ADD 2538 0.01115 1.271 0.204
3 rs8047319 88684276 C COV1 2538 0.2632 23.73 1.402e-112
3 rs8047319 88684276 C COV2 2538 0.005039 9.715 6.276e-22
3 rs8047319 88684276 C COV3 2538 -0.01891 -15.9 2.583e-54
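As background on why this keeps the spacing: assigning to a field such as $1 makes awk rebuild $0 with the fields joined by OFS (a single space by default), whereas changing $0 as a whole leaves every other character of the line alone. The rows with 8 in the question have a separate cause: that branch uses $1 == 2, a comparison, where the assignment $1 = 2 was intended. A minimal demonstration of the first point, using a made-up one-line input rather than the real file:

```shell
# Hypothetical input line with extra spacing, used only for illustration.
line='  8   rs3826201'

# Assigning to a field makes awk rebuild $0 using OFS (one space by default).
rebuilt=$(printf '%s\n' "$line" | awk '{ $1 = 2; print }')

# Editing $0 directly leaves the rest of the line untouched.
kept=$(printf '%s\n' "$line" | awk '{ sub(/8/, "2"); print }')

printf '[%s]\n[%s]\n' "$rebuilt" "$kept"
```

The first bracketed line has lost its leading and repeated spaces; the second keeps them.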
If the 1, 2 and 3 values are just an incrementing index:
$ awk 'BEGIN{split("2 8 16",t); for (i in t) m[t[i]]=i} $1 in m{match($0,/(\s*\S+)(.*)/,a); $0=sprintf("%*s",length(a[1]),m[$1]) a[2]} 1' file
CHR SNP BP A1 TEST NMISS BETA STAT P
1 rs10173732 31404 A ADD 2607 -0.02162 -1.552 0.1207
1 rs10173732 31404 A COV1 2607 0.2659 24.15 1.849e-116
1 rs11684864 2547285 G ADD 2596 -0.009581 -0.6387 0.5231
1 rs11684864 2547285 G COV1 2596 0.2672 24.18 1.212e-116
1 rs11684864 2547285 G COV2 2596 0.004941 9.564 2.548e-21
2 rs3826201 88651817 T COV3 2576 -0.0186 -15.7 4.335e-53
3 rs8047319 88684276 C ADD 2538 0.01115 1.271 0.204
3 rs8047319 88684276 C COV1 2538 0.2632 23.73 1.402e-112
3 rs8047319 88684276 C COV2 2538 0.005039 9.715 6.276e-22
3 rs8047319 88684276 C COV3 2538 -0.01891 -15.9 2.583e-54
Or, if you don't need to specify the existing $1 values at all:
$ awk 'NR>1{ if ($1!=p) {c++; p=$1} match($0,/(\s*\S+)(.*)/,a); $0=sprintf("%*s",length(a[1]),c) a[2]} 1' file
CHR SNP BP A1 TEST NMISS BETA STAT P
1 rs10173732 31404 A ADD 2607 -0.02162 -1.552 0.1207
1 rs10173732 31404 A COV1 2607 0.2659 24.15 1.849e-116
1 rs11684864 2547285 G ADD 2596 -0.009581 -0.6387 0.5231
1 rs11684864 2547285 G COV1 2596 0.2672 24.18 1.212e-116
1 rs11684864 2547285 G COV2 2596 0.004941 9.564 2.548e-21
2 rs3826201 88651817 T COV3 2576 -0.0186 -15.7 4.335e-53
3 rs8047319 88684276 C ADD 2538 0.01115 1.271 0.204
3 rs8047319 88684276 C COV1 2538 0.2632 23.73 1.402e-112
3 rs8047319 88684276 C COV2 2538 0.005039 9.715 6.276e-22
3 rs8047319 88684276 C COV3 2538 -0.01891 -15.9 2.583e-54
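If GNU awk is not available, the same idea can probably be expressed with the two-argument match(), which sets RSTART and RLENGTH in any POSIX awk. The following is only a sketch under that assumption: the function name remap_chr is made up for illustration, and the mapping is the one from the question.

```shell
# remap_chr FILE: hypothetical helper name; prints FILE with column 1 remapped.
remap_chr() {
    awk 'BEGIN { m[2]=1; m[8]=2; m[16]=3 }
    $1 in m {
        # Match leading blanks plus the first field; this sets RLENGTH.
        match($0, /^[ \t]*[^ \t]+/)
        # Right-align the new value in the same width, then append the rest.
        $0 = sprintf("%" RLENGTH "s", m[$1]) substr($0, RLENGTH + 1)
    }
    1' "$1"
}
```

It would be called as remap_chr TEST > TESTnew. Because the new value is right-aligned in the width of the old one, 16 becomes a space followed by 3, so the columns after it do not shift.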
Answer 1 (score: 1)
You can do the substitutions in sed:
sed 's/^\( *\)2/\11/ ; s/^\( *\)8/\12/ ; s/^\( *\)16/\1 3/'
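This does all three substitutions in one script. ^\( *\) captures any spaces between the start of the line and the number, and \1 puts them back so the indentation is preserved. For the same reason, I replace 16 with " 3" (a space followed by 3), so the width of the column stays the same.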
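One caveat: as written, the patterns would also match a first column that merely starts with 2, 8 or 16 (for example 22). A possible tightening, assuming the columns are separated by spaces rather than tabs, is to require the space that follows the number. The function name remap_chr_sed is hypothetical:

```shell
# remap_chr_sed FILE: hypothetical helper name; prints FILE with column 1 remapped.
remap_chr_sed() {
    # Order matters: 2 -> 1 runs before 8 -> 2, so a freshly written 2
    # is never remapped a second time.
    sed -e 's/^\( *\)2 /\11 /' \
        -e 's/^\( *\)8 /\12 /' \
        -e 's/^\( *\)16 /\1 3 /' "$1"
}
```

It would be called as remap_chr_sed TEST > TESTnew. A row whose first column is 22 is left unchanged, because the 2 is not immediately followed by a space.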