awk: conditionally change the value of a field based on the value of another column

Date: 2018-05-07 03:25:25

Tags: unix if-statement awk conditional text-processing

I have a table, snp150Common.txt, in which the second and third fields ($2 and $3) may or may not be equal.

If they are equal, I want $2 to become $2-1, so that:

chr1    10177   10177   rs367896724 -   -   -/C insertion   near-gene-5
chr1    10352   10352   rs555500075 -   -   -/A insertion   near-gene-5
chr1    11007   11008   rs575272151 C   C   C/G single      near-gene-5
chr1    11011   11012   rs544419019 C   C   C/G single      near-gene-5
chr1    13109   13110   rs540538026 G   G   A/G single      intron
chr1    13115   13116   rs62635286  T   T   G/T single      intron
chr1    13117   13118   rs62028691  A   A   C/T single      intron
chr1    13272   13273   rs531730856 G   G   C/G single      ncRNA
chr1    14463   14464   rs546169444 A   A   A/T single      near-gene-3,ncRNA

becomes:

chr1    10176   10177   rs367896724 -   -   -/C insertion   near-gene-5
chr1    10351   10352   rs555500075 -   -   -/A insertion   near-gene-5
chr1    11007   11008   rs575272151 C   C   C/G single      near-gene-5
chr1    11011   11012   rs544419019 C   C   C/G single      near-gene-5
chr1    13109   13110   rs540538026 G   G   A/G single      intron
chr1    13115   13116   rs62635286  T   T   G/T single      intron
chr1    13117   13118   rs62028691  A   A   C/T single      intron
chr1    13272   13273   rs531730856 G   G   C/G single      ncRNA
chr1    14463   14464   rs546169444 A   A   A/T single      near-gene-3,ncRNA

My current command, adapted from https://askubuntu.com/a/312843, leaves the rows unchanged:

zcat < snp150/snp150Common.txt.gz | head | awk '{ if ($2 == $3) $2=$2-1; print $0 }' | cut -f 2,3,4,5,8,9,10,12,16

Any help is much appreciated.

1 answer:

Answer 0: (score: 1)

This answer is based on pure speculation about the format of the source file:

$ zcat snp150/snp150Common.txt.gz |
  awk '
  BEGIN { OFS="\t" }                       # field separators are most likely tabs
  {
      if ($3 == $4)                        # judging by the cut list, these are the fields to compare
          $3=$3-1
      print $2,$3,$4,$5,$8,$9,$10,$12,$16  # ... and these are the fields to print
  }
  NR==10 { exit }'                         # this replaces head
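
To try the same idea without the real file, here is a minimal sketch on three made-up lines. The layout (an extra leading column, so that the start and end positions land in $3 and $4) is the same guess the answer makes and is not confirmed by the question. The sample values, including the leading 585, are invented for illustration, and only five columns are used to keep it short.

```shell
# Hypothetical sample: leading column, chrom, start, end, name (tab-separated).
sample=$(printf '585\tchr1\t10177\t10177\trs367896724\n585\tchr1\t10352\t10352\trs555500075\n585\tchr1\t11007\t11008\trs575272151\n')

result=$(printf '%s\n' "$sample" | awk '
BEGIN { OFS = "\t" }          # write tab-separated output
$3 == $4 { $3 = $3 - 1 }      # start equals end: move start back by one
{ print $2, $3, $4, $5 }      # drop the assumed leading column
')

printf '%s\n' "$result"
```

The first two rows should come out with 10176 and 10351 as the start position, while the third row, whose start and end already differ, is printed as it was.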

Remember: practicing (anything other than sucking) makes you suck less.
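
A side note on why OFS is set to a tab: when awk assigns to a field, it rebuilds the record using OFS, which is a single space by default. A later cut -f, which splits on tabs, would then no longer find the columns. The sketch below shows this general awk behavior on one invented line; it is not a claim about what actually happened in the pipeline from the question.

```shell
# One made-up, tab-separated line.
line=$(printf 'chr1\t10177\t10177\trs367896724')

# Default OFS: the modified record is rejoined with single spaces.
with_spaces=$(printf '%s\n' "$line" | awk '{ $2 = $2 - 1; print }')

# OFS set to a tab: the modified record stays tab-separated.
with_tabs=$(printf '%s\n' "$line" | awk 'BEGIN { OFS = "\t" } { $2 = $2 - 1; print }')

printf '%s\n' "$with_spaces" "$with_tabs"
```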