Question

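I have a bunch of different files. Every file contains a column titled ID, but not necessarily at the same position in each file. I have a function that I want to apply to the ID column in all of the files, changing its values to NEWID.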

I know I can do this quite simply if I pass in the column number of ID. Say it is the 3rd column of a 5-column file; then something like:

awk -v column=$COLNUMBER '{print $1, $2, FUNCTION($column), $4, $5}' FILE

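However, that gets very tedious if my files all have hundreds of columns and ID sits at an arbitrary position in each one. I am looking for a way to do something like: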

awk -v column=$COLNUMBER '{print #All columns before $column, FUNCTION($column), #All columns after $column}' FILE

I have tried various loops but have not gotten any of them to work.

Answer 1

Simple:

$ awk -v column="$COLNUMBER" '{ $column = FUNCTION($column); print }' "$FILE"
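
Since the question says the column is titled ID, the column number could also be looked up from the header instead of being passed in. The following is only a sketch under several assumptions: the header is the first line of each file, fields are whitespace-separated, toupper() stands in for FUNCTION, and apply_to_id is a hypothetical name. Note that assigning to a field makes awk rebuild the line with single spaces between fields.

```shell
# apply_to_id is a hypothetical helper name; toupper() stands in for FUNCTION.
# Reads the named files, or standard input when no file is given.
apply_to_id() {
    awk '
    FNR == 1 {
        # Header line of each file: find which field is titled ID
        column = 0
        for (i = 1; i <= NF; i++) if ($i == "ID") column = i
        print          # header is printed unchanged
        next
    }
    column { $column = toupper($column) }   # rewrite only the ID field
    { print }
    ' "$@"
}
```

It could then be called as `apply_to_id file1 file2`, or fed through a pipe, e.g. `printf 'name ID age\nbob x7 42\n' | apply_to_id`.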

Answer 2

To preserve the spacing between fields:

$ cat file
a b   c      d e  f
$ gawk -v col=3 '{print gensub("([[:space:]]*([^[:space:]]+[[:space:]]+){" col-1 "})[^[:space:]]+","\\1FUNCTION($col)","")}' file
a b   FUNCTION($col)      d e  f

Or, if you actually want the column's value to appear as the argument passed to FUNCTION():

$ gawk -v col=3 '{print gensub("([[:space:]]*([^[:space:]]+[[:space:]]+){" col-1 "})([^[:space:]]+)","\\1FUNCTION(\\3)","")}' file
a b   FUNCTION(c)      d e  f

$ gawk -v col=4 '{print gensub("([[:space:]]*([^[:space:]]+[[:space:]]+){" col-1 "})([^[:space:]]+)","\\1FUNCTION(\\3)","")}' file
a b   c      FUNCTION(d) e  f

Or:

$ gawk -v col=3 '{print gensub("([[:space:]]*([^[:space:]]+[[:space:]]+){" col-1 "})[^[:space:]]+","\\1FUNCTION($"col")","")}' file
a b   FUNCTION($3)      d e  f

$ gawk -v col=4 '{print gensub("([[:space:]]*([^[:space:]]+[[:space:]]+){" col-1 "})[^[:space:]]+","\\1FUNCTION($"col")","")}' file
a b   c      FUNCTION($4) e  f

The above uses GNU awk for gensub(); in other awks you can do the same thing with multiple sub() calls or with match() + substr().
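
For awks without gensub(), the match() + substr() route might look roughly like the sketch below. It is an illustration rather than a tested drop-in: replace_col is a hypothetical name, toupper() stands in for FUNCTION, and fields are assumed to be separated by spaces or tabs only.

```shell
# replace_col is a hypothetical helper: $1 is the 1-based column number,
# input comes from standard input. toupper() stands in for FUNCTION.
replace_col() {
    awk -v col="$1" '
    {
        rest = $0
        head = ""
        # Move the first col-1 fields, with their leading whitespace, into head
        for (i = 1; i < col; i++) {
            if (match(rest, /^[ \t]*[^ \t]+/)) {
                head = head substr(rest, 1, RLENGTH)
                rest = substr(rest, RLENGTH + 1)
            }
        }
        # rest now begins with the whitespace in front of the target field
        if (match(rest, /[^ \t]+/)) {
            head  = head substr(rest, 1, RSTART - 1)
            field = substr(rest, RSTART, RLENGTH)
            rest  = substr(rest, RSTART + RLENGTH)
            print head toupper(field) rest
        } else {
            print    # fewer than col fields: leave the line unchanged
        }
    }'
}
```

For example, `replace_col 4 < file` would uppercase only the fourth field and leave the original spacing alone.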

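Judging from the other answers, you may actually want to call FUNCTION() on the field's value rather than print the text FUNCTION(field). If that is the case, this is how you would do it: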

$ gawk -v col=4 '{print gensub("([[:space:]]*([^[:space:]]+[[:space:]]+){" col-1 "})[^[:space:]]+","\\1"FUNCTION($col),"")}' file

e.g. if FUNCTION is toupper():

$ gawk -v col=4 '{print gensub("([[:space:]]*([^[:space:]]+[[:space:]]+){" col-1 "})[^[:space:]]+","\\1"toupper($col),"")}' file
a b   c      D e  f

Applying a function to only one column, with variable position
