Question

Using these examples:

File 1:

      rs12124819     1        0.020242          776546 A G
      rs28765502     1        0.022137          832918 T C
       rs7419119     1        0.022518          842013 T G
        rs950122     1        0.022720          846864 G C

File 2:

1_752566    1   0   752566  G   A
1_776546    1   0   776546  A   G
1_832918    1   0   832918  T   C
1_842013    1   0   842013  T   G

I'm trying to replace the first column of file2 with the corresponding first column of file1 wherever their 4th columns are equal.

Expected output:

rs12124819  1   0   752566  G   A
rs28765502  1   0   776546  A   G
rs7419119   1   0   832918  T   C
rs950122    1   0   842013  T   G

I tried creating 2 arrays, but I can't work out how to use them correctly:

awk 'FNR==NR{a[$4],b[$1];next} ($4) in a{$1=b[FNR]}1' file1 file2  > out.txt

Thanks a lot!

Answer 1

With your shown samples, could you please try the following? Written and tested in GNU awk.

awk 'FNR==NR{a[$4]=$1;next} ($4 in a){$1=a[$4]} 1' file1 file2

Explanation: a detailed, line-by-line explanation of the above.

awk '            ##Starting awk program from here.
FNR==NR{         ##Checking condition if FNR==NR which will be TRUE when file1 is being read.
  a[$4]=$1       ##Creating array a whose index is $4 and value is $1.
  next           ##next will skip all further statements from here.
}
($4 in a){       ##Checking condition if 4th field is present in a then do following.
  $1=a[$4]       ##Setting value of 1st field of file2 as array a value with index of 4th column
}
1                ##1 will print edited/non-edited line.
' file1 file2    ##mentioning Input_file names here.
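
To make the behaviour concrete, here is a minimal reproduction using the sample data from the question. The output file name out.txt is borrowed from the question's own attempt. Note that awk rebuilds a record with single spaces once $1 is assigned, while lines without a match are printed exactly as they were read.

```shell
# Recreate the sample inputs from the question (whitespace-separated).
cat > file1 <<'EOF'
      rs12124819     1        0.020242          776546 A G
      rs28765502     1        0.022137          832918 T C
       rs7419119     1        0.022518          842013 T G
        rs950122     1        0.022720          846864 G C
EOF

cat > file2 <<'EOF'
1_752566    1   0   752566  G   A
1_776546    1   0   776546  A   G
1_832918    1   0   832918  T   C
1_842013    1   0   842013  T   G
EOF

# Pass 1 (file1): remember the ID stored in column 1 for each position in column 4.
# Pass 2 (file2): replace column 1 when the position was seen in file1.
awk 'FNR==NR{a[$4]=$1;next} ($4 in a){$1=a[$4]} 1' file1 file2 > out.txt

cat out.txt
# Output (position 752566 is not in file1, so the first line is unchanged):
# 1_752566    1   0   752566  G   A
# rs12124819 1 0 776546 A G
# rs28765502 1 0 832918 T C
# rs7419119 1 0 842013 T G
```

Piping the result through column -t would realign the columns if the mixed spacing matters.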

Answer 2

You may try this awk:

awk 'FNR==NR {map[FNR] = $1; next} {$1 = map[FNR]} 1' file1 file2 | column -t

rs12124819  1  0  752566  G  A
rs28765502  1  0  776546  A  G
rs7419119   1  0  832918  T  C
rs950122    1  0  842013  T  G
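
This pairs the two files by line number rather than by the 4th column, which is what reproduces the expected output in the question. If file2 could be longer than file1, the extra lines would get an empty first column. A hedged variant that leaves such lines alone might look like the sketch below; the (FNR in map) guard is my addition, and ids.txt, pos.txt, paired.txt and their contents are made-up toy data, not from the question.

```shell
# Hypothetical toy inputs: pos.txt has one more line than ids.txt.
printf '%s\n' 'rsA 1 0.1 100 A G' 'rsB 1 0.2 200 T C' > ids.txt
printf '%s\n' '1_100 1 0 100 A G' '1_200 1 0 200 T C' '1_300 1 0 300 G C' > pos.txt

# Same line-number pairing as above, but only overwrite $1 when the
# first file actually had a line with that number.
awk 'FNR==NR {map[FNR] = $1; next} (FNR in map) {$1 = map[FNR]} 1' ids.txt pos.txt > paired.txt

cat paired.txt
# rsA 1 0 100 A G
# rsB 1 0 200 T C
# 1_300 1 0 300 G C
```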

Answer 3

Another alternative (if the files are sorted on the join key, as in the sample data):

$ join -j4 -o1.1,2.2,2.3,2.4,2.5,2.6 file1 file2  | column -t

rs12124819  1  0  776546  A  G
rs28765502  1  0  832918  T  C
rs7419119   1  0  842013  T  G

Note that your input files have only 3 matching records.
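
If the real files are not already sorted on column 4, join will complain or silently miss matches. A minimal sketch of sorting first, assuming a plain lexical sort on the 4th field is acceptable: file2 is deliberately shuffled here to show the effect, and the names file1.sorted, file2.sorted and joined.txt are placeholders of mine. LC_ALL=C keeps sort and join on the same collation order.

```shell
# file1 as in the question; file2 with its rows deliberately shuffled.
cat > file1 <<'EOF'
      rs12124819     1        0.020242          776546 A G
      rs28765502     1        0.022137          832918 T C
       rs7419119     1        0.022518          842013 T G
        rs950122     1        0.022720          846864 G C
EOF

cat > file2 <<'EOF'
1_842013    1   0   842013  T   G
1_752566    1   0   752566  G   A
1_832918    1   0   832918  T   C
1_776546    1   0   776546  A   G
EOF

# Sort both inputs on field 4 (-b ignores the leading blanks in the key).
LC_ALL=C sort -b -k4,4 file1 > file1.sorted
LC_ALL=C sort -b -k4,4 file2 > file2.sorted

# Join on field 4 of both files; take the ID from file1, the rest from file2.
LC_ALL=C join -j4 -o1.1,2.2,2.3,2.4,2.5,2.6 file1.sorted file2.sorted > joined.txt

cat joined.txt
# rs12124819 1 0 776546 A G
# rs28765502 1 0 832918 T C
# rs7419119 1 0 842013 T G
```

Unlike the awk version, join drops the file2 lines that have no partner in file1, so 752566 does not appear in the result.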

Replacing a column across two files with a condition in awk
