Question

我有两个文本文件，其中包含空格分隔值，我想根据两个文件中的键列组合文件，并在另一个文件中输出。
的 location.txt

data.txt ，其中有数万个数据，但我会在这里给出简单的条目，

2004-03-31 03:38:15.757551 2 1 122.153 -3.91901 11.04 2.03397
2004-02-28 00:59:16.02785 3 2 19.9884 37.0933 45.08 2.69964
2004-02-28 01:03:16.33393 11 3 19.3024 38.4629 45.08 2.68742
2004-02-28 01:06:16.013453 17 4 19.1652 38.8039 45.08 2.68742
2004-02-28 01:06:46.778088 18 5 19.175 38.8379 45.08 2.69964
2004-02-28 01:08:45.992524 22 6 19.1456 38.9401 45.08 2.68742

我尝试的是根据第1列来自location.txt 和第4列来自data.txt 的键值来合并这两个文件并获取结果格式如下，将来自data.txt的所有数据和第2列和第3列与location.txt相结合 ..

2004-03-31 03:38:15.757551 2 1 122.153 -3.91901 11.04 2.03397 21.5 23
2004-02-28 00:59:16.02785 3 2 19.9884 37.0933 45.08 2.69964 24.5 20
2004-02-28 01:03:16.33393 11 3 19.3024 38.4629 45.08 2.68742 19.5 19
2004-02-28 01:06:16.013453 17 4 19.1652 38.8039 45.08 2.68742 22.5 15
2004-02-28 01:06:46.778088 18 5 19.175 38.8379 45.08 2.69964 24.5 12
2004-02-28 01:08:45.992524 22 6 19.1456 38.9401 45.08 2.68742 19.5 12

我使用awk命令：

awk -F' ' "NR==FNR{label[$1]=$1;x[$1]=$2;y[$1]=$3;next}; ($2==label[$2]){print $0 "," x[$2] y[$3]}" location.txt data.txt > result.txt

但我没有像我预期的那样得到输出，任何人都可以帮我解决这个问题吗？我们可以用csv格式获取结果文件，空格替换为逗号吗？

Answer 1

使用bash并加入

join -1 1 -2 4 <(sort -k1,1 -n location.txt) <(sort -k4,4 -n data.txt) -o 2.1,2.2,2.3,2.4,2.5,2.6,2.7,2.8,1.2,1.3

输出：

2004-03-31 03:38:15.757551 2 1 122.153 -3.91901 11.04 2.03397 21.5 23
2004-02-28 00:59:16.02785 3 2 19.9884 37.0933 45.08 2.69964 24.5 20
2004-02-28 01:03:16.33393 11 3 19.3024 38.4629 45.08 2.68742 19.5 19
2004-02-28 01:06:16.013453 17 4 19.1652 38.8039 45.08 2.68742 22.5 15
2004-02-28 01:06:46.778088 18 5 19.175 38.8379 45.08 2.69964 24.5 12
2004-02-28 01:08:45.992524 22 6 19.1456 38.9401 45.08 2.68742 19.5 12

请参阅：man join

Answer 2

在awk中：

$ awk '
NR==FNR {                  # process location.txt
    a[$1]=$2 OFS $3        # hash using $1 as key
    next                   # next record
}
$4 in a {                  # process data.txt
    print $0,a[$4]         # output record and related location 
}' location.txt  data.txt  # mind the file order
2004-03-31 03:38:15.757551 2 1 122.153 -3.91901 11.04 2.03397 21.5 23
2004-02-28 00:59:16.02785 3 2 19.9884 37.0933 45.08 2.69964 24.5 20
2004-02-28 01:03:16.33393 11 3 19.3024 38.4629 45.08 2.68742 19.5 19
2004-02-28 01:06:16.013453 17 4 19.1652 38.8039 45.08 2.68742 22.5 15
2004-02-28 01:06:46.778088 18 5 19.175 38.8379 45.08 2.69964 24.5 12
2004-02-28 01:08:45.992524 22 6 19.1456 38.9401 45.08 2.68742 19.5 12

我们如何根据awk命令中的条件组合两个文件？

2 个答案: