Question

If I have 3 csv files and I want to merge the data into one file, but side by side, how do I do that? For example:

Initial merged file:

,,,,,,,,,,,,

File 1:

20,09/05,5694
20,09/06,3234
20,09/08,2342

File 2:

20,09/05,2341
20,09/06,2334
20,09/09,342

File 3:

20,09/05,1231
20,09/08,3452
20,09/10,2345
20,09/11,372

Final merged file:

09/05,5694,,,09/05,2341,,,09/05,1231
09/06,3234,,,09/06,2334,,,09/08,3452
09/08,2342,,,09/09,342,,,09/10,2345
,,,,,,,,09/11,372

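Basically, each file's data goes into specific columns of the merged file. I know awk can be used for this, but I don't know how to start.

Edit: only columns 2 and 3 of each file are printed. I used this to print columns 2 and 3: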

awk -v f="${i}" -F, 'match ($0,f) { print $2","$3 }' file3.csv > d$i.csv

However, if, say, file1 and file2 have nothing on a given row, the data for that row shifts to the left. So I came up with this to account for the shift:

awk -v x="${i}" -F, 'match ($0,x) { if ($2='/NULL') { print "," }; else { print $2","$3}; }' alld.csv > d$i.csv

Answer 1

paste already does this:

$ paste -d";" f1 f2 f3 | sed 's/;/,,,/g'
09/05,5694,,,09/05,2341,,,09/05,1231
09/06,3234,,,09/06,2334,,,09/08,3452
09/08,2342,,,09/09,342,,,09/10,2345
,,,,,,09/11,372

Note that paste on its own outputs only a single comma between files:

$ paste -d, f1 f2 f3
09/05,5694,09/05,2341,09/05,1231
09/06,3234,09/06,2334,09/08,3452
09/08,2342,09/09,342,09/10,2345
,,09/11,372

To get a multi-character separator, we can use a different delimiter, such as ;, and then replace it with ,,, using sed:

$ paste -d";" f1 f2 f3 | sed 's/;/,,,/g'
09/05,5694,,,09/05,2341,,,09/05,1231
09/06,3234,,,09/06,2334,,,09/08,3452
09/08,2342,,,09/09,342,,,09/10,2345
,,,,,,09/11,372
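
For anyone who wants to reproduce this end to end, here is a minimal sketch. It assumes the three inputs are named file1.csv, file2.csv and file3.csv (hypothetical names) with the sample data from the question, uses cut to keep columns 2 and 3, and writes them to f1, f2 and f3 as used above. It also assumes the data never contains a semicolon, since that character is used as the temporary delimiter.

```shell
#!/bin/sh
# Work in a scratch directory so no existing files are touched.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Sample input from the question (three columns per line).
printf '%s\n' '20,09/05,5694' '20,09/06,3234' '20,09/08,2342' > file1.csv
printf '%s\n' '20,09/05,2341' '20,09/06,2334' '20,09/09,342' > file2.csv
printf '%s\n' '20,09/05,1231' '20,09/08,3452' '20,09/10,2345' '20,09/11,372' > file3.csv

# Keep only columns 2 and 3 of each file (f1, f2, f3 as used in the answer).
for n in 1 2 3; do
    cut -d, -f2,3 "file$n.csv" > "f$n"
done

# Join side by side with a placeholder delimiter, then widen it to three commas.
merged=$(paste -d';' f1 f2 f3 | sed 's/;/,,,/g')
printf '%s\n' "$merged"
```

Where a file has run out of lines, paste emits nothing for it, so the last row starts with six commas (two separators) rather than the eight shown in the question's desired output.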

Answer 2

Using GNU awk for ARGIND:

$ gawk '{ a[FNR,ARGIND]=$0; maxFnr=(FNR>maxFnr?FNR:maxFnr) }
    END {
        for (i=1;i<=maxFnr;i++) {
            for (j=1;j<ARGC;j++)
                printf "%s%s", (j==1?"":",,,"), (a[i,j]?a[i,j]:",")
            print ""
        }
    }
' file1 file2 file3
09/05,5694,,,09/05,2341,,,09/05,1231
09/06,3234,,,09/06,2334,,,09/08,3452
09/08,2342,,,09/09,342,,,09/10,2345
,,,,,,,,09/11,372

If you don't have GNU awk, just add an initial line that says FNR==1{ARGIND++}.

Commented version, as requested:
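
A minimal sketch of that non-GNU variant follows. It makes one deliberate change: the counter is called fileNum (a hypothetical name) instead of ARGIND, so that the same script behaves identically if it happens to be run under gawk, where ARGIND is maintained by awk itself. The input files file1, file2 and file3 are assumed to already contain only columns 2 and 3.

```shell
#!/bin/sh
# Work in a scratch directory so no existing files are touched.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Two-column sample data (columns 2 and 3 of the files in the question).
printf '%s\n' '09/05,5694' '09/06,3234' '09/08,2342' > file1
printf '%s\n' '09/05,2341' '09/06,2334' '09/09,342' > file2
printf '%s\n' '09/05,1231' '09/08,3452' '09/10,2345' '09/11,372' > file3

merged=$(awk '
    FNR==1 { fileNum++ }    # first line of a new input file: bump the file counter
    { a[FNR,fileNum]=$0; maxFnr=(FNR>maxFnr?FNR:maxFnr) }
    END {
        for (i=1;i<=maxFnr;i++) {
            for (j=1;j<ARGC;j++)
                printf "%s%s", (j==1?"":",,,"), (a[i,j]?a[i,j]:",")
            print ""
        }
    }
' file1 file2 file3)
printf '%s\n' "$merged"
```

One caveat with counting files this way: an empty input file never produces a line with FNR==1, so it would not be counted and the later files would shift one position to the left.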

$ gawk '
    { a[FNR,ARGIND]=$0; # Store the current line in a 2-D array `a` indexed by
                        # the current line number `FNR` and file number `ARGIND`.

      maxFnr=(FNR>maxFnr?FNR:maxFnr)    # save the max FNR value
    }
    END{
        for (i=1;i<=maxFnr;i++) {  # Loop from 1 to the max number of lines
                                   # seen in any one file and for each:
            for (j=1;j<ARGC;j++)     # Loop from 1 to total number of files parsed and:
                printf "%s%s",         # Print 2 strings, specifically:
                   (j==1?"":",,,"),      # A separator - empty before the first
                                         # file, three commas otherwise.
                   (a[i,j]?a[i,j]:",")   # The value stored in the array if it was
                                         # present in the files, a comma otherwise.
            print ""                   # Print a newline
        }
    }
' file1 file2 file3

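I originally used an array fnr[FNR] to track the max value of FNR, but IMHO that's a bit obscure, and it has a flaw: if no lines had, say, a 2nd field, then a loop on for (i=1;i in fnr;i++) in the END section would bail out before getting to the 3rd field.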
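
The early exit of an "i in array" loop can be seen in isolation with a small sketch. The array contents here are made up purely for illustration: index 3 is left out on purpose, so the loop stops there and never reaches index 4.

```shell
#!/bin/sh
count=$(awk 'BEGIN {
    fnr[1] = 1; fnr[2] = 1; fnr[4] = 1      # index 3 is deliberately missing
    for (i = 1; i in fnr; i++) visited++     # stops as soon as i is not an index
    print visited
}')
printf '%s\n' "$count"
```

Tracking the maximum in a plain variable such as maxFnr avoids depending on the indices being contiguous.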

Answer 3

Using pr:

$ pr -mts',,,' file[1-3]
09/05,5694,,,09/05,2341,,,09/05,1231
09/06,3234,,,09/06,2334,,,09/08,3452
09/08,2342,,,09/09,342,,,09/10,2345
,,,,,,09/11,372

How to add data next to each other in a csv file