Question

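I want to know how to eliminate duplicate lines across multiple files. I use this command to get the duplicated lines, but it only shows the lines the files have in common: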

sort *.txt | uniq -d | fgrep -f - *.txt | sort -t : -k 2

For example, if I have the following files:

FILE1.TXT:

AAA
BBB
CCC

FILE2.TXT:

AAA
EEE
FFF

file3.txt:

BBB
ZZZ
...

file20.txt:

AAA
BBB
TTT

I would like to get this result:

FILE1.TXT:

AAA
BBB
CCC

FILE2.TXT:

EEE
FFF

file3.txt:

ZZZ
....

file20.txt:

TTT

Answer 1

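Don't use uniq's -d flag. It only shows the duplicated lines.

From uniq --help:

  -d, --repeated        only print duplicate lines, one for each group

Instead, use uniq with no arguments:

sort *.txt | uniq | ...

Or, more simply, sort can do the deduplication for you with its -u option:

sort -u *.txt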
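To make the difference between these invocations concrete, here is a minimal sketch that runs them on copies of the first two sample files from the question. The scratch directory created with mktemp and the variable names are assumptions made purely for illustration.

```shell
#!/bin/sh
# Hypothetical scratch directory holding copies of the question's sample data.
tmp=$(mktemp -d)
printf 'AAA\nBBB\nCCC\n' > "$tmp/file1.txt"
printf 'AAA\nEEE\nFFF\n' > "$tmp/file2.txt"

# uniq -d keeps only the lines that occur more than once in the sorted input.
dups=$(sort "$tmp/file1.txt" "$tmp/file2.txt" | uniq -d)

# Plain uniq collapses each run of identical adjacent lines into one line.
uniq_out=$(sort "$tmp/file1.txt" "$tmp/file2.txt" | uniq)

# sort -u sorts and deduplicates in a single step.
sort_u_out=$(sort -u "$tmp/file1.txt" "$tmp/file2.txt")

printf 'uniq -d:\n%s\n\n' "$dups"
printf 'uniq:\n%s\n\n' "$uniq_out"
printf 'sort -u:\n%s\n' "$sort_u_out"
```

Both deduplicating forms merge all input into a single sorted stream, so the result is no longer split per file.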

Answer 2


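awk '!Line[$0]++' *.txt

This prints each string only once, even if it appears in several files and/or more than once in the same file.

Added for the OP's new constraint (output goes to one file per original file):

awk '!Line[$0]++ { print > (FILENAME ".new") }' *.txt

Because of awk's redirection limitation, the output for FileX.txt is written to FileX.txt.new rather than back to FileX.txt. The original files could be overwritten with some changes (not directly the purpose of the request).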
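As a rough end-to-end sketch of the per-file variant, the script below recreates three of the question's sample files in a scratch directory, writes each file's first-seen lines to a ".new" sibling, and then moves the results over the originals. The scratch directory, the explicit file order, and the choice to empty a file that has no surviving lines are assumptions for illustration, not part of the answer.

```shell
#!/bin/sh
# Hypothetical scratch directory with copies of the question's sample data.
tmp=$(mktemp -d)
printf 'AAA\nBBB\nCCC\n' > "$tmp/file1.txt"
printf 'AAA\nEEE\nFFF\n' > "$tmp/file2.txt"
printf 'BBB\nZZZ\n'      > "$tmp/file3.txt"

# Print a line only the first time it is seen in any input file, and send it
# to "<input file>.new" so that no file is read and written at the same time.
awk '!Line[$0]++ { print > (FILENAME ".new") }' \
    "$tmp/file1.txt" "$tmp/file2.txt" "$tmp/file3.txt"

# Replace each original with its deduplicated copy.
# Assumption: a file whose lines were all seen earlier gets no ".new" file,
# so it is truncated to empty here.
for f in "$tmp"/*.txt; do
    if [ -e "$f.new" ]; then
        mv "$f.new" "$f"
    else
        : > "$f"
    fi
done

# Show the result for each file.
for f in "$tmp"/*.txt; do
    printf '%s:\n' "${f##*/}"
    cat "$f"
done
```

Files are processed in the order they appear on the command line, and the first file containing a line is the one that keeps it. With *.txt the shell's glob order decides this, and file20.txt typically sorts before file3.txt.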

Answer 3

You can do this in Vim: open gvim (for example) with all the files as arguments.
Then:

Copy the following code to the clipboard:

let g:duplicate_finder={}
function Remove_duplicates()
    " Get the buffer lines
    let buf_lines = getline(1, '$')
    " Reduce the buffer to one empty line
    execute '%d _'
    " Append to the buffer only lines never encountered before
    for cur_buf_line in buf_lines
        if !has_key(g:duplicate_finder, cur_buf_line)
            call append(line('$'), cur_buf_line)
            let g:duplicate_finder[cur_buf_line] = '1'
        endif
    endfor
    " Delete first line from the buffer
    execute '1d _'
endfunction
" Allow argdo to leave a modified buffer without writing it first
set hidden
argdo call Remove_duplicates()

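and

In the gVim window, type :@+ and press Return to run the code.

Another option is:

Save the code above to a file named remove_duplicates.vim, and
in the gVim window, type :source /path/to/remove_duplicates.vim and press Return.

To save all the buffers, run :xa and press Return.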

Eliminate duplicate lines across multiple files using awk or sed
