How to append txt files together and then remove duplicates

Time: 2015-07-28 15:10:58

Tags: csv command-line duplicates

My data is in the following format.

file1:

rsID   score   score2
rs145  1.4     0.67
rs561  0.45    1.23
rs607  1.98    0.12

file2:

rsID   score   score2
rs561  0.45    1.23
rs234  1.74    0.22
rs256  1.09    0.34

file3:

   rsID   score   score2
   rs234   1.74   0.22
   rs109   1.44   0.80
   rs780   0.45   0.91

file4:

   rsID   score   score2
   rs234  1.74    0.22
   rs500  0.56    0.67
   rs614  0.81    0.50

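I want to append all of these together to get the following (just append each file to the bottom of the previous one, but remove all duplicate rows):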

   rsID   score   score2
   rs145  1.4     0.67
   rs561  0.45    1.23
   rs607  1.98    0.12
   rs234  1.74    0.22
   rs256  1.09    0.34
   rs109  1.44    0.80
   rs780  0.45    0.91
   rs500  0.56    0.67
   rs614  0.81    0.50

I managed to append the files together with cat data.* > data.full. I still need to remove the duplicates.

1 Answer:

Answer 0: (score: 0)

I was able to find the answer myself.

To add the data together I used cat data.* > data.full. This worked because all my data was saved as data.file1, data.file2, data.file3, etc.
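A minimal sketch of this concatenation step, using two small made-up files in a temporary directory. It narrows the glob to data.file* (an assumption about the naming, based on the file names above) so that a rerun does not also match data.full or any other output whose name starts with data.

```shell
# Work in a scratch directory so no real files are touched.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Hypothetical sample inputs following the data.fileN naming.
printf 'rsID score score2\nrs145 1.4 0.67\n' > data.file1
printf 'rsID score score2\nrs561 0.45 1.23\n' > data.file2

# data.file* matches only the inputs; data.full is not matched by this
# pattern, so running the command a second time does not feed the output
# back into itself.
cat data.file* > data.full
```

Each file's header line is still present in data.full at this point; removing the repeats is left to the deduplication step.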

Then I got rid of the duplicates with sort -u data.full > data.sorted.full, which keeps only unique lines. I think it worked!
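Two caveats with sort -u: it reorders the rows, so the header line is sorted in with the data instead of staying on top, and it compares whole lines, so two rows that differ only in spacing are both kept. If the original order matters, one alternative is sketched below with awk. It assumes that rows with the same rsID are true duplicates and that there are exactly three columns; the sample files and the output name data.unique are made up for illustration.

```shell
# Work in a scratch directory so no real files are touched.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Hypothetical sample inputs; the third one is indented and spaced differently.
printf 'rsID   score   score2\nrs145  1.4     0.67\nrs561  0.45    1.23\n' > data.file1
printf 'rsID   score   score2\nrs561  0.45    1.23\nrs234  1.74    0.22\n' > data.file2
printf '   rsID   score   score2\n   rs234   1.74   0.22\n   rs109   1.44   0.80\n' > data.file3

# seen[$1]++ is 0 (false) the first time a value of column 1 appears, so
# !seen[$1]++ is true only for the first row with that rsID. The repeated
# "rsID" header lines collapse to the first one in the same way.
# Printing the fields with OFS set to a tab normalises the spacing.
awk '!seen[$1]++ { print $1, $2, $3 }' OFS='\t' data.file* > data.unique
```

Because awk splits on runs of whitespace by default, the indented rows from data.file3 are matched against the others by rsID even though the lines are not byte-for-byte identical.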