My data is in the following format.
file1:
rsID score score2
rs145 1.4 0.67
rs561 0.45 1.23
rs607 1.98 0.12
file2:
rsID score score2
rs561 0.45 1.23
rs234 1.74 0.22
rs256 1.09 0.34
file3:
rsID score score2
rs234 1.74 0.22
rs109 1.44 0.80
rs780 0.45 0.91
file4:
rsID score score2
rs234 1.74 0.22
rs500 0.56 0.67
rs614 0.81 0.50
I want to combine all of these to get the following (just appending one file below the other, but removing all duplicate rows):
rsID score score2
rs145 1.4 0.67
rs561 0.45 1.23
rs607 1.98 0.12
rs234 1.74 0.22
rs256 1.09 0.34
rs109 1.44 0.80
rs780 0.45 0.91
rs500 0.56 0.67
rs614 0.81 0.50
I managed to concatenate the files with cat data.* > data.full. I still need to remove the duplicates.
Answer 0 (score: 0)
I was able to find the answer myself.
To add the data together I used cat data.* > data.full
This worked because all my data was saved as data.file1, data.file2, data.file3, etc.
Then I got rid of the duplicates with sort -u data.full > data.sorted.full
which keeps only one copy of each line. I think it worked!
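One caveat with sort -u is that it sorts the lines as well as de-duplicating them, so the output order differs from the input and the header row is not guaranteed to stay on top. If the original order matters, a common alternative is the awk idiom '!seen[$0]++', which prints each line only the first time it appears. Below is a minimal sketch, assuming the four sample files from the question are named data.file1 to data.file4; the scratch directory and the output name merged.txt are illustrative choices, not something from the original post.

```shell
# Work in a scratch directory so nothing real is overwritten.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Recreate the four sample files from the question (file names are assumed).
printf 'rsID score score2\nrs145 1.4 0.67\nrs561 0.45 1.23\nrs607 1.98 0.12\n' > data.file1
printf 'rsID score score2\nrs561 0.45 1.23\nrs234 1.74 0.22\nrs256 1.09 0.34\n' > data.file2
printf 'rsID score score2\nrs234 1.74 0.22\nrs109 1.44 0.80\nrs780 0.45 0.91\n' > data.file3
printf 'rsID score score2\nrs234 1.74 0.22\nrs500 0.56 0.67\nrs614 0.81 0.50\n' > data.file4

# seen[$0]++ is 0 (false) the first time a line is read, so !seen[$0]++ is
# true only for first occurrences; awk's default action prints those lines.
# The output name does not match data.*, so it cannot be mistaken for input.
awk '!seen[$0]++' data.file1 data.file2 data.file3 data.file4 > merged.txt

cat merged.txt
```

On the sample data this should give the header followed by the nine unique rows, in the same order as the desired output in the question. Like sort -u, it compares whole lines, so two rows with the same rsID but different scores would both be kept.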