I have a dataset like this:
subject stim TCR
Chronic HIV no stim TRBV10-1TRBJ2-3TRAV1-2TRAJ33
Healthy control no stim TRBV10-1TRBJ2-4TRAV1-2TRAJ33
Chronic HIV no stim TRBV10-2TRBJ2-2TRAV1-2TRAJ33
Healthy control 1100-2 stim TRBV10-2TRBJ2-6TRAV1-2TRAJ33
Healthy control 1100-2 stim TRBV15TRBJ1-5TRAV1-2TRAJ33
Elite controller 1100-2 stim TRBV15TRBJ1-5TRAV1-2TRAJ33
Healthy control no stim TRBV15TRBJ1-5TRAV1-2TRAJ33
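I want to match the TCR values and assign a clone number, so that rows with identical TCR sequences get the same clone number. I would like the result to look like this: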
subject stim TCR Clone number
Chronic HIV no stim TRBV10-1TRBJ2-3TRAV1-2TRAJ33 Clone 1
Healthy control no stim TRBV10-1TRBJ2-4TRAV1-2TRAJ33 Clone 2
Chronic HIV no stim TRBV10-2TRBJ2-2TRAV1-2TRAJ33 Clone 3
Healthy control 1100-2 stim TRBV10-2TRBJ2-6TRAV1-2TRAJ33 Clone 4
Healthy control 1100-2 stim TRBV15TRBJ1-5TRAV1-2TRAJ33 Clone 5
Elite controller 1100-2 stim TRBV15TRBJ1-5TRAV1-2TRAJ33 Clone 5
Healthy control no stim TRBV15TRBJ1-5TRAV1-2TRAJ33 Clone 5
I tried the following code to find the duplicated sequences:
df %>% filter(duplicated(TCR))
But this does not do what I need: it only keeps the repeated rows and does not number the clones.
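For reference, here is a minimal sketch of the kind of numbering I am after, written in base R so it runs without extra packages. It assumes the data frame is called `df` with columns `subject`, `stim`, and `TCR` (the data below is retyped from the example above), and that clone numbers should follow the order in which each TCR first appears. The helper variable `clone_id` is just an illustrative name.

```r
# Example data retyped from the question (columns: subject, stim, TCR).
df <- data.frame(
  subject = c("Chronic HIV", "Healthy control", "Chronic HIV",
              "Healthy control", "Healthy control", "Elite controller",
              "Healthy control"),
  stim = c("no stim", "no stim", "no stim",
           "1100-2 stim", "1100-2 stim", "1100-2 stim", "no stim"),
  TCR = c("TRBV10-1TRBJ2-3TRAV1-2TRAJ33",
          "TRBV10-1TRBJ2-4TRAV1-2TRAJ33",
          "TRBV10-2TRBJ2-2TRAV1-2TRAJ33",
          "TRBV10-2TRBJ2-6TRAV1-2TRAJ33",
          "TRBV15TRBJ1-5TRAV1-2TRAJ33",
          "TRBV15TRBJ1-5TRAV1-2TRAJ33",
          "TRBV15TRBJ1-5TRAV1-2TRAJ33"),
  stringsAsFactors = FALSE
)

# unique() keeps the TCR values in order of first appearance;
# match() gives, for each row, the position of its TCR in that list,
# so identical sequences share the same integer id.
clone_id <- match(df$TCR, unique(df$TCR))

# Turn the integer id into the "Clone N" label from the desired output.
df[["Clone number"]] <- paste("Clone", clone_id)

print(df)
```

The same expression should also work inside a dplyr pipeline, for example `df %>% mutate(clone = paste("Clone", match(TCR, unique(TCR))))`, if staying in dplyr is preferred.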