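I have a dataset with two ID columns (ID_A and ID_B). Each row contains two IDs that I believe belong to the same person, so every combination appears twice. For example: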
ID_A ID_B
A B
C D
B A
D C
I want to remove the duplicates. That is, if I have the row A, B, I don't need the row B, A. The desired result is:
ID_A ID_B
A B
C D
Do you know how to do this in SAS?
Answer 0: (score: 2)
How about...
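/* Sample data from the question */
data have;
  input (ID_A ID_B)($);
  cards;
A B
C D
B A
D C
;;;;
run;

/* View that sorts the two IDs within each row, so B A becomes A B */
data haveV / view=haveV;
  set have;
  call sortc(of id:);
run;

/* Keep one row per sorted ID pair */
proc sort data=haveV nodupkey out=want;
  by id:;
run;

proc print data=want;
run;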
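The key step above is call sortc, which reorders the values within each row so that (B, A) becomes (A, B); nodupkey then drops the repeated pairs. A minimal sketch of that one step in isolation, using hypothetical variables x and y that are not part of the original data:

```sas
/* Illustration only: x and y are hypothetical variables */
data _null_;
  length x y $1;
  x = 'B';
  y = 'A';
  /* Sort the values across the listed character variables, ascending */
  call sortc(x, y);
  /* After the call, x holds 'A' and y holds 'B' */
  put x= y=;
run;
```

Because the sort happens inside a view, the original dataset have is left unchanged.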
Answer 1: (score: 0)
I like @data null's answer; it is perfect and robust. You could also try proc sql, as shown below:
proc sql;
  create table want as
  select distinct
    /* Smaller ID first, larger ID second, so A B and B A collapse into one row */
    case when ID_A le ID_B then ID_A else ID_B end as ID_A,
    case when ID_A ge ID_B then ID_A else ID_B end as ID_B
  from have;
quit;
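If every pair really does appear in both orders, as the question states, a subsetting IF might be enough on its own. This is only a sketch under that assumption, reusing the dataset have from the first answer; a pair that occurs only as (B, A) would be dropped entirely, so the approaches above are safer when the data is not perfectly symmetric.

```sas
/* Assumes each pair is present in both orders (A B and B A) */
data want;
  set have;
  /* Subsetting IF: keep only the row where the smaller ID comes first */
  if ID_A le ID_B;
run;
```

Rows where ID_A equals ID_B are kept as they are, so exact duplicates of such rows would still need nodupkey or select distinct.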