I have the following data:
ID | Status
________________
1 | In Progress
2 | In Progress
3 | Done
3 | In Progress
4 | Backlog
5 | Backlog
5 | In Progress
6 | Done
7 | Backlog
7 | In Progress
7 | Done
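However, when an ID appears more than once, I want to keep only one row for it, chosen by the value in the Status column. For example, ID 3 has the statuses Done and In Progress; here I want to keep Done and drop In Progress. For ID 7, I want to keep Done and drop the other two statuses.
So the final result would be: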
ID | Status
________________
1 | In Progress
2 | In Progress
3 | Done
4 | Backlog
5 | In Progress
6 | Done
7 | Done
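The tricky case is an ID such as 5: it has no Done row, so In Progress should be kept.
I tried to do this with a CASE WHEN statement, but even though I give the statuses an order of importance, it still keeps the lower-priority rows as well. So, if I do: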
SELECT CASE WHEN Status = 'Done' THEN 1
WHEN Status = 'In Progress' THEN 1
WHEN Status = 'Backlog' THEN 1
ELSE 0
END
But I want to keep only the most important status, so it should take 7 | Done and ignore the other two statuses. For ID 5, however, it should take In Progress.
Any ideas?
Answer 0 (score: 1)
Your idea of using a case expression is a good one. You want to combine it with aggregation.
For this problem:
SELECT id,
(CASE WHEN SUM(CASE WHEN Status = 'Done' THEN 1 ELSE 0 END) > 0 THEN 'Done'
WHEN SUM(CASE WHEN Status = 'In Progress' THEN 1 ELSE 0 END) > 0 THEN 'In Progress'
WHEN SUM(CASE WHEN Status = 'Backlog' THEN 1 ELSE 0 END) > 0 THEN 'Backlog'
ELSE 'Unknown'
END) as status
FROM t
GROUP BY id
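As a quick sanity check, the query above can be run against the sample data from the question. This is only a sketch: it uses Python's built-in sqlite3 module as a stand-in database (an assumption, the question is about SAS), and the helper name run_aggregation and the added ORDER BY are illustrative.

```python
import sqlite3

# Sample data from the question.
rows = [
    (1, "In Progress"), (2, "In Progress"), (3, "Done"), (3, "In Progress"),
    (4, "Backlog"), (5, "Backlog"), (5, "In Progress"), (6, "Done"),
    (7, "Backlog"), (7, "In Progress"), (7, "Done"),
]

# Conditional aggregation: the first status (in priority order) that occurs
# at least once within an id wins.
QUERY = """
SELECT id,
       CASE WHEN SUM(CASE WHEN Status = 'Done' THEN 1 ELSE 0 END) > 0 THEN 'Done'
            WHEN SUM(CASE WHEN Status = 'In Progress' THEN 1 ELSE 0 END) > 0 THEN 'In Progress'
            WHEN SUM(CASE WHEN Status = 'Backlog' THEN 1 ELSE 0 END) > 0 THEN 'Backlog'
            ELSE 'Unknown'
       END AS status
FROM t
GROUP BY id
ORDER BY id
"""


def run_aggregation(data):
    """Load the rows into an in-memory SQLite table and run the query."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute("CREATE TABLE t (id INTEGER, Status TEXT)")
        conn.executemany("INSERT INTO t VALUES (?, ?)", data)
        return conn.execute(QUERY).fetchall()
    finally:
        conn.close()


result = run_aggregation(rows)
for row in result:
    print(row)
```

Each id comes back exactly once, and ID 5 resolves to In Progress because it has no Done row.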
Answer 1 (score: 1)
Another solution is to rank the statuses explicitly and select the highest-ranked one for each ID.
Query
proc sql;
create table want as
select distinct id, status,
case
when status = 'Done' then 3
when status = 'In Progress' then 2
when status = 'Backlog' then 1
else 0
end as rank
from
have
group by id
having rank = max(rank);
quit;
If you don't want the calculated rank value in the output, use want(drop=rank), or use a nested query that selects only id and status.
Data
data have;
infile cards dlm='|';
input id status $20.;
datalines;
1 | In Progress
2 | In Progress
3 | Done
3 | In Progress
4 | Backlog
5 | Backlog
5 | In Progress
6 | Done
7 | Backlog
7 | In Progress
7 | Done
run;
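The ranking idea can also be sketched outside SAS. The Python below is an illustration only, not the PROC SQL above: the RANK mapping copies the values from the query, and the function name keep_highest_status is hypothetical.

```python
# Rank values copied from the CASE expression in the query; unknown statuses rank 0.
RANK = {"Backlog": 1, "In Progress": 2, "Done": 3}

rows = [
    (1, "In Progress"), (2, "In Progress"), (3, "Done"), (3, "In Progress"),
    (4, "Backlog"), (5, "Backlog"), (5, "In Progress"), (6, "Done"),
    (7, "Backlog"), (7, "In Progress"), (7, "Done"),
]


def keep_highest_status(data):
    """Return one (id, status) pair per id, keeping the highest-ranked status."""
    best = {}
    for id_, status in data:
        current = best.get(id_)
        if current is None or RANK.get(status, 0) > RANK.get(current, 0):
            best[id_] = status
    return sorted(best.items())


want = keep_highest_status(rows)
for id_, status in want:
    print(id_, "|", status)
```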
Answer 2 (score: 0)
Here is a solution that does not use PROC SQL:
data have;
input ID status & $15.;
cards;
1 In Progress
2 In Progress
3 Done
3 In Progress
4 Backlog
5 Backlog
5 In Progress
6 Done
7 Backlog
7 In Progress
7 Done
;
run;
/* Assign an importance level to each status */
data have;
set have;
if status ="Backlog" then importance=1;
if status ="In Progress" then importance=2;
if status ="Done" then importance=3;
run;
proc sort data=have;
by ID importance;
run;
data want;
set have;
by ID;
if last.ID;
drop importance;
run;
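The same sort-then-keep-last logic can be mirrored in a few lines of Python, which may help when checking the expected output by hand. This is a rough analogue rather than a translation of the DATA step: the IMPORTANCE mapping copies the values assigned above, and the function name last_per_id is hypothetical.

```python
from itertools import groupby
from operator import itemgetter

# Importance levels copied from the DATA step; unknown statuses sort first.
IMPORTANCE = {"Backlog": 1, "In Progress": 2, "Done": 3}

rows = [
    (1, "In Progress"), (2, "In Progress"), (3, "Done"), (3, "In Progress"),
    (4, "Backlog"), (5, "Backlog"), (5, "In Progress"), (6, "Done"),
    (7, "Backlog"), (7, "In Progress"), (7, "Done"),
]


def last_per_id(data):
    """Sort by (ID, importance), then keep the last row of each ID group."""
    ordered = sorted(data, key=lambda r: (r[0], IMPORTANCE.get(r[1], 0)))
    return [list(group)[-1] for _, group in groupby(ordered, key=itemgetter(0))]


want = last_per_id(rows)
for id_, status in want:
    print(id_, status)
```

Sorting first matters: groupby only groups adjacent rows, in the same way that BY-group processing with last.ID relies on the preceding PROC SORT.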