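My dataset has prices for several items in x locations over many weeks. I want to create a variable that, for any given item/location and week, shows the original price (the first one listed for that item and location). My dataset looks like this: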
data have;
input item $ location $ week price;
cards;
X NC 1 10
X NC 2 10
X NC 3 9.75
X SC 2 8
X SC 3 5
Y NC 1 100
Y NC 2 75
Y NC 3 50
Y NC 4 50
;
run;
I would like a dataset that looks like this:
data want;
input item $ location $ week price start_price;
cards;
X NC 1 10 10
X NC 2 10 10
X NC 3 9.75 10
X SC 2 8 8
X SC 3 5 8
Y NC 1 100 100
Y NC 2 75 100
Y NC 3 50 100
Y NC 4 50 100
;
run;
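I know I can somehow use the first. variables to do this, but I can't work it out. Help!
I tried the following, but it looks like I need more than one BY group to get the locations right. Do I need to concatenate item and location, or is there a more elegant way to do this?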
data want;
set have;
by item;
if first.item then start_price=price;
start_price+0;
run;
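Answer 0 (score: 3)
I would use retain to carry the value forward from the previous row. The result is the same as the sum statement with + 0, but I think it states the intent more clearly.
If I understand the question correctly, you need first.location to set start_price. Just use by item location; so that first.location is set on the first row of every item/location combination.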
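data want;
    set have;
    by item location;      /* have must already be sorted by item and location */
    retain start_price;    /* keep the value from one row to the next */
    /* first.location is 1 on the first row of each item/location group */
    if first.location then start_price = price;
run;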
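BY-group processing in a DATA step expects the input to be sorted by the BY variables. The sample data already is, but if your real data might not be, a sort step beforehand is one way to make sure. The following is a minimal sketch; the dataset name have_sorted is only an illustrative choice, and sorting by week as well assumes that the earliest week is the row you consider "first".

```sas
/* Sort so that each item/location group is contiguous,
   with the earliest week first within the group.
   have_sorted is a hypothetical name used for illustration. */
proc sort data=have out=have_sorted;
    by item location week;
run;

data want;
    set have_sorted;
    by item location;
    retain start_price;
    /* Take the price from the first row of each item/location group */
    if first.location then start_price = price;
run;
```

Writing the sorted rows to a separate dataset leaves have untouched; omit out= if sorting in place is acceptable.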
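Answer 1 (score: 1)
To demonstrate another way of getting the first record in a group, here is a PROC SQL solution as well. It treats the row with the smallest week as the first record of each item/location group.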
data have;
input item $ location $ week price;
cards;
X NC 1 10
X NC 2 10
X NC 3 9.75
X SC 2 8
X SC 3 5
Y NC 1 100
Y NC 2 75
Y NC 3 50
Y NC 4 50
;
run;
First, using separate query statements:
proc sql;
    /* Step 1: one row per item/location with the price from the earliest week */
    create table START_PRICE as
    select Item, Location, Price as Start_Price
    from HAVE a
    where Week =
        (select min(week)
         from have b
         where a.item=b.item and a.location=b.location)
    order by a.item, a.location;

    /* Step 2: attach that starting price to every row of HAVE */
    create table WANT as
    select a.item, a.location, a.week, a.price, b.start_price
    from HAVE a left join START_PRICE b
        on a.item=b.item and a.location=b.location
    order by a.item, a.location, a.week;
quit;
Then as a single query:
proc sql;
    create table WANT2 as
    select a.Item, a.Location, a.Week, a.Price, b.Start_Price
    from HAVE a
    left join
        /* Inline view: price from the earliest week of each item/location */
        (select Item, Location, Price as Start_Price
         from HAVE a1
         where Week =
             (select min(week) from have b1
              where a1.item=b1.item and a1.location=b1.location)
        ) b
        on a.item=b.item and a.location=b.location
    order by a.item, a.location, a.week;
quit;
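As a quick sanity check, which is not part of the original answer, the two result tables could be compared with PROC COMPARE. This sketch assumes that both WANT and WANT2 were created by the two PROC SQL steps above.

```sas
/* Assumes WANT and WANT2 exist from the PROC SQL steps above.
   PROC COMPARE reports any differences in values or attributes
   between the base and comparison datasets. */
proc compare base=want compare=want2;
run;
```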