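I have a dataset in which I need to find the first purchase made by each registered online shopper and apply a 5% discount to that purchase.
The dataset has 28 columns, but for the purposes of this question I have condensed it to the ones I think are relevant.
I need to create a new column that tells me what each person bought the first time. We can assume that purchases made on the same day belong to the same purchase but are different items.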
Obs ID Trans_Date Order_Number Value Status
----------------------------------------------------------------
1874 866 30/07/2016 191 $4,217.90 Registered
1875 866 30/07/2016 191 $4,217.90 Registered
1876 866 31/07/2016 192 $2,422.75 Registered
1877 866 31/07/2016 192 $2,422.75 Registered
1878 . 31/07/2016 193 $4,162.66 Unregistered
1879 . 31/07/2016 193 $4,162.66 Unregistered
1880 344 31/07/2016 194 $4,405.51 Registered
1881 344 31/07/2016 194 $4,405.51 Registered
1882 . 31/07/2016 195 $2,114.76 Unregistered
1883 . 31/07/2016 195 $2,114.76 Unregistered
1884 250 31/07/2016 196 $3,310.72 Registered
1885 250 31/07/2016 196 $3,310.72 Registered
1886 . 31/07/2016 197 $4,633.48 Unregistered
1887 . 31/07/2016 197 $4,633.48 Unregistered
1888 . 31/07/2016 197 $4,633.48 Unregistered
1889 . 31/07/2016 197 $4,633.48 Unregistered
1890 . 31/07/2016 198 $6,224.43 Unregistered
1891 . 31/07/2016 198 $6,224.43 Unregistered
1892 . 31/07/2016 198 $6,224.43 Unregistered
1893 . 31/07/2016 198 $6,224.43 Unregistered
Answer 0 (score: 0)
Here is my 'first' dataset:
'obs' , 'id' , 'trans_date' , 'order_number' , 'value' , 'status'
1874 , 866 , 30/07/2016 , 191 , 4217.90 , Registered
1875 , 866 , 30/07/2016 , 191 , 4217.90 , Registered
1876 , 866 , 31/07/2016 , 192 , 2422.75 , Registered
1877 , 866 , 31/07/2016 , 192 , 2422.75 , Registered
1878 , 344 , 30/07/2016 , 193 , 4162.66 , Unregistered
1879 , 344 , 30/07/2016 , 193 , 4162.66 , Unregistered
1880 , 344 , 31/07/2016 , 194 , 4405.51 , Registered
1881 , 344 , 31/07/2016 , 194 , 4405.51 , Registered
1882 , 250 , 30/07/2016 , 195 , 2114.76 , Unregistered
1883 , 250 , 30/07/2016 , 195 , 2114.76 , Unregistered
1884 , 250 , 31/07/2016 , 196 , 3310.72 , Registered
1885 , 250 , 31/07/2016 , 196 , 3310.72 , Registered
1886 , 275 , 30/07/2016 , 197 , 4633.48 , Unregistered
1887 , 275 , 30/07/2016 , 197 , 4633.48 , Unregistered
1888 , 275 , 30/07/2016 , 197 , 4633.48 , Unregistered
1889 , 275 , 30/07/2016 , 197 , 4633.48 , Unregistered
1890 , 275 , 31/07/2016 , 198 , 6224.43 , Unregistered
1891 , 275 , 31/07/2016 , 198 , 6224.43 , Unregistered
1892 , 275 , 31/07/2016 , 198 , 6224.43 , Unregistered
1893 , 275 , 31/07/2016 , 198 , 6224.43 , Unregistered
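In case it helps to reproduce this, here is a minimal sketch of one way to load those rows into a table named first. The informats and the 12-character length for status are my assumptions, not something taken from the original data, and only the first eight rows are shown. The important part is that trans_date is read as a numeric SAS date, so that min() and date formats behave as expected later on.

```sas
/* Sketch only: the informats and the $12. length are assumptions. */
data first;
    infile datalines dsd dlm=',' truncover;
    /* ddmmyy10. turns 30/07/2016 into a numeric SAS date */
    input obs id trans_date :ddmmyy10. order_number value status :$12.;
    format trans_date ddmmyy10.;
    datalines;
1874,866,30/07/2016,191,4217.90,Registered
1875,866,30/07/2016,191,4217.90,Registered
1876,866,31/07/2016,192,2422.75,Registered
1877,866,31/07/2016,192,2422.75,Registered
1878,344,30/07/2016,193,4162.66,Unregistered
1879,344,30/07/2016,193,4162.66,Unregistered
1880,344,31/07/2016,194,4405.51,Registered
1881,344,31/07/2016,194,4405.51,Registered
;
run;
```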
And here is some proc sql:
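proc sql noprint;
    /* Attach each id's earliest trans_date to every row for that id.
       trans_date must be a numeric SAS date; SAS remerges the group
       minimum back onto the detail rows. */
    create table temp as
    select *, min(trans_date) format=date9. as first
    from first
    group by id
    order by order_number;

    /* Flag the rows that fall on that earliest date. */
    create table final as
    select obs, id, trans_date, order_number, value, status,
           case when first = trans_date then 'FIRST'
                else 'NOT FIRST'
           end as flag
    from temp;
quit;
```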
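The flag above marks each id's earliest date regardless of status, and it does not apply the 5% discount the question asks for. A possible follow-up is sketched below. It assumes that "first purchase" means the earliest trans_date on which the id has status 'Registered', and that the discount should be applied to value on every row of that purchase. The names first_reg_date, first_reg_purchase, discounted_value and the output table discounted are made up for illustration.

```sas
/* Sketch only: the column and table names introduced here are hypothetical. */
proc sql;
    create table discounted as
    select a.*,
           /* 1 when the row belongs to the id's first registered purchase */
           case when a.status = 'Registered'
                 and a.trans_date = b.first_reg_date
                then 1 else 0
           end as first_reg_purchase,
           /* take 5% off on those rows, leave the others unchanged */
           case when calculated first_reg_purchase = 1
                then round(a.value * 0.95, 0.01)
                else a.value
           end as discounted_value format=comma12.2
    from first as a
         left join
         (select id, min(trans_date) as first_reg_date
            from first
           where status = 'Registered'
           group by id) as b
         on a.id = b.id
    order by a.obs;
quit;
```

Because value is repeated on every item row of an order in the sample data, it may be an order total rather than a line amount. If so, summing discounted_value across rows would double count, and the discount is probably better applied once per order_number.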