I want to create an indicator across rows that shows whether a company sells apples. For example, given this data frame:
Company | Product | Salesperson
A       | Apple   | John
A       | Banana  | John
A       | Orange  | Jane
B       | Orange  | John
B       | Banana  | Sam
I want to create a dummy-variable column that flags every row of company A with 1, because John sells apples there:
Company | Product | Salesperson | IND
A       | Apple   | John        | 1
A       | Banana  | John        | 1
A       | Orange  | Jane        | 1
B       | Orange  | John        | 0
B       | Banana  | Sam         | 0
I want to do this in SAS or SQL.
Answer 0 (score: 4)
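This is easy in PROC SQL, because SAS automatically remerges summary statistics back onto the detail rows. A Boolean expression evaluates to 0 or 1, so MAX() over the group tells you whether the expression is true for at least one row of that company.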
proc sql;
  create table want as
  /* product='Apple' is 0/1 per row; MAX() per company is remerged onto each row */
  select *, max(product='Apple') as IND
  from have
  group by company
  ;
quit;
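Automatic remerging is specific to SAS; most other SQL engines reject `select *` combined with `group by company`. As a rough sketch of the same idea in portable SQL, the per-company flag can be computed in a grouped subquery and joined back onto the detail rows. The example below runs that query through Python's built-in `sqlite3` module; the in-memory database and the spelled-out `CASE` expression are assumptions made for illustration, and the table name `have` is taken from the answer above.

```python
import sqlite3

# Assumption: an in-memory SQLite table standing in for the SAS dataset "have".
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE have (company TEXT, product TEXT, salesperson TEXT)")
conn.executemany(
    "INSERT INTO have VALUES (?, ?, ?)",
    [
        ("A", "Apple", "John"),
        ("A", "Banana", "John"),
        ("A", "Orange", "Jane"),
        ("B", "Orange", "John"),
        ("B", "Banana", "Sam"),
    ],
)

# Compute one 0/1 flag per company, then join it back onto every detail row.
query = """
SELECT h.company, h.product, h.salesperson, g.ind
FROM have AS h
JOIN (
    SELECT company,
           MAX(CASE WHEN product = 'Apple' THEN 1 ELSE 0 END) AS ind
    FROM have
    GROUP BY company
) AS g ON g.company = h.company
ORDER BY h.rowid
"""
rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

Like the PROC SQL version, this flags a company if anyone there sells apples; it does not check the salesperson.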
Answer 1 (score: 0)
MS SQL Server: a solution could look like this:
-- Table variable holding the sample data from the question
declare @tbl as table (
    company      varchar(1)
    ,product     varchar(10)
    ,salesPerson varchar(10)
);
insert into @tbl values ('A', 'Apple', 'John');
insert into @tbl values ('A', 'Banana', 'John');
insert into @tbl values ('A', 'Orange', 'Jane');
insert into @tbl values ('B', 'Orange', 'John');
insert into @tbl values ('B', 'Banana', 'Sam');

-- Flag every row whose company has at least one row where John sells apples
SELECT
    company
    ,product
    ,salesPerson
    ,CASE WHEN
        company IN (SELECT company FROM @tbl WHERE product = 'Apple' AND salesPerson = 'John') THEN 1
        ELSE 0
     END AS IND
FROM @tbl;
Answer 2 (score: 0)
Assume the table is named X and the IND column already exists with empty (NULL) values.
-- Companies where John sells apples get 1 (DISTINCT is unnecessary inside IN)
update X
SET IND = 1
WHERE Company IN (select Company from X where Product = 'Apple' AND Salesperson = 'John');

-- All other companies get 0
update X
SET IND = 0
WHERE Company NOT IN (select Company from X where Product = 'Apple' AND Salesperson = 'John');
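The two updates can probably be folded into one statement with a `CASE` expression, which also avoids leaving IND as NULL for rows where `NOT IN` evaluates to unknown. The following is a minimal sketch only, run against SQLite through Python's `sqlite3` module; the in-memory database and the sample rows (copied from the question) are assumptions, and other engines may restrict subqueries on the table being updated.

```python
import sqlite3

# Assumption: an in-memory SQLite table named X with an empty IND column.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE X (Company TEXT, Product TEXT, Salesperson TEXT, IND INTEGER)"
)
conn.executemany(
    "INSERT INTO X (Company, Product, Salesperson) VALUES (?, ?, ?)",
    [
        ("A", "Apple", "John"),
        ("A", "Banana", "John"),
        ("A", "Orange", "Jane"),
        ("B", "Orange", "John"),
        ("B", "Banana", "Sam"),
    ],
)

# One pass: 1 if the row's company has a row where John sells apples, else 0.
conn.execute(
    """
    UPDATE X
    SET IND = CASE
        WHEN Company IN (
            SELECT Company FROM X
            WHERE Product = 'Apple' AND Salesperson = 'John'
        ) THEN 1
        ELSE 0
    END
    """
)

result = conn.execute(
    "SELECT Company, Product, Salesperson, IND FROM X ORDER BY rowid"
).fetchall()
for row in result:
    print(row)
```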