在SPSS Statistics语法文件中,我希望创建一个变量,该变量根据所需的分区列(例如,等效于SQL“在Oracle SQL开发人员中排名(按column_a按b划分列))”来计算排名。 / p>
请参见示例:
没有任何过滤器的初始数据:
应用get_rank后的最终输出:
答案 0 :(得分:0)
To create a rank variable as described, first sort your data and then use the LAG
function.
SORT CASES BY column_a column_b .
IF ($CASENUM=1) rank=1 .
IF ($CASENUM>1 AND column_a~=LAG(column_a)) rank=1 .
IF ($CASENUM>1 AND column_a=LAG(column_a)) rank=LAG(rank) + 1 .
EXE .
LAG
will look at the value of column_a for the prior case. In the syntax above it checks whether the value in column_a is different from that of the prior case.
If it has, then it will set the rank to 1. If it hasn't, then it will add 1 to the rank of the prior case. Just make sure your data is properly sorted first.
From there, if you want to look only at records that are rank=1, you can either use FILTER BY
or SELECT IF
to do that.
答案 1 :(得分:0)
如果确实只需要过滤key=1
的密钥,则可以使用以下方法:
SORT CASES BY column_a column_b .
match files /file=* /by column_a /first=key1.
现在,变量key1
对于column_a
类别的第一次出现将具有值1,您可以将其用于filter
或select
。
对于完整排名变量,您可以使用此变量(甚至不需要先排序):
RANK VARIABLES=b (A) BY a /RANK /TIES=MEAN.