Question

SampleTable:

ID      Score1      Score2
1       100         88
1       96          94
1       94          95
2       100         100
2       98          94
3       77          88

So I want the return value to be 2, because 2 unique people each have at least one instance where Score1 > Score2.

For reproducibility:

df = data.frame( ID=c(1,1,1,2,2,3), Score1=c(100,96,94,100,98,77), Score2=c(88,94,95,100,94,88) )

I was thinking of:

length( unique( which( df$Score1 > df$Score2 ) ) )

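However, this returns 3, apparently because it does not look at the unique values of df$ID; it only counts the distinct rows where the condition holds. How do I make the count respect unique df$ID values?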

Answer 1

I think you're looking for this in base R:

length(unique(df$ID[df$Score1 > df$Score2]))
[1] 2
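
As a rough illustration of why the original attempt gave 3: `which()` returns row positions, which are always distinct, so `unique()` has nothing to collapse. Mapping those rows back to `df$ID` first is what makes the count per person. The data frame is the one from the question; the names `rows` and `ids` are only for illustration.

```r
df <- data.frame(
  ID     = c(1, 1, 1, 2, 2, 3),
  Score1 = c(100, 96, 94, 100, 98, 77),
  Score2 = c(88, 94, 95, 100, 94, 88)
)

# Row positions where Score1 > Score2; positions never repeat
rows <- which(df$Score1 > df$Score2)
rows
# [1] 1 2 5

# IDs found on those rows; the same ID can appear more than once
ids <- df$ID[rows]
ids
# [1] 1 1 2

# Collapse repeated IDs, then count
length(unique(ids))
# [1] 2
```

Indexing through `which()` also drops rows where the comparison is `NA`, whereas plain logical indexing would carry them through as `NA` entries.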

Or using data.table:

library(data.table)
setDT(df)[Score1 > Score2, uniqueN(ID)]

Or dplyr:

library(dplyr)
df %>% filter(Score1 > Score2) %>% { n_distinct(.$ID) }

Answer 2

Building on your code, take the unique values of the ID column (column 1) over the rows that match:

length(unique(df[df$Score1>df$Score2,1]))
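
If it helps to see which IDs qualify rather than just how many, one possible base R variant (not from the answers above, just a sketch) is to ask per ID whether any row satisfies the condition. The name `has_win` is made up for this example.

```r
df <- data.frame(
  ID     = c(1, 1, 1, 2, 2, 3),
  Score1 = c(100, 96, 94, 100, 98, 77),
  Score2 = c(88, 94, 95, 100, 94, 88)
)

# For each ID: TRUE if at least one of its rows has Score1 > Score2
has_win <- tapply(df$Score1 > df$Score2, df$ID, any)

# Number of IDs with at least one such row
sum(has_win)
# [1] 2

# Which IDs those are (tapply names the result by ID, as character)
names(has_win)[has_win]
# [1] "1" "2"
```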

R - Number of unique vals in Col1 of data table when Col2 > Col3
