Count the number of distinct values in col2 for each distinct value in col1

Time: 2014-08-25 10:41:47

Tags: r

I have a data frame like this:

df <- data.frame(
          SchoolID=c("A","A","B","B","C","D"),
          Country=c("XX","XX","XX","YY","ZZ","ZZ"))

which gives me this data:

    SchoolID    Country
1   A           XX
2   A           XX
3   B           XX
4   B           YY
5   C           ZZ
6   D           ZZ

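I want to know whether each SchoolID is uniquely assigned to a Country. To check this, I want to find the number of unique Country values for each distinct value of SchoolID. So I would like to obtain a table like this: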

SchoolID   NumberOfCountry
A          1
B          2
C          1
D          1

3 answers:

Answer 0 (score: 3)

aggregate(Country ~ SchoolID, df, function(x) length(unique(x)))

Or

tapply(df$Country, df$SchoolID, function(x) length(unique(x)))

Or

library(data.table) 
setDT(df)[, .(NumberOfCountry = length(unique(Country))), by = SchoolID]

For data.table v > 1.9.5:

setDT(df)[, .(NumberOfCountry = uniqueN(Country)), by = SchoolID]

Or

library(dplyr)
df %>% 
  group_by(SchoolID) %>% 
  summarise(NumberOfCountry = n_distinct(Country))
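To answer the underlying question (is every SchoolID assigned to exactly one Country?), the counts can be filtered for values greater than 1. The following is a minimal base R sketch built on the aggregate() call from this answer; the names `counts` and `ambiguous` are only illustrative and not part of the original answer.

```r
df <- data.frame(
  SchoolID = c("A", "A", "B", "B", "C", "D"),
  Country  = c("XX", "XX", "XX", "YY", "ZZ", "ZZ"))

# Count distinct Country values per SchoolID (same aggregate() call as above)
counts <- aggregate(Country ~ SchoolID, df, function(x) length(unique(x)))

# Rename the count column to match the table requested in the question
names(counts)[2] <- "NumberOfCountry"

# SchoolIDs that map to more than one Country (hypothetical helper variable)
ambiguous <- as.character(counts$SchoolID[counts$NumberOfCountry > 1])

print(counts)
print(ambiguous)
```

If `ambiguous` is empty, every SchoolID in the data is tied to a single Country.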

Answer 1 (score: 1)

An approach that does not depend on third-party libraries:

> as.data.frame(rowSums(table(df[!duplicated(df), ]), na.rm=T))
  rowSums(table(df[!duplicated(df), ]), na.rm = T)
A                                                1
B                                                2
C                                                1
D                                                1
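The one-liner above is dense, so here is a step-by-step sketch of what it appears to do, assuming the same `df` as in the question. The intermediate names (`pairs`, `tab`, `n`, `result`) are introduced only for illustration.

```r
df <- data.frame(
  SchoolID = c("A", "A", "B", "B", "C", "D"),
  Country  = c("XX", "XX", "XX", "YY", "ZZ", "ZZ"))

# Step 1: keep one row per distinct (SchoolID, Country) pair
pairs <- df[!duplicated(df), ]

# Step 2: cross-tabulate; each cell is 1 if the pair occurs, 0 otherwise
tab <- table(pairs)

# Step 3: summing each row counts the distinct countries per school
n <- rowSums(tab)

# Step 4: reshape into the two-column table requested in the question
result <- data.frame(SchoolID = names(n), NumberOfCountry = as.vector(n))
print(result)
```

Because duplicates are removed first, every cell of the cross-tabulation is 0 or 1, which is why the row sums equal the number of distinct countries.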

Answer 2 (score: -1)

Try this..

select SchoolID, count(Country) as NumberOfCountry
from (
  select distinct SchoolID, Country
  from tbl_stacko) temp
group by SchoolID