Question

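I have a table named table_wo_nas with several columns, one of which is titled ID. Each value of ID has many rows. I want to write a function that takes an input x and outputs a data frame containing the number of rows for each ID, with the column headers ID and nobs respectively, as shown below for x <- c(2,4,8).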

##   id nobs
## 1  2 1041
## 2  4  474
## 3  8  192

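This is what I have. It works when x is a single value (e.g. 3), but not when x contains multiple values, e.g. 1:10 or c(2,5,7). I get the warning "In ID[counter] <- x : number of items to replace is not a multiple of replacement length". I have only just started learning R and have been struggling with this for a week. I have searched the manual, this site, Google, everything. Can someone help?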

counter <- 1
ID <- vector("numeric") ## contain x
nobs <- vector("numeric") ## contain nrow
for (i in x) {
    r <- subset(table_wo_nas, ID %in% x) ## create subset for rows of ID=x
    ID[counter] <- x ## add x to ID
    nobs[counter] <- nrow(r) ## add nrow to nobs
    counter <- counter + 1 } ## loop
result <- data.frame(ID, nobs) ## create data frame

Answer 1

In base R,

# To make a named vector, either:
tmp <- sapply(split(table_wo_nas, table_wo_nas$ID), nrow) 

# OR just:
tmp <- table(table_wo_nas$ID)

# AND
# arrange into data.frame; as.vector() drops the table class
# so that nobs becomes a single column of counts
nobs_df <- data.frame(ID = names(tmp), nobs = as.vector(tmp))

Or coerce the table directly to a data.frame and rename the columns:

nobs_df <- data.frame(table(table_wo_nas$ID))
names(nobs_df) <- c('ID', 'nobs')

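If you only want certain IDs, subset on the ID column (indexing by row position only works when the IDs happen to be 1, 2, 3, ...):

nobs_df[nobs_df$ID %in% c(2, 4, 8), ]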

There are many, many more options; these are just a few.
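
Since the question asks for a function of x, the table approach can be wrapped up roughly as follows. This is only a sketch: the name count_obs and the small example data frame are made up for illustration, and the only thing assumed about the real table is that it has an ID column.

# Hypothetical example data standing in for table_wo_nas
table_wo_nas <- data.frame(
  ID    = c(2, 2, 2, 4, 4, 8, 9),
  value = c(1.5, 2.0, 0.3, 4.1, 2.2, 3.3, 0.9)
)

# count_obs is an illustrative name, not part of any package
count_obs <- function(x, data = table_wo_nas) {
  counts <- table(data$ID)                    # rows per ID, named by ID
  out <- data.frame(id   = as.numeric(names(counts)),
                    nobs = as.vector(counts))
  out <- out[out$id %in% x, ]                 # keep only the requested IDs
  rownames(out) <- NULL                       # renumber rows from 1
  out
}

count_obs(c(2, 4, 8))

Note that the rows come back in the sorted order of the IDs in the table, not in the order given in x, and that IDs in x with no rows are simply absent from the result.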

With dplyr,

library(dplyr)
table_wo_nas %>% group_by(ID) %>% summarise(nobs = n())

If you only want certain IDs, add a filter:

table_wo_nas %>% group_by(ID) %>% summarise(nobs = n()) %>% filter(ID %in% c(2, 4, 8))

Answer 2

This seems quite simple if you use table:

tbl <- table(table_wo_nas[, 'ID'])
# as.vector() keeps nobs as one column of counts
data.frame(IDs = names(tbl), nobs = as.vector(tbl))

Or, for a quick answer, albeit with different column names:

as.data.frame(table( table_wo_nas[ , 'ID'] ))
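
One thing to watch with table is that an ID with no rows does not appear in the result at all. If x may contain such values and a zero count is wanted, one possible sketch is to convert the column to a factor whose levels are exactly x before tabulating. The example data below is invented, and x is assumed to contain no duplicates (factor levels must be unique).

# Hypothetical example data standing in for table_wo_nas
table_wo_nas <- data.frame(ID = c(2, 2, 2, 4, 4, 8, 9))

x <- c(2, 4, 5)   # 5 does not occur in the data

# Restricting the factor levels to x drops other IDs and keeps empty ones
tbl <- table(factor(table_wo_nas$ID, levels = x))
result <- data.frame(id = x, nobs = as.vector(tbl))
result

Here the rows follow the order of x, because table reports counts in level order.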

Answer 3

Try this.

x <- c(2, 4, 8)
df <- table_wo_nas  # df is your data frame
count_of <- function(x) {
  count_of_id <- integer(length(x))
  for (i in seq_along(x)) {
    # find the number of rows for each value of x
    count_of_id[i] <- length(which(df$ID == x[i]))
  }
  # return a data frame rather than a matrix
  data.frame(id = x, nobs = count_of_id)
}
count_of(x)
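
For reference, the warning in the question comes from ID[counter] <- x, which tries to store the whole vector x in a single element, so only its first value is kept. The subset also tests ID %in% x instead of the current value i, so every pass counts the rows for all of x at once. A minimal corrected version of the question's loop might look like the sketch below; the example data is invented, and the result vector is renamed to ids so it is not confused with the ID column.

# Hypothetical example data standing in for table_wo_nas
table_wo_nas <- data.frame(ID = c(2, 2, 2, 4, 4, 8, 9))

x <- c(2, 4, 8)
ids  <- numeric(length(x))   # will hold each requested ID
nobs <- integer(length(x))   # will hold the matching row count

for (k in seq_along(x)) {
  i <- x[k]
  r <- table_wo_nas[table_wo_nas$ID == i, , drop = FALSE]  # rows for this one ID
  ids[k]  <- i          # a single value, so no length mismatch
  nobs[k] <- nrow(r)
}

result <- data.frame(id = ids, nobs = nobs)
result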

Passing a vector with multiple values to an R function to generate a data frame