nrow issue in an R data frame: row-wise count of nonzero values

Time: 2017-10-12 21:03:49

Tags: r

Here is my data:

df <- data.frame(id=c(1,2,3,4,5,6,7), Group1=c(1,0,1,0,1,0,0),Group2=c(0,2,0,0,0,2,0),Group3=c(0,0,3,0,0,0,0),Group4=c(0,0,0,0,0,0,4),Group5=c(5,0,0,5,0,0,0), State=c("MD","VA","VA","VA","NC","VT","MD"))

I am trying to create a field in this data frame that counts, row by row, how many of a handful of columns hold a value other than 0.

I tried:

df$count <- rowSums(df$Group1== 1|df$Group2== 2|df$Group3== 3| df$Group4== 4| df$Group5== 5)

and got this error:

Error in rowSums(df$Group1 == 1 | df$Group2 == 2 | df$Group3 == 3 | df$Group4 ==  : 
  'x' must be an array of at least two dimensions

The final result I want looks like this:

ID  Group1  Group2  Group3  Group4  Group5  State   count
1   1       0       0       0       5       MD      2
2   0       2       0       0       0       VA      1
3   1       0       3       0       0       VA      2
4   0       0       0       0       5       VA      1
5   1       0       0       0       0       NC      1
6   0       2       0       0       0       VT      1
7   0       0       0       4       0       MD      1

1 answer:

Answer 0 (score: 1)

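The error occurs because df$Group1 == 1 | df$Group2 == 2 | df$Group3 == 3 | df$Group4 == 4 | df$Group5 == 5 returns a logical vector, and rowSums cannot be applied to a vector: it needs an array of at least two dimensions, such as a matrix or a data frame.

Either of the following options should work here.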
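To illustrate the point about dimensions, here is a minimal sketch using the numeric columns from the question (the State column is left out for brevity). The names any_hit, hits, and hit_count are introduced here for illustration only. It shows that the | expression has no dim attribute, and that binding the same comparisons as columns gives rowSums something it can work with.

```r
# Same numeric data as in the question (State omitted for brevity)
df <- data.frame(id = c(1, 2, 3, 4, 5, 6, 7),
                 Group1 = c(1, 0, 1, 0, 1, 0, 0),
                 Group2 = c(0, 2, 0, 0, 0, 2, 0),
                 Group3 = c(0, 0, 3, 0, 0, 0, 0),
                 Group4 = c(0, 0, 0, 0, 0, 0, 4),
                 Group5 = c(5, 0, 0, 5, 0, 0, 0))

# Combining the comparisons with | collapses them to one TRUE/FALSE per row
any_hit <- df$Group1 == 1 | df$Group2 == 2 | df$Group3 == 3 |
  df$Group4 == 4 | df$Group5 == 5
is.null(dim(any_hit))  # TRUE: a plain vector, which rowSums() rejects

# Binding the comparisons as columns keeps one column per condition
hits <- cbind(df$Group1 == 1, df$Group2 == 2, df$Group3 == 3,
              df$Group4 == 4, df$Group5 == 5)
dim(hits)              # 7 5

# rowSums() treats TRUE as 1, so this counts the matches in each row
hit_count <- rowSums(hits)
```

Note that even if rowSums accepted a vector, the | version could only say whether any condition matched in a row, not how many did.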

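## option 1: loop over the rows (base R)
group_cols <- c("Group1", "Group2", "Group3", "Group4", "Group5")
df$count <- numeric(nrow(df))  # preallocate the new column
for (i in seq_len(nrow(df))) {
  # count the nonzero entries among the group columns of row i
  df$count[i] <- sum(df[i, group_cols] != 0)
}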

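## option 2: data.table
library(data.table)
setDT(df)  # converts df to a data.table by reference
# .SD holds only the columns listed in .SDcols
df[, count := rowSums(.SD != 0), .SDcols = c("Group1", "Group2", "Group3", "Group4", "Group5")]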
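Since rowSums already works across all rows at once, a single base R call may be enough, with no loop and no extra package. The following is a minimal sketch under that assumption, using the data from the question; the name group_cols is introduced here for illustration.

```r
# Data from the question
df <- data.frame(id = c(1, 2, 3, 4, 5, 6, 7),
                 Group1 = c(1, 0, 1, 0, 1, 0, 0),
                 Group2 = c(0, 2, 0, 0, 0, 2, 0),
                 Group3 = c(0, 0, 3, 0, 0, 0, 0),
                 Group4 = c(0, 0, 0, 0, 0, 0, 4),
                 Group5 = c(5, 0, 0, 5, 0, 0, 0),
                 State = c("MD", "VA", "VA", "VA", "NC", "VT", "MD"))

# Columns to inspect (hypothetical helper name)
group_cols <- paste0("Group", 1:5)

# Comparing a data frame with 0 gives a logical matrix;
# rowSums() then counts the TRUE values in each row
df$count <- rowSums(df[group_cols] != 0)

print(df)
```

The resulting count column can be compared against the desired table in the question.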