Question

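I am trying to remove certain rows from my data by adding them to another row as additional columns. Is there a way to group rows together by a specific variable?

I tried using group_by from the dplyr package, but it does not seem to solve my problem.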

library(dplyr)
late <- read.csv(file.choose())
late <- group_by(late, state, add = FALSE)

The dataset I have now (named "late") has the following form:

ontime   state   count
0        AL      1
1        AL      44
null     AL      3
0        AR      5
1        AR      50
...

But I would like it to look like this:

state    count0    count1    countnull
AL       1         44        3
AR       5         50        null
...

Ultimately, I want to compute count0 / count1 for each state, so if there is a better approach, I would welcome any suggestions.

Answer 1

You can do this with dcast() from the reshape2 package:

library(reshape2)

df = data.frame(
  ontime = c(0,1,NA,0,1),
  state = c("AL","AL","AL","AR","AR"),
  count = c(1,44,3,5,50)
)

# One row per state, one column per ontime value; value.var names the column to spread
dcast(df, state ~ ontime, value.var = "count")
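Since the end goal is count0 / count1 per state, the ratio can be added straight onto the cast result. The following is only a sketch under the assumption that the ontime values 0 and 1 become columns literally named "0" and "1"; the names wide and ratio are illustrative and not part of the original answer.

```r
library(reshape2)

# Same sample data as in the answer above
df <- data.frame(
  ontime = c(0, 1, NA, 0, 1),
  state  = c("AL", "AL", "AL", "AR", "AR"),
  count  = c(1, 44, 3, 5, 50)
)

# Cast to wide form: one row per state, one column per ontime value
wide <- dcast(df, state ~ ontime, value.var = "count")

# Column names that start with a digit are easiest to reach with [[ ]]
wide$ratio <- wide[["0"]] / wide[["1"]]

wide
```

Using [[ ]] avoids having to backtick-quote the non-syntactic column names.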

Answer 2

Using spread:

library(dplyr)
library(tidyr)

df %>%
  mutate(ontime = paste0('count', ontime)) %>%
  spread(ontime, count)

Output:

  state count0 count1 countnull
1    AL      1     44         3
2    AR      5     50        NA

Data:

df <- structure(list(ontime = structure(c(1L, 2L, 3L, 1L, 2L), .Label = c("0", 
"1", "null"), class = "factor"), state = structure(c(1L, 1L, 
1L, 2L, 2L), .Label = c("AL", "AR"), class = "factor"), count = c(1L, 
44L, 3L, 5L, 50L)), class = "data.frame", row.names = c(NA, -5L
))
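In current tidyr releases spread() is superseded by pivot_wider(), so the same reshape can also be written that way, with the count0 / count1 ratio computed in the same pipeline. The sketch below assumes tidyr 1.0.0 or later and that ontime is stored as the strings "0", "1" and "null" as in the question; the names wide and ratios are illustrative. It also shows why group_by() alone did nothing visible: it only marks groups, and a following summarise() is what collapses them.

```r
library(dplyr)
library(tidyr)

# Sample data in the shape described in the question
late <- data.frame(
  ontime = c("0", "1", "null", "0", "1"),
  state  = c("AL", "AL", "AL", "AR", "AR"),
  count  = c(1L, 44L, 3L, 5L, 50L)
)

# Option 1: reshape to wide form, then divide the two columns
wide <- late %>%
  pivot_wider(
    names_from   = ontime,   # values of ontime become column names
    values_from  = count,    # cells are filled from count
    names_prefix = "count"   # gives count0, count1, countnull
  ) %>%
  mutate(ratio = count0 / count1)

# Option 2: skip the reshape and compute the ratio per group directly
ratios <- late %>%
  group_by(state) %>%
  summarise(
    ratio = sum(count[ontime %in% "0"]) / sum(count[ontime %in% "1"])
  )
```

Option 2 uses %in% rather than == so that a genuinely missing ontime value would count as "no match" instead of propagating NA into the sums.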

Is there an R function to group a table by a specific variable?
