Replacing "x" in the multiple-response columns of a data frame before reshaping with tidyr

Time: 2016-06-27 05:05:08

Tags: r dplyr reshape2 tidyr

Below is a simple data frame.

Program <- c("A","B","C","D","E")
Apartment <- c("x","","","x","")
House <- c("x","","x","","")
Condo <- c("","x","","","x")
Cat <- c("x","","x","","")
Dog <- c("","x","","","")
Fish <- c("","x","","x","x")

DF1 <- data.frame(Program,Apartment,House,Condo,Cat,Dog,Fish)

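Using this data frame, I would like to use tidyr to create the table below. The table gives a count of each kind of pet for each type of housing. So for people with an apartment, there is one instance of a cat and one instance of a fish.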

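To get there, I first have to replace the "x" in each pet column with that column's pet name before melting the data. I would like to know how to do this for all columns in a single line of code or in one function.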

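I have also been unable to produce the table in the exact form below with either tidyr or reshape2. (The table below does not line up perfectly, but each number should sit under its pet name. So for the first row, 1 should be under Cat, 0 under Dog, 1 under Fish, and so on.)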

      variable         Cat      Dog    Fish 
1     Apartment          1        0       1
2     House              2        0       0
3     Condo              0        1       2

2 Answers:

Answer 0 (score: 2)

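We can try dplyr/tidyr. There is no need to replace the "x" values first: gather the housing columns, keep the marked rows, gather the pet columns, and count.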

library(dplyr)
library(tidyr)
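DF1 %>% 
    # stack the three housing columns into key/value pairs
    gather(House, Val, Apartment:Condo) %>% 
    # keep only the housing types that are actually marked
    filter(Val!="") %>% 
    # stack the three pet columns the same way
    gather(Animals, Val2, Cat:Fish) %>%
    group_by(House, Animals) %>%
    # count non-empty cells (summarise_each() and funs() are deprecated in current dplyr)
    summarise_each(funs(sum(.!='')), Val:Val2)  %>%
    # one column per pet, then drop the helper count
    spread(Animals, Val2) %>%
    select(-Val)   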
#      House   Cat   Dog  Fish
#      <chr> <int> <int> <int>
#1 Apartment     1     0     1
#2     Condo     0     1     2
#3     House     2     0     0
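
For readers on a recent tidyr/dplyr, the same pipeline can be sketched with pivot_longer() and pivot_wider(). This is only an illustration of the approach above, not part of the original answer; the names Housing, Animal, has_housing, has_animal and n are made up for the example, and it assumes dplyr >= 1.0.0 (for the .groups argument) and tidyr >= 1.0.0.

```r
library(dplyr)
library(tidyr)

# Same data as in the question, rebuilt so the snippet runs on its own
DF1 <- data.frame(
  Program   = c("A", "B", "C", "D", "E"),
  Apartment = c("x", "", "", "x", ""),
  House     = c("x", "", "x", "", ""),
  Condo     = c("", "x", "", "", "x"),
  Cat       = c("x", "", "x", "", ""),
  Dog       = c("", "x", "", "", ""),
  Fish      = c("", "x", "", "x", "x"),
  stringsAsFactors = FALSE
)

result <- DF1 %>%
  # long format over the housing columns (hypothetical column names)
  pivot_longer(Apartment:Condo, names_to = "Housing", values_to = "has_housing") %>%
  # keep only the housing types that are marked
  filter(has_housing == "x") %>%
  # long format over the pet columns
  pivot_longer(Cat:Fish, names_to = "Animal", values_to = "has_animal") %>%
  group_by(Housing, Animal) %>%
  # count the marked pets within each housing type
  summarise(n = sum(has_animal == "x"), .groups = "drop") %>%
  # back to one column per pet
  pivot_wider(names_from = Animal, values_from = n)

print(result)
```

The result has one row per housing type (sorted alphabetically, as in the output above) and one integer column per pet.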

Answer 1 (score: 1)

A base R version:

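# TRUE wherever a cell is marked "x"
tmp <- data.frame(DF1[-1]=="x")
# stack the housing columns into (values, ind); the pet columns are recycled alongside
tmp <- data.frame(stack(tmp[1:3]), tmp[4:6])

# sum the pet flags per housing type, using only rows where the housing flag is TRUE
aggregate(cbind(Cat,Dog,Fish) ~ ind, data=tmp, subset=tmp$values, FUN=sum)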

#        ind Cat Dog Fish
#1 Apartment   1   0    1
#2     Condo   0   1    2
#3     House   2   0    0
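
Another base R possibility, offered only as a sketch and not taken from either answer: since every cell is just a marked/unmarked flag, the housing-by-pet counts are the cross-product of two 0/1 matrices. The variable names housing, pets and counts are made up for the example.

```r
# Same data as in the question, rebuilt so the snippet runs on its own
DF1 <- data.frame(
  Program   = c("A", "B", "C", "D", "E"),
  Apartment = c("x", "", "", "x", ""),
  House     = c("x", "", "x", "", ""),
  Condo     = c("", "x", "", "", "x"),
  Cat       = c("x", "", "x", "", ""),
  Dog       = c("", "x", "", "", ""),
  Fish      = c("", "x", "", "x", "x"),
  stringsAsFactors = FALSE
)

# 0/1 indicator matrices: one row per program, one column per category
housing <- (DF1[c("Apartment", "House", "Condo")] == "x") * 1
pets    <- (DF1[c("Cat", "Dog", "Fish")] == "x") * 1

# t(housing) %*% pets: entry [h, p] counts programs marked for both h and p
counts <- crossprod(housing, pets)
print(counts)
```

The rows come out in the original column order (Apartment, House, Condo), which matches the layout asked for in the question. The result is a matrix; as.data.frame(counts) converts it if a data frame is needed.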