Tidyverse solution for getting counts after filtering on different combinations of variables

Time: 2017-04-30 06:16:05

Tags: r dplyr tidyverse purrr

library(tidyverse)

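With the code below, I want to use table() or dplyr to get counts of the Sat variables (Q1Sat, Q2Sat, Q3Sat). However, Q1Sat is tied to Q1Used, Q2Sat to Q2Used, and Q3Sat to Q3Used. For each pairing, I want to filter out "No" in the Used variable, as well as "No" in the House variable.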

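So, for example, to get the counts for Q1Sat, I need to filter out "No" in both Q1Used and House. For Q2Sat, I need to filter out "No" in Q2Used and House, and for Q3Sat I need to filter out "No" in Q3Used and House.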

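Is there a simple way to do this with the Tidyverse (with the least amount of code)? I'd like to use the latest versions of the Tidyverse packages, including the dev version of dplyr if necessary.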

Q1Sat<-c("Neutral","Neutral","VSat","Sat","Neutral","Sat","VDis","Sat","Sat","VSat")
Q2Sat<-c("Neutral","VSat","Dis","Dis","VDis","Sat","Sat","VSat","Neutral","Dis")
Q3Sat<-c("Sat","Sat","Diss","Neutral","VSat","VDis","Sat","Sat","Sat","Neutral")
Q3Used<-c("Yes","No","Yes","Yes","Yes","Yes","Yes","Yes","Yes","No")
Q2Used<-c("Yes","Yes","Yes","Yes","No","No","Yes","Yes","Yes","Yes")
Q1Used<-c("Yes","Yes","Yes","No","No","Yes","Yes","Yes","No","Yes")
House<-c("Yes","No","Unsure","Yes","Yes","No","Unsure","Unsure","Yes","Yes")

Test <- tibble(Q1Sat, Q2Sat, Q3Sat, Q1Used, Q2Used, Q3Used, House)

2 answers:

Answer 0: (score: 1)

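Test %>% 
  # keep each Sat value only where the matching Used column is "Yes"; otherwise NA
  mutate(q1 = ifelse(Q1Used == "Yes", Q1Sat, NA), 
         q2 = ifelse(Q2Used == "Yes", Q2Sat, NA), 
         q3 = ifelse(Q3Used == "Yes", Q3Sat, NA)) %>% 
  select(q1:q3) %>% 
  # tabulate each column; table() ignores NA by default
  sapply(table)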

$q1

Neutral     Sat    VDis    VSat 
      2       2       1       2 

$q2

    Dis Neutral     Sat    VSat 
      3       2       1       2 

$q3

   Diss Neutral     Sat    VDis    VSat 
      1       1       4       1       1 
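
Note that the snippet above filters only on the Used columns; the House condition from the question is not applied. Below is a minimal sketch of one way to apply both filters. It assumes tidyr 1.0.0 or later for pivot_longer(), which is newer than this thread, and the names Question and counts are illustrative rather than taken from the original post:

```r
library(dplyr)
library(tidyr)

# Rebuild the question's data so the sketch runs on its own
Test <- tibble(
  Q1Sat  = c("Neutral","Neutral","VSat","Sat","Neutral","Sat","VDis","Sat","Sat","VSat"),
  Q2Sat  = c("Neutral","VSat","Dis","Dis","VDis","Sat","Sat","VSat","Neutral","Dis"),
  Q3Sat  = c("Sat","Sat","Diss","Neutral","VSat","VDis","Sat","Sat","Sat","Neutral"),
  Q1Used = c("Yes","Yes","Yes","No","No","Yes","Yes","Yes","No","Yes"),
  Q2Used = c("Yes","Yes","Yes","Yes","No","No","Yes","Yes","Yes","Yes"),
  Q3Used = c("Yes","No","Yes","Yes","Yes","Yes","Yes","Yes","Yes","No"),
  House  = c("Yes","No","Unsure","Yes","Yes","No","Unsure","Unsure","Yes","Yes")
)

counts <- Test %>%
  # One row per respondent and question, with paired Sat / Used columns
  pivot_longer(
    cols = -House,
    names_to = c("Question", ".value"),
    names_pattern = "(Q\\d)(Sat|Used)"
  ) %>%
  # Drop rows where either the matching Used column or House is "No"
  filter(Used != "No", House != "No") %>%
  # Count each satisfaction level within each question
  count(Question, Sat)

counts
```

House values of "Unsure" are kept here because the question only asks to drop "No"; use House == "Yes" in the filter instead if those rows should go too.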

Answer 1: (score: 1)

Here is an option using data.table. We convert the 'data.frame' to a 'data.table' (setDT(Test)), reshape it to 'long' format with melt by specifying patterns in measure, group by 'Qs' and 'Sat', get the count where 'Used' is 'Yes', and reshape it back to 'wide' format with dcast.

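library(data.table)
# nm1: labels for the melted question index (assumed; matches the output below)
nm1 <- c("Q1", "Q2", "Q3")
dcast(melt(setDT(Test), measure = patterns("Sat", "Used"), 
   value.name = c("Sat", "Used"), variable.name = 'Qs')[
   Used == "Yes", .N , .(Qs, Sat)], Qs~Sat, fill=0)[, Qs := nm1[Qs]][]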
#   Qs Dis Diss Neutral Sat VDis VSat
#1: Q1   0    0       2   2    1    2
#2: Q2   3    0       2   1    0    2
#3: Q3   0    1       1   4    1    1

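We can also do this more compactly using base R. This assumes Test is still a plain data frame or tibble; after setDT(Test), Test[1:3] would select rows rather than columns.
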
un1 <- unique(unlist(Test[1:3]))
t(mapply(function(x,y) table(factor(x[y == "Yes"], levels = un1)), Test[1:3], Test[4:6]))

Or even more compactly

table(col(Test[1:3]), unlist(replace(Test[1:3], Test[4:6]!= "Yes", NA)))
#    Dis Diss Neutral Sat VDis VSat
#1   0    0       2   2    1    2
#2   3    0       2   1    0    2
#3   0    1       1   4    1    1
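
Since the question is tagged purrr, the mapply() pairing above has a close analogue in purrr::map2(). The following is only a sketch: it also adds the House condition from the question, which the snippets above leave out, and the names house_ok and counts_list are made up for illustration:

```r
library(purrr)

# Rebuild the question's data so the sketch runs on its own
Test <- data.frame(
  Q1Sat  = c("Neutral","Neutral","VSat","Sat","Neutral","Sat","VDis","Sat","Sat","VSat"),
  Q2Sat  = c("Neutral","VSat","Dis","Dis","VDis","Sat","Sat","VSat","Neutral","Dis"),
  Q3Sat  = c("Sat","Sat","Diss","Neutral","VSat","VDis","Sat","Sat","Sat","Neutral"),
  Q1Used = c("Yes","Yes","Yes","No","No","Yes","Yes","Yes","No","Yes"),
  Q2Used = c("Yes","Yes","Yes","Yes","No","No","Yes","Yes","Yes","Yes"),
  Q3Used = c("Yes","No","Yes","Yes","Yes","Yes","Yes","Yes","Yes","No"),
  House  = c("Yes","No","Unsure","Yes","Yes","No","Unsure","Unsure","Yes","Yes"),
  stringsAsFactors = FALSE
)

# Rows to keep for every question: House is anything but "No"
house_ok <- Test$House != "No"

# Walk the Sat columns (1:3) and the Used columns (4:6) in parallel,
# tabulating each Sat column over the rows that pass both filters
counts_list <- map2(
  Test[1:3], Test[4:6],
  function(sat, used) table(sat[used != "No" & house_ok])
)

counts_list$Q1Sat
```

The result is a named list with one table per Sat column, so levels that never occur for a question are simply absent rather than shown as 0.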