Having trouble incorporating reordering of data frame rows by a vector into my function

Asked: 2018-02-04 08:13:16

Tags: r tidyverse

library(tidyverse)
library(ggplot2) # for the diamonds dataset

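I can't get my function to work. In this example, using the diamonds dataset from ggplot2, I am trying to dplyr::group_by "cut" and "color" and then dplyr::summarise to get the counts. I'm using rlang and purrr to produce the two count summaries, rename one of the columns, and bind them together with purrr::map_df. Finally, I want to reorder the rows by the "Cut" column based on another vector called "Order". The function works until I try to incorporate the row reordering...

This might not make sense for this data, but it is only an example; it does make sense for my real data.

Anyway, the code below works...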

Groups<-list("cut","color")

 Groups<-Groups%>%
 map_df(function(group){

     syms<-syms(group)

     diamonds%>%
         group_by(!!!syms)%>%
         summarise(Count=n())%>%
         set_names(c("Cut","Count"))
 })

Next, I want to reorder the rows based on the "Order" vector, which also works.

Order<-c("Good","Very Good","Premium","Ideal","Fair","E","F","G","D","H","J","I")

Groups%>%slice(match(Order, Cut))
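
To see why this reorders the rows: match(Order, Cut) returns, for each element of Order, the position of its first occurrence in Cut, and slice() then picks the rows in that sequence. Below is a minimal base R sketch of the same idea. The data frame toy and its values are made up for illustration and are not taken from diamonds, and the sketch assumes every value of Order appears in Cut.

```r
# Hypothetical toy data standing in for the summarised counts
toy <- data.frame(Cut = c("Fair", "Good", "Ideal"),
                  Count = c(10L, 20L, 30L),
                  stringsAsFactors = FALSE)

Order <- c("Good", "Ideal", "Fair")

# Position in toy$Cut of each element of Order
idx <- match(Order, toy$Cut)
idx

# Base R row indexing by those positions, analogous to
# toy %>% slice(match(Order, Cut)) when every value is found
reordered <- toy[idx, ]
reordered$Cut
```

Here idx is 2, 3, 1, so the rows come out in the order Good, Ideal, Fair.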

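However, this is where I'm stuck. I'm trying to do all of this inside one function, but it doesn't seem to work. I feel like I'm missing something small...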

Fun<-function(df){

Order<-c("Good","Very Good","Premium","Ideal","Fair","E","F","G","D","H","J","I")

Groups<-list("cut","color")

 Groups<-Groups%>%
 map_df(function(group){

     syms<-syms(group)

     df%>%
         group_by(!!!syms)%>%
         summarise(Count=n())%>%
         set_names(c("Cut","Count"))%>%
         slice(match(Order,Cut))
return(df)
})
}

Here is another attempt...

Fun<-function(df){

Order<-c("Good","Very Good","Premium","Ideal","Fair","E","F","G","D","H","J","I")

Groups<-list("cut","color")

 Groups<-Groups%>%
 map_df(function(group){

     syms<-syms(group)

     df%>%
         group_by(!!!syms)%>%
         summarise(Count=n())%>%
         set_names(c("Cut","Count"))

df<-df%>%slice(match(Order,Cut))
return(df)
})
}

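What am I missing here?

3 Answers:

Answer 0 (score: 3)

We don't need to apply syms inside the loop. It can take a vector/list of length greater than 1 and convert each element to a symbol. So convert with syms first, then loop over the symbols with map and do the group_by on each symbol object.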

Fun<-function(df){

Order<-c("Good","Very Good","Premium","Ideal","Fair","E","F","G","D","H","J","I")

Groups<-list("cut","color")

Groups %>%
       syms %>%
       map_df(~ df %>%
               group_by(!! .x) %>%   # .x is a single symbol, so unquote it with !!
               summarise(Count=n()) %>%
               set_names(c("Cut","Count")) %>%
               slice(match(Order,Cut)) #%>%
               #mutate(Cut = as.character(Cut))
               #to avoid the warning about coercing factor to character
      )

}

Fun(diamonds)
# A tibble: 12 x 2
#   Cut       Count
#   <chr>     <int>
# 1 Good       4906
# 2 Very Good 12082
# 3 Premium   13791
# 4 Ideal     21551
# 5 Fair       1610
# 6 E          9797
# 7 F          9542
# 8 G         11292
# 9 D          6775
#10 H          8304
#11 J          2808
#12 I          5422
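
As a small illustration of the point about syms, the sketch below converts the whole list in one call and then builds, without evaluating, the group_by calls that result from unquoting. It assumes the rlang package is installed; group_syms is a name introduced here for illustration, and df inside expr() is only a placeholder symbol.

```r
library(rlang)

Groups <- list("cut", "color")

# syms() converts every string in the list to a symbol in one call
group_syms <- syms(Groups)
length(group_syms)
is_symbol(group_syms[[1]])
as_string(group_syms[[2]])

# A single symbol is unquoted with !!; a list of symbols is spliced with !!!
# expr() only builds the call, it does not evaluate group_by or df
one_group  <- expr(group_by(df, !!group_syms[[1]]))
all_groups <- expr(group_by(df, !!!group_syms))
one_group
all_groups
```

The first expression groups by cut alone, which is what each iteration of the map needs; the second groups by cut and color together, which would give a different summary.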

Answer 1 (score: 2)

Your first attempt at Fun works, but the result is assigned to the Groups variable and is not returned. Try the following:

Fun<-function(df){

Order<-c("Good","Very Good","Premium","Ideal","Fair","E","F","G","D","H","J","I")

Groups<-list("cut","color")

 Groups%>%
 map_df(function(group){

     syms<-syms(group)

     df%>%
         group_by(!!!syms)%>%
         summarise(Count=n())%>%
         set_names(c("Cut","Count"))%>%
         slice(match(Order,Cut))
     # no return(df) here: it would hand back the unmodified input instead of the summary
})
}

Fun(diamonds)
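
Two base R behaviours are relevant to the attempts in the question, independent of dplyr. A function whose last expression is an assignment returns its value invisibly, so nothing prints at the console. And return() inside an anonymous function exits only that inner function, with whatever value it is given. A minimal sketch, using hypothetical helper names (f_assign, f_plain, f_inner_return) and lapply in place of map_df:

```r
# Last expression is an assignment: the value is returned invisibly
f_assign <- function(x) {
  out <- x * 2
}

# Last expression is the value itself: printed when called at the console
f_plain <- function(x) {
  x * 2
}

# return(x) hands back the untouched input; the computed x * 2 is discarded
f_inner_return <- function(xs) {
  lapply(xs, function(x) {
    x * 2
    return(x)
  })
}

res_assign <- f_assign(3)   # the invisible value can still be captured
res_plain  <- f_plain(3)
res_inner  <- f_inner_return(list(1, 2))

withVisible(f_assign(3))$visible
withVisible(f_plain(3))$visible
```

Both f_assign(3) and f_plain(3) evaluate to 6, but only the second is visible; res_inner contains the original inputs 1 and 2 rather than the doubled values.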

Answer 2 (score: 1)

This may fix the problem. For simplicity, I created a temp_df variable and returned it.

Fun<-function(df){

  Order<-c("Good","Very Good","Premium","Ideal","Fair","E","F","G","D","H","J","I")

  Groups<-list("cut","color")

  Groups<-Groups%>%
    map_df(function(group){

      syms<-syms(group)

      temp <- df%>%
        group_by(!!!syms)%>%
        summarise(Count=n())%>%
        set_names(c("Cut","Count"))
    })

  temp_df <- Groups%>%slice(match(Order, Cut))
  return(temp_df)
}

> x <- Fun(diamonds)
> x
# A tibble: 12 x 2
   Cut       Count
   <chr>     <int>
 1 Good       4906
 2 Very Good 12082
 3 Premium   13791
 4 Ideal     21551
 5 Fair       1610
 6 E          9797
 7 F          9542
 8 G         11292
 9 D          6775
10 H          8304
11 J          2808
12 I          5422
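
One caveat about reordering with match: it keeps only the first row for each value of Order. That is fine here because each Cut value appears once after summarising. If values could repeat, ordering by a factor whose levels follow Order is one possible alternative. The base R sketch below compares the two on made-up data; toy, by_match and by_level are illustrative names, and this is not code from any of the answers.

```r
# Hypothetical toy data in which "Good" appears twice
toy <- data.frame(Cut = c("Fair", "Good", "Ideal", "Good"),
                  Count = c(1L, 2L, 3L, 4L),
                  stringsAsFactors = FALSE)

Order <- c("Good", "Ideal", "Fair")

# match() finds only the first occurrence of each value of Order
by_match <- toy[match(Order, toy$Cut), ]

# order() on a factor with levels = Order keeps every row
by_level <- toy[order(factor(toy$Cut, levels = Order)), ]

by_match$Count
by_level$Count
```

by_match has three rows (counts 2, 3, 1) and drops the second "Good" row, while by_level keeps all four rows (counts 2, 4, 3, 1).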