Replacing spread() with pivot_wider()

Time: 2020-01-21 22:28:55

Tags: r dplyr

I have a dataset that I am editing in R with the dplyr package. My code is:

hiphop %>%
  mutate(
    sex = case_when(
      sex == 1 ~ "female",
      sex == 0 ~ "male"
    )
  ) %>%
  group_by(sex) %>%
  summarise_at(vars(intl, vocal, classical, folk, rock, country, pop, alternative, hiphop, unclassifiable), funs(mean)) %>%
  pivot_longer(c(intl, vocal, classical, folk, rock, country, pop, alternative, hiphop, unclassifiable), names_to = "genre") %>%
  spread(sex, value) %>%
  mutate(
    genredifference = abs(female - male)
  ) %>%
  arrange(genredifference) %>%
  top_n(3)

This is the output I get:

Selecting by genredifference
# A tibble: 3 x 4
  genre   female  male genredifference
  <chr>    <dbl> <dbl>           <dbl>
1 country  0.786 0.392           0.394
2 vocal    0.880 1.57            0.688
3 rock     1.93  3.06            1.13 

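I would like to get the same output, but by replacing the spread() function with pivot_wider() (which I believe is the function to use). However, I can't figure out how to do it.

Thanks!

P.S.: In case you are interested, here is my dataset: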

hiphop <- read_csv("https://www.dropbox.com/s/5d8fwxrj3jtua1z/hiphop.csv?dl=1")

1 answer:

Answer 0: (score: 1)

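Based on the Dropbox input data, some of the steps are already done (the code below skips the recoding of sex, for instance). We can use select_helpers to make some of the steps more compact: if there is a contiguous range of columns to select, use : to select it; similarly, in pivot_longer we can specify the columns that should not be selected with -. When using pivot_wider, make sure to name the arguments (names_from, values_from). The function has other arguments as well, and unnamed arguments are matched by position, in the order they appear in the function signature, which is not necessarily to these two.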

library(dplyr)
library(tidyr)
hiphop %>%
    group_by(sex) %>%
    summarise_at(vars(intl:unclassifiable), mean) %>%
    pivot_longer(cols = -sex) %>%
    pivot_wider(names_from = sex, values_from = value) %>%
    mutate(genredifference = abs(Female - Male)) %>%
    arrange(genredifference) %>%
    top_n(3)
# A tibble: 3 x 4
#  name    Female  Male genredifference
#  <chr>    <dbl> <dbl>           <dbl>
#1 country  0.786 0.392           0.394
#2 vocal    0.880 1.57            0.688
#3 rock     1.93  3.06            1.13
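
To see the spread() to pivot_wider() correspondence in isolation, here is a minimal sketch on a small made-up table. The tibble toy and its values are hypothetical and only stand in for the summarised hiphop data; names_to = "genre" is carried over from the question so that the column is called genre rather than the default name.

```r
library(dplyr)
library(tidyr)

# Hypothetical stand-in for the per-sex genre means (values are invented)
toy <- tibble(
  sex     = c("Female", "Male"),
  rock    = c(2, 3),
  country = c(1, 0.5)
)

# Long format: one row per (sex, genre) pair
long <- toy %>%
  pivot_longer(cols = -sex, names_to = "genre", values_to = "value")

# Older interface: key and value columns passed by position
wide_old <- long %>%
  spread(sex, value)

# Newer interface: the same two columns, passed by name
wide_new <- long %>%
  pivot_wider(names_from = sex, values_from = value) %>%
  mutate(genredifference = abs(Female - Male))

print(wide_new)
```

Both calls should yield one row per genre with a Female and a Male column, although the row order may differ between them. Passing the columns to pivot_wider() without names, as in pivot_wider(sex, value), would not bind them to names_from and values_from, which is why the answer names both arguments.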