Question

I'm having trouble using the gather function in R. Here is a sample data frame:

library(dplyr)
library(tidyr)
DF = data.frame(Region = c("Asia", "Asia", "Asia", "Europe", "Europe"),
                `Indicator Name` = c("Population", "GDP", "GNI", "Population", "GDP"),
                `2004` = c(22, 33,44,55,56),
                `2005` =c(223, 44,555,66,64))

  Region Indicator.Name X2004 X2005
1   Asia     Population    22   223
2   Asia            GDP    33    44
3   Asia            GNI    44   555
4 Europe     Population    55    66
5 Europe            GDP    56    64

This is the data frame I want:


DF2 = data.frame(Region = c("Asia", "Asia", "Europe", "Europe"),
                 Year =  c("X2004", "X2005"),
                 population = c(22, 223, 55, 66),
                 GDP = c(33, 44, 56,64))

  Region  Year population GDP
1   Asia X2004         22  33
2   Asia X2005        223  44
3 Europe X2004         55  56
4 Europe X2005         66  64

I want to do this with the gather function from tidyr, but I'm not sure how. This is what I tried:

gather(DF, key= DF$Indicator.Name, values = "values")

Answer 1

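A single gather call is not enough here. You first need to make the data frame longer (one row per region, indicator, and year), then make it wider so that each indicator becomes its own column.
Here is a solution using the newer pivot_longer and pivot_wider functions.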

library(dplyr)
library(tidyr)

DF = data.frame(Region = c("Asia", "Asia", "Asia", "Europe", "Europe"),
                `Indicator Name` = c("Population", "GDP", "GNI", "Population", "GDP"),
                `2004` = c(22, 33,44,55,56),
                `2005` =c(223, 44,555,66,64))



DF %>% pivot_longer(cols = starts_with("x")) %>% 
       pivot_wider(names_from = Indicator.Name, values_from = value) 

# A tibble: 4 x 5
  Region name  Population   GDP   GNI
  <fct>  <chr>      <dbl> <dbl> <dbl>
1 Asia   X2004         22    33    44
2 Asia   X2005        223    44   555
3 Europe X2004         55    56    NA
4 Europe X2005         66    64    NA
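
If you want the result to match the layout asked for in the question more closely (a Year column, and only Population and GDP), something along these lines should work. This is a minimal sketch, assuming tidyr 1.0.0 or later; the names `result` and `value` are just illustrative choices, and dropping GNI with select is my reading of the desired output rather than something the question states explicitly.

```r
library(dplyr)
library(tidyr)

# Same data as in the question; data.frame() turns the non-syntactic
# names into Indicator.Name, X2004 and X2005.
DF <- data.frame(Region = c("Asia", "Asia", "Asia", "Europe", "Europe"),
                 `Indicator Name` = c("Population", "GDP", "GNI", "Population", "GDP"),
                 `2004` = c(22, 33, 44, 55, 56),
                 `2005` = c(223, 44, 555, 66, 64))

result <- DF %>%
  # Step 1: stack the year columns into a Year/value pair.
  pivot_longer(cols = c(X2004, X2005), names_to = "Year", values_to = "value") %>%
  # Step 2: spread each indicator into its own column.
  pivot_wider(names_from = Indicator.Name, values_from = value) %>%
  # Step 3: keep only the columns shown in the desired output.
  select(Region, Year, Population, GDP)

result
```

Using names_to = "Year" avoids the default column name `name` seen in the output above.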

Answer 2

Using gather and spread, you would have:

DF %>% 
  gather(-Indicator.Name, -Region, key= "Year", value = "value") %>%
  spread(Indicator.Name, value)
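
To see why two steps are needed, it can help to look at the intermediate long table that gather produces before spread widens it again. The following is a minimal sketch, assuming the DF from the question; the names `long` and `wide` are only for illustration. Note that gather and spread are marked as superseded in tidyr 1.0.0 and later, although they still work.

```r
library(dplyr)
library(tidyr)

DF <- data.frame(Region = c("Asia", "Asia", "Asia", "Europe", "Europe"),
                 `Indicator Name` = c("Population", "GDP", "GNI", "Population", "GDP"),
                 `2004` = c(22, 33, 44, 55, 56),
                 `2005` = c(223, 44, 555, 66, 64))

# Step 1: gather every column except Region and Indicator.Name.
# This yields one row per Region / Indicator.Name / Year combination.
long <- DF %>%
  gather(key = "Year", value = "value", -Region, -Indicator.Name)

# Step 2: spread the indicators back out into separate columns.
# Combinations missing from the data (GNI for Europe) become NA.
wide <- long %>%
  spread(key = Indicator.Name, value = value)

long
wide
```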
