Trouble with tidyr gather()

Asked: 2018-08-30 23:18:58

Tags: r tidyr

Hi, I am trying to create a long data frame from Re_NM_df:

head(Re_NM_df, n = 3)
AustinPulse.Remain AustinPulse.NM1_SpacePump AustinPulse.NM4_Nothing
1                 NA                         0                       0
2                 NA                         0                       0
3                 NA                         0                       0

However, I get this error:

RE_NM_Long <- gather(data = Re_NM_df, "NurseMom", "Likely", AustinPulse$Remain, 
AustinPulse$NM1_SpacePump)
Error: `AustinPulse$Remain` must evaluate to column positions or names, not an 
integer vector
In addition: Warning message:
'glue::collapse' is deprecated.
Use 'glue_collapse' instead.
See help("Deprecated") and help("glue-deprecated")

This is my expected output:

     NursingMom                     Remain                    Available
1    NM1_Space                         1                       1
2    NM1_Space                         4                       0
3    NM4_Nothing                       2                       1*

*The respondent selected "Nothing".

I can't seem to write the gather function correctly. Any help is appreciated.

Thanks.

1 answer:

Answer 0: (score: 0)

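Thanks for your question. It would be great to see an example of what you want the final data frame to look like. To help, I have created a reproducible example below that uses tidyr::gather to convert your sample data into a "long" data frame.

The key thing to remember when using gather is to name the two columns that will hold the "key" and "value" pairs. After those two names, you can pass further arguments that select the columns to gather, or exclude columns with a leading minus sign. These must be bare column names from the data frame itself, such as AustinPulse.Remain, rather than an expression like AustinPulse$Remain.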
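Applied to the call from the question, that advice might look like the sketch below. It assumes the columns really are named as shown in the head() output, and it rebuilds Re_NM_df from those three rows as a stand-in for the real data. The likely cause of the error is that AustinPulse$Remain is evaluated as a vector of values from a separate object, whereas gather expects column names or positions.

```r
library(tidyr)

# Stand-in for the question's Re_NM_df, copied from its head() output
# (assumption: the real data frame has the same column names)
Re_NM_df <- data.frame(
  AustinPulse.Remain        = c(NA, NA, NA),
  AustinPulse.NM1_SpacePump = c(0, 0, 0),
  AustinPulse.NM4_Nothing   = c(0, 0, 0)
)

# Pass bare column names instead of AustinPulse$Remain
RE_NM_Long <- gather(
  data  = Re_NM_df,
  key   = "NurseMom",
  value = "Likely",
  AustinPulse.Remain, AustinPulse.NM1_SpacePump
)

RE_NM_Long
```

The two named columns are stacked into NurseMom and Likely, and AustinPulse.NM4_Nothing is left as its own column, repeated for each gathered row.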

library(tibble)
library(dplyr)
#> 
#> Attaching package: 'dplyr'
#> The following objects are masked from 'package:stats':
#> 
#>     filter, lag
#> The following objects are masked from 'package:base':
#> 
#>     intersect, setdiff, setequal, union
library(tidyr)

x <- tribble(
  ~AustinPulse.Remain, ~AustinPulse.NM1_SpacePump, ~AustinPulse.NM4_Nothing,
                 NA,                         0,                       0,
                 NA,                         0,                       0,
                 NA,                         0,                       0
)

# gathering all columns
x %>% 
  tidyr::gather(attribute, value)
#> # A tibble: 9 x 2
#>   attribute                 value
#>   <chr>                     <dbl>
#> 1 AustinPulse.Remain          NA 
#> 2 AustinPulse.Remain          NA 
#> 3 AustinPulse.Remain          NA 
#> 4 AustinPulse.NM1_SpacePump    0.
#> 5 AustinPulse.NM1_SpacePump    0.
#> 6 AustinPulse.NM1_SpacePump    0.
#> 7 AustinPulse.NM4_Nothing      0.
#> 8 AustinPulse.NM4_Nothing      0.
#> 9 AustinPulse.NM4_Nothing      0.

# here I have excluded AustinPulse.Remain from the gather
x %>% 
  tidyr::gather(NursingMom, value, -AustinPulse.Remain) %>% 
  dplyr::select(NursingMom, value, AustinPulse.Remain)
#> # A tibble: 6 x 3
#>   NursingMom                value AustinPulse.Remain
#>   <chr>                     <dbl> <lgl>             
#> 1 AustinPulse.NM1_SpacePump    0. NA                
#> 2 AustinPulse.NM1_SpacePump    0. NA                
#> 3 AustinPulse.NM1_SpacePump    0. NA                
#> 4 AustinPulse.NM4_Nothing      0. NA                
#> 5 AustinPulse.NM4_Nothing      0. NA                
#> 6 AustinPulse.NM4_Nothing      0. NA
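
To get closer to the layout in the question's expected output, the column names can be tidied after gathering. This is only a sketch under assumptions: it guesses that the "Available" column in the expected output corresponds to the gathered values, and that the "AustinPulse." prefix should be dropped from the labels. The name long_df is made up for illustration.

```r
library(tidyr)
library(dplyr)

# Same sample data as above, rebuilt so this block runs on its own
x <- data.frame(
  AustinPulse.Remain        = c(NA, NA, NA),
  AustinPulse.NM1_SpacePump = c(0, 0, 0),
  AustinPulse.NM4_Nothing   = c(0, 0, 0)
)

long_df <- x %>%
  # gather every column except AustinPulse.Remain
  gather(key = "NursingMom", value = "Available", -AustinPulse.Remain) %>%
  # strip the "AustinPulse." prefix from the key labels
  mutate(NursingMom = sub("^AustinPulse\\.", "", NursingMom)) %>%
  # reorder the columns and rename AustinPulse.Remain to Remain
  select(NursingMom, Remain = AustinPulse.Remain, Available)

long_df
```

If the mapping between your columns and the expected output is different, swap the names passed to key, value, and select accordingly.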