pivot_wider / spread without values_from, or with values of just 1?

Time: 2019-11-11 23:14:42

Tags: r dplyr tidyr

I would like to take a feature and spread its values across columns as 1/0, i.e. as yes/no indicators.

mtcars %>% 
  pivot_wider(names_from = cyl,
              values_from = 1)

This seems to have done part of the job: cyl is now spread across columns, except that the values are things like 21, 21.4 or NA.

> mtcars %>% 
+   pivot_wider(names_from = cyl,
+               values_from = 1)
# A tibble: 32 x 12
    disp    hp  drat    wt  qsec    vs    am  gear  carb   `6`   `4`   `8`
   <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
 1  160    110  3.9   2.62  16.5     0     1     4     4  21    NA    NA  
 2  160    110  3.9   2.88  17.0     0     1     4     4  21    NA    NA  
 3  108     93  3.85  2.32  18.6     1     1     4     1  NA    22.8  NA  
 4  258    110  3.08  3.22  19.4     1     0     3     1  21.4  NA    NA  
 5  360    175  3.15  3.44  17.0     0     0     3     2  NA    NA    18.7
 6  225    105  2.76  3.46  20.2     1     0     3     1  18.1  NA    NA  
 7  360    245  3.21  3.57  15.8     0     0     3     4  NA    NA    14.3
 8  147.    62  3.69  3.19  20       1     0     4     2  NA    24.4  NA  
 9  141.    95  3.92  3.15  22.9     1     0     4     2  NA    22.8  NA  
10  168.   123  3.92  3.44  18.3     1     0     4     4  19.2  NA    NA 

I tried using values_fill like this:

> mtcars %>% 
+   pivot_wider(names_from = cyl,
+               values_from = 1,
+               values_fill = list(1 = 0))
Error: unexpected '=' in:
"              values_from = 1,
              values_fill = list(1 ="

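How can I spread the cylinders across columns holding binary 1 or 0 values, depending on whether the cylinder count is 4, 6 or 8?

Is pivot_wider() what I want here?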

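2 Answers:

Answer 0: (score: 1)

In the question, values_from = 1 selects the first column, mpg, by position, which is why mpg values show up in the new columns. Set mpg to 1 and set the fill for mpg to 0, like this: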

mtcars %>%
  mutate(mpg = 1) %>%
  pivot_wider(names_from = cyl, values_from = mpg, values_fill = list(mpg = 0))
## # A tibble: 32 x 12
##     disp    hp  drat    wt  qsec    vs    am  gear  carb   `6`   `4`   `8`
##    <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
##  1  160    110  3.9   2.62  16.5     0     1     4     4     1     0     0
##  2  160    110  3.9   2.88  17.0     0     1     4     4     1     0     0
##  3  108     93  3.85  2.32  18.6     1     1     4     1     0     1     0
## ... etc ...
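
As for the error in the question: it comes from R's parser, not from tidyr. A bare number cannot be used as an argument name, so list(1 = 0) fails before pivot_wider is ever called. The short sketch below only illustrates that point. As far as I can tell, a named-list values_fill is matched against the name of the value column, so the backticked form would parse but still would not address mpg.

```r
# A bare number is not a valid argument name, so this fails at parse time.
parsed <- try(parse(text = "list(1 = 0)"), silent = TRUE)
parse_failed <- inherits(parsed, "try-error")

# Backticks make the name syntactically acceptable...
fill_backtick <- list(`1` = 0)

# ...but the fill list should be named after the value column instead.
fill_by_name <- list(mpg = 0)

names(fill_backtick)
names(fill_by_name)
```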

Or, given that pivot_wider currently has a problem with ordering the new columns, you may prefer the older spread:

mtcars %>%
  mutate(mpg = 1) %>%
  spread(cyl, mpg, fill = 0)
##     disp  hp drat    wt  qsec vs am gear carb 4 6 8
## 1   71.1  65 4.22 1.835 19.90  1  1    4    1 1 0 0
## 2   75.7  52 4.93 1.615 18.52  1  1    4    2 1 0 0
## 3   78.7  66 4.08 2.200 19.47  1  1    4    1 1 0 0
## ... etc ...
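
On the column-ordering point, later tidyr releases (1.1.0 and newer, to the best of my knowledge) added a names_sort argument to pivot_wider and allow values_fill to be a single scalar. The following is a minimal sketch assuming such a version is installed. The helper column flag and the prefix "cyl_" are names made up for this illustration. Using a helper column instead of overwriting mpg keeps the original mpg values in the result.

```r
library(dplyr)
library(tidyr)

wide <- mtcars %>%
  mutate(flag = 1) %>%                 # hypothetical helper column holding the 1s
  pivot_wider(
    names_from   = cyl,
    values_from  = flag,
    values_fill  = 0,                  # scalar fill: assumes tidyr >= 1.1.0
    names_prefix = "cyl_",             # yields syntactic column names
    names_sort   = TRUE                # sorts the new columns: assumes tidyr >= 1.1.0
  )

wide
```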

Or specify values_fn like this:

mtcars %>%
  pivot_wider(names_from = cyl, values_from = mpg, 
    values_fn = list(mpg = ~ 1), values_fill = list(mpg = 0))
## # A tibble: 32 x 12
##     disp    hp  drat    wt  qsec    vs    am  gear  carb   `6`   `4`   `8`
##    <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
##  1  160    110  3.9   2.62  16.5     0     1     4     4     1     0     0
##  2  160    110  3.9   2.88  17.0     0     1     4     4     1     0     0
##  3  108     93  3.85  2.32  18.6     1     1     4     1     0     1     0
## ...etc...

Answer 1: (score: 1)

One option is to take both the names and the values from cyl, then recode based on is.na:

mtcars %>% 
  pivot_wider(names_from = cyl,
              values_from = cyl) %>% 
  mutate_at(vars(!!!syms(as.character(unique(mtcars$cyl)))), ~if_else(is.na(.), 0, 1))

# A tibble: 32 x 13
#     mpg  disp    hp  drat    wt  qsec    vs    am  gear  carb   `6`   `4`   `8`
#   <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
# 1  21    160    110  3.9   2.62  16.5     0     1     4     4     1     0     0
# 2  21    160    110  3.9   2.88  17.0     0     1     4     4     1     0     0
# 3  22.8  108     93  3.85  2.32  18.6     1     1     4     1     0     1     0
# 4  21.4  258    110  3.08  3.22  19.4     1     0     3     1     1     0     0
# 5  18.7  360    175  3.15  3.44  17.0     0     0     3     2     0     0     1
# 6  18.1  225    105  2.76  3.46  20.2     1     0     3     1     1     0     0
# 7  14.3  360    245  3.21  3.57  15.8     0     0     3     4     0     0     1
# 8  24.4  147.    62  3.69  3.19  20       1     0     4     2     0     1     0
# 9  22.8  141.    95  3.92  3.15  22.9     1     0     4     2     0     1     0
#10  19.2  168.   123  3.92  3.44  18.3     1     0     4     4     1     0     0
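
Since the goal is simply a set of 0/1 indicator columns, pivoting is not strictly required. As an alternative that is not taken from either answer, here is a minimal base R sketch using model.matrix. The "cyl_" prefix is an arbitrary choice for this example.

```r
# Base R only; no pivoting involved.
# One indicator column per distinct cyl value. The "- 1" drops the intercept
# so that every level gets its own column.
dummies <- model.matrix(~ factor(cyl) - 1, data = mtcars)

# The default names look like "factor(cyl)4"; rename them with an arbitrary prefix.
colnames(dummies) <- paste0("cyl_", levels(factor(mtcars$cyl)))

# Keeps every original column, including mpg and cyl.
result <- cbind(mtcars, dummies)
head(result[, c("cyl", "cyl_4", "cyl_6", "cyl_8")])
```

Unlike the pivot_wider versions, this keeps both mpg and cyl as well as the row names of mtcars.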