Handling NAs after running pivot_wider

Time: 2020-02-09 14:10:24

Tags: r

I have a long-format data frame that I want to widen using pivot_wider:

library(tidyr)
example_data <- data.frame(
    name = c("bob", "bob", "dick", "dick", "harry", "harry"), 
    sport = c("baseball", "football", "hockey", "basketball", "football", "basketball")
)
pivot_wider(example_data, names_from = sport, values_from = sport)

This gives the expected result, but with a lot of NAs:

  name  baseball football hockey basketball
1 bob   baseball football NA     NA        
2 dick  NA       NA       hockey basketball
3 harry NA       football NA     basketball

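I would like to convert the sport names to TRUE (since the sport is already indicated by the column name) and the NAs to FALSE, producing a data frame like this: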

   name baseball football hockey basketball
1   bob     TRUE     TRUE  FALSE      FALSE
2  dick    FALSE    FALSE   TRUE       TRUE
3 harry    FALSE     TRUE  FALSE       TRUE

I thought this code would do the trick, but it throws an error:

pivot_wider(
    example_data, 
    names_from = sport, 
    values_from = sport,
    values_fill = list(sport = FALSE),
    values_fn = list(sport = !is.na)
)
Error in !is.na : invalid argument type

The code below gives me the opposite of what I am looking for, which I could then convert into the desired data frame:

pivot_wider(
    example_data, 
    names_from = sport, 
    values_from = sport,
    values_fill = list(sport = TRUE),
    values_fn = list(sport = is.na)
)

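Is there a way to get the desired data frame directly? Also, is there a tutorial on how to use the values_fn argument, so that I can work out why values_fn = list(sport = !is.na) doesn't work? Thanks.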

1 Answer:

Answer 0 (score: 1)

One approach is to create a dummy column of TRUE values and then use pivot_wider:

library(dplyr)
library(tidyr)

example_data %>%
  mutate(val = TRUE) %>%
  pivot_wider(
    names_from = sport,
    values_from = val,
    values_fill = list(val = FALSE)
  )


# A tibble: 3 x 5
#  name  baseball football hockey basketball
#  <fct> <lgl>    <lgl>    <lgl>  <lgl>     
#1 bob   TRUE     TRUE     FALSE  FALSE     
#2 dick  FALSE    FALSE    TRUE   TRUE      
#3 harry FALSE    TRUE     FALSE  TRUE
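
As for why values_fn = list(sport = !is.na) fails: the expression !is.na tries to negate the function object is.na itself rather than its result, which is what produces "invalid argument type". A minimal sketch of one possible workaround is to wrap the negation in a function so that it only runs once values are passed in. The helper name not_na is hypothetical, and the sketch assumes a tidyr version in which values_fn is applied to every cell, as the question's is.na attempt suggests.

```r
library(tidyr)

example_data <- data.frame(
    name = c("bob", "bob", "dick", "dick", "harry", "harry"),
    sport = c("baseball", "football", "hockey", "basketball", "football", "basketball")
)

# Hypothetical helper: defers the negation until x is supplied.
# Writing `!is.na` directly applies `!` to the function object, which errors.
not_na <- function(x) !is.na(x)

wide <- pivot_wider(
    example_data,
    names_from = sport,
    values_from = sport,
    values_fn = list(sport = not_na),   # present sports become TRUE
    values_fill = list(sport = FALSE)   # missing combinations become FALSE
)

print(wide)
```

Compared with the dummy-column approach above, this keeps everything inside a single pivot_wider call, at the cost of relying on how values_fn and values_fill interact in the installed tidyr version.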