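R: filling missing values with NA when building a data frame

Date: 2018-06-26 14:02:09

Tags: r list dataframe na

I am currently trying to create a data frame from the following lists: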

location <- list("USA","Singapore","UK")
organization <- list("Microsoft","University of London","Boeing","Apple")
person <- list()
date <- list("1989","2001","2018")
Jobs <- list("CEO","Chairman","VP of sales","General Manager","Director")

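When I try to create the data frame, I get the (obvious) error that the lists are of unequal length. I would like to find a way either to make the lists the same length or to fill the missing data frame entries with "NA". After searching around, I have not been able to find a solution.

2 answers:

Answer 0: (score: 1)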

Here is a tidyverse solution (purrr is part of the tidyverse) and a base R solution, assuming you just want to fill the remaining values in each list with NA. I take the maximum length of any list as len, then for each list do rep(NA) for the difference between that list's length and the maximum length of any list.

library(tidyverse)

location <- list("USA","Singapore","UK")
organization <- list("Microsoft","University of London","Boeing","Apple")
person <- list()
date <- list("1989","2001","2018")
Jobs <- list("CEO","Chairman","VP of sales","General Manager","Director")

all_lists <- list(location, organization, person, date, Jobs)
len <- max(lengths(all_lists))

With purrr::map_dfc, you can map over the list of lists, tack on NAs as needed, convert each to a character vector, then get a data frame of all those vectors cbinded together, all in one piped call:

map_dfc(all_lists, function(l) {
  c(l, rep(NA, len - length(l))) %>%
    as.character()
})
#> # A tibble: 5 x 5
#>   V1        V2                   V3    V4    V5
#>   <chr>     <chr>                <chr> <chr> <chr>
#> 1 USA       Microsoft            NA    1989  CEO
#> 2 Singapore University of London NA    2001  Chairman
#> 3 UK        Boeing               NA    2018  VP of sales
#> 4 NA        Apple                NA    NA    General Manager
#> 5 NA        NA                   NA    NA    Director

In base R, you can lapply the same function across the list of lists, then use Reduce with cbind on the resulting list and convert it to a data frame. It takes two steps instead of purrr's one.
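The base R code for this step did not survive on this page, so the following is only a sketch of the two steps just described, not the answerer's original code. The names padded and result are made up for illustration. It also flattens each list with unlist before padding, a small departure from "the same function", so that the padding is unambiguously NA_character_.

```r
location <- list("USA", "Singapore", "UK")
organization <- list("Microsoft", "University of London", "Boeing", "Apple")
person <- list()
date <- list("1989", "2001", "2018")
Jobs <- list("CEO", "Chairman", "VP of sales", "General Manager", "Director")

all_lists <- list(location, organization, person, date, Jobs)
len <- max(lengths(all_lists))

# Step 1: flatten each list to a character vector and pad it with NA up to len
# (padded is a hypothetical name, not from the original answer)
padded <- lapply(all_lists, function(l) {
  v <- as.character(unlist(l))
  c(v, rep(NA_character_, len - length(v)))
})

# Step 2: bind the padded vectors column-wise, then convert to a data frame
result <- as.data.frame(Reduce(cbind, padded), stringsAsFactors = FALSE)
```

The column names are left at whatever cbind and as.data.frame produce here; they are meant to be replaced afterwards.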

With either approach, you can now set the names however you like.

Answer 1: (score: 0)

You can do this:
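Setting the names could look like the sketch below. It uses a small stand-in data frame rather than the full result, and the column names are an assumption taken from the variable names in the question.

```r
# Stand-in for the padded result, with placeholder column names
result <- data.frame(
  V1 = c("USA", "Singapore", "UK", NA),
  V2 = c("Microsoft", "University of London", "Boeing", "Apple"),
  stringsAsFactors = FALSE
)

# Assumed names, borrowed from the variable names in the question
names(result) <- c("location", "organization")
```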

data.frame(sapply(dyem_list, "length<-", max(lengths(dyem_list))))

   location         organization person date            Jobs
1       USA            Microsoft   NULL 1989             CEO
2 Singapore University of London   NULL 2001        Chairman
3        UK               Boeing   NULL 2018     VP of sales
4      NULL                Apple   NULL NULL General Manager
5      NULL                 NULL   NULL NULL        Director

where dyem_list is as follows:

dyem_list <- list(
  location = list("USA","Singapore","UK"),
  organization = list("Microsoft","University of London","Boeing","Apple"),
  person = list(),
  date = list("1989","2001","2018"),
  Jobs = list("CEO","Chairman","VP of sales","General Manager","Director")
)
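Note that the output above shows NULL rather than NA: "length<-" applied to a list pads it with NULL elements, so the columns stay lists. If NA entries in plain character columns are wanted, one possible variation (a sketch, not part of the original answer; the names n, padded, and df are made up here) is to flatten each list to a character vector before extending its length:

```r
dyem_list <- list(
  location = list("USA", "Singapore", "UK"),
  organization = list("Microsoft", "University of London", "Boeing", "Apple"),
  person = list(),
  date = list("1989", "2001", "2018"),
  Jobs = list("CEO", "Chairman", "VP of sales", "General Manager", "Director")
)

n <- max(lengths(dyem_list))

# Flatten each list to a character vector, then extend it to length n;
# extending an atomic vector fills the new positions with NA
padded <- lapply(dyem_list, function(x) {
  v <- as.character(unlist(x))
  length(v) <- n
  v
})

# The list names carry over as column names
df <- data.frame(padded, stringsAsFactors = FALSE)
```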