Arranging a list column of dates with purrr

Date: 2016-08-28 20:26:46

Tags: r purrr

I'm trying to arrange a list column using purrr. But just creating a toy example has left me completely confused:

library(tibble)  # tibble()
library(purrr)   # map()
library(dplyr)   # mutate(), arrange(), %>%

s <- tibble(b = as.integer(runif(
  n = 10, min = 0, max = 20
)))
s$e <-
  map(s$b,  ~ sample(seq(
    as.Date('1990/01/01'), as.Date('2010/01/01'), by = "day"
  ), size = .))

I thought I could do it like this:

s2 <- s %>% map('b') %>% 
  mutate(e = map(~ sample(seq(as.Date('1990/01/01'),
                              as.Date('2010/01/01'), by = "day"),
                          size = .)))

However, this doesn't work. What am I missing here?

Now I want to sort the dates in each list element in ascending order and extract the first and last date. How can I do this the purrr way?

I've tried different variations of
s %>% map('e') %>% map_df(~arrange(.))

But obviously I'm missing something here...

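My desired output is a new list column in the data frame s, in which the unsorted dates from the list column s$e appear in ascending order as the new list column s$new_arranged_dates.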

> s
# A tibble: 10 × 3
       b           e       new_arranged_dates    
   <int>      <list>            <list>    
1     15 <date [15]>           <date [15]>
2      0  <date [0]>           <date [0]>
3      7  <date [7]>             etc
4      6  <date [6]>
5      3  <date [3]>
6     14 <date [14]>
7     15 <date [15]>
8     13 <date [13]>
9     13 <date [13]>
10    11 <date [11]>

Edit 290817:

s2 <- s %>% 
  mutate(e = map(b,~ sample(seq(as.Date('1990/01/01'),
                              as.Date('2010/01/01'), by = "day"),
                          size = .))) %>% mutate(new_arranged_dates =map(e,~.[order(.)]))

gets me what I want. However, I don't understand why

s2 <- s %>% 
  mutate(e = map(b,~ sample(seq(as.Date('1990/01/01'),
                              as.Date('2010/01/01'), by = "day"),
                          size = .))) %>% mutate(new_arranged_dates=map(e,~arrange(.)))

results in

Error in mutate_(.data, .dots = lazyeval::lazy_dots(...)) : 
  argument ".data" is missing, with no default

2 answers:

Answer 0 (score: 1)

This is an old question by now, but all you need is sort:

s <- s %>% mutate(new_arranged_dates = map(e, sort))

str(s)

## Classes ‘tbl_df’, ‘tbl’ and 'data.frame':    10 obs. of  3 variables:
##  $ b                 : int  5 16 3 14 16 5 14 1 1 5
##  $ e                 :List of 10
##   ..$ : Date, format: "1991-09-28" "2006-09-12" "1993-03-04" ...
##   ..$ : Date, format: "2000-04-30" "2002-05-16" "1991-10-01" ...
##   ..$ : Date, format: "1998-04-20" "2006-12-16" "2000-10-15"
##   ..$ : Date, format: "2000-02-14" "1993-01-20" "1998-03-26" ...
##   ..$ : Date, format: "1992-07-06" "1995-08-18" "2005-01-24" ...
##   ..$ : Date, format: "1996-05-01" "1993-03-01" "2001-10-11" ...
##   ..$ : Date, format: "2006-04-24" "2008-03-26" "2007-12-08" ...
##   ..$ : Date, format: "2007-04-15"
##   ..$ : Date, format: "1998-07-16"
##   ..$ : Date, format: "2004-04-25" "1994-12-01" "1998-12-21" ...
##  $ new_arranged_dates:List of 10
##   ..$ : Date, format: "1991-09-28" "1993-03-04" "2005-02-15" ...
##   ..$ : Date, format: "1990-08-19" "1991-10-01" "1992-12-15" ...
##   ..$ : Date, format: "1998-04-20" "2000-10-15" "2006-12-16"
##   ..$ : Date, format: "1990-01-21" "1990-12-29" "1992-06-09" ...
##   ..$ : Date, format: "1992-02-12" "1992-07-06" "1993-04-30" ...
##   ..$ : Date, format: "1991-07-30" "1993-03-01" "1996-05-01" ...
##   ..$ : Date, format: "1990-12-05" "1993-08-23" "1994-12-09" ...
##   ..$ : Date, format: "2007-04-15"
##   ..$ : Date, format: "1998-07-16"
##   ..$ : Date, format: "1994-12-01" "1998-12-21" "2004-04-25" ...
##  - attr(*, "vars")= chr 

To extract the earliest and latest dates, map min and max:

s %>% mutate(earliest = map(e, min), 
             latest = map(e, max)) %>% 
    unnest(earliest, latest, .drop = FALSE)

## # A tibble: 10 × 5
##        b           e new_arranged_dates   earliest     latest
##    <int>      <list>             <list>     <date>     <date>
## 1      5  <date [5]>         <date [5]> 1991-09-28 2007-07-19
## 2     16 <date [16]>        <date [16]> 1990-08-19 2007-10-08
## 3      3  <date [3]>         <date [3]> 1998-04-20 2006-12-16
## 4     14 <date [14]>        <date [14]> 1990-01-21 2006-06-11
## 5     16 <date [16]>        <date [16]> 1992-02-12 2008-12-18
## 6      5  <date [5]>         <date [5]> 1991-07-30 2007-10-23
## 7     14 <date [14]>        <date [14]> 1990-12-05 2009-04-11
## 8      1  <date [1]>         <date [1]> 2007-04-15 2007-04-15
## 9      1  <date [1]>         <date [1]> 1998-07-16 1998-07-16
## 10     5  <date [5]>         <date [5]> 1994-12-01 2008-01-10

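There is no map_date variant that would simplify to Date automatically, so you have to use unnest to simplify. .drop = FALSE specifies that the other list columns are kept.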
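
As a possible alternative that avoids unnest, the single dates can be combined with do.call(c, ...), which keeps the Date class. This is only a sketch: the fixed sizes in b, the seed, and the helper vector days are assumptions made for illustration and are not part of the original question. Sizes are kept at 1 or more, since an empty element would have no first or last date to pick.

```r
library(dplyr)
library(purrr)
library(tibble)

set.seed(42)  # assumption: fixed seed so the toy data is reproducible

# Hypothetical helper: all candidate days, built once instead of per row
days <- seq(as.Date("1990-01-01"), as.Date("2010-01-01"), by = "day")

s <- tibble(b = c(3L, 1L, 5L)) %>%
  mutate(
    # one random sample of dates per row, stored as a list column
    e = map(b, ~ sample(days, size = .x)),
    # sort() works directly on each Date vector
    new_arranged_dates = map(e, sort),
    # first and last element of each sorted vector, combined into Date columns
    earliest = do.call(c, map(new_arranged_dates, ~ .x[1])),
    latest   = do.call(c, map(new_arranged_dates, ~ .x[length(.x)]))
  )

print(s)
```

Because the vectors are already sorted, indexing the first and last element gives the same dates as min and max would on the unsorted column e.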

Answer 1 (score: 0)

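So the basic mistake here is that arrange expects a data frame and will not order plain vectors. Coercing each list element to a data_frame inside the loop solves the problem, but it took me a while to figure out that the column of the resulting coerced data_frame is also named `.`.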

This works:

  library(dplyr)
  library(purrr)  # provides map()
  s <- tibble(b = as.integer(runif(
       n = 10, min = 0, max = 20
       )))
  s <-
  s %>% mutate(e = map(b,  ~ sample(seq(
    as.Date('1990/01/01'), as.Date('2010/01/01'), by = "day"
  ), size = .)))

  # wrap each element in a one-column data frame (column named `.`), then arrange by it
  s <- s %>% mutate(arranged = map(e,  ~ arrange(data_frame(.), .)))
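
To make the difference concrete, here is a small sketch on a single vector of dates. The vector dates and the column name date are made up for illustration. It shows that arrange raises an error on a bare vector, that it works once the vector is wrapped in a one-column tibble, and that sort gives the same order without the wrapper.

```r
library(dplyr)
library(tibble)

# Hypothetical example vector, not taken from the question's data
dates <- as.Date(c("2001-05-01", "1995-03-12", "2008-11-30"))

# arrange() is a data frame verb; on a bare vector it raises an error,
# which is caught here so the script keeps running
res <- tryCatch(arrange(dates), error = function(e) "error")

# Wrapping the vector in a one-column tibble gives arrange() something to order
sorted_df <- arrange(tibble(date = dates), date)

# sort() operates on the vector directly
sorted_vec <- sort(dates)
```

The data frame version returns a tibble per list element, while sort keeps each element a plain Date vector, which is usually easier to work with afterwards.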

Tip: writing a new function that contains a browser() statement and calling it from map helped me a lot, and may help others too.