I am trying to use purrr to arrange a list column. But just creating a toy example has left me thoroughly confused:
s <- tibble(b = as.integer(runif(
n = 10, min = 0, max = 20
)))
s$e <-
map(s$b, ~ sample(seq(
as.Date('1990/01/01'), as.Date('2010/01/01'), by = "day"
), size = .))
I thought I could do it like this:
s2 <- s %>% map('b') %>%
mutate(e = map(~ sample(seq(as.Date('1990/01/01'),
as.Date('2010/01/01'), by = "day"),
size = .)))
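However, this does not work. What am I missing here?
Now I want to arrange the dates in each list element in ascending order and extract the first and the last date. How can I do this the purrr way?
I tried different variations of s %>% map('e') %>% map_df(~arrange(.))
but clearly I am missing something here...
My desired output is a new list column in the data frame s, in which the unordered dates from the list column s$e are arranged in ascending order in the new list column s$new_arranged_dates.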
> s
# A tibble: 10 × 3
b e new_arranged_dates
<int> <list> <list>
1 15 <date [15]> <date [15]>
2 0 <date [0]> <date [0]>
3 7 <date [7]> etc
4 6 <date [6]>
5 3 <date [3]>
6 14 <date [14]>
7 15 <date [15]>
8 13 <date [13]>
9 13 <date [13]>
10 11 <date [11]>
Edit 29/08/17:
s2 <- s %>%
mutate(e = map(b,~ sample(seq(as.Date('1990/01/01'),
as.Date('2010/01/01'), by = "day"),
size = .))) %>% mutate(new_arranged_dates =map(e,~.[order(.)]))
gets me what I want. However, I don't understand why
s2 <- s %>%
mutate(e = map(b,~ sample(seq(as.Date('1990/01/01'),
as.Date('2010/01/01'), by = "day"),
size = .))) %>% mutate(new_arranged_dates=map(e,~arrange(.)))
results in
Error in mutate_(.data, .dots = lazyeval::lazy_dots(...)) :
argument ".data" is missing, with no default
Answer 0 (score: 1)
This is an old question by now, but all you need is sort:
s <- s %>% mutate(new_arranged_dates = map(e, sort))
str(s)
## Classes ‘tbl_df’, ‘tbl’ and 'data.frame': 10 obs. of 3 variables:
## $ b : int 5 16 3 14 16 5 14 1 1 5
## $ e :List of 10
## ..$ : Date, format: "1991-09-28" "2006-09-12" "1993-03-04" ...
## ..$ : Date, format: "2000-04-30" "2002-05-16" "1991-10-01" ...
## ..$ : Date, format: "1998-04-20" "2006-12-16" "2000-10-15"
## ..$ : Date, format: "2000-02-14" "1993-01-20" "1998-03-26" ...
## ..$ : Date, format: "1992-07-06" "1995-08-18" "2005-01-24" ...
## ..$ : Date, format: "1996-05-01" "1993-03-01" "2001-10-11" ...
## ..$ : Date, format: "2006-04-24" "2008-03-26" "2007-12-08" ...
## ..$ : Date, format: "2007-04-15"
## ..$ : Date, format: "1998-07-16"
## ..$ : Date, format: "2004-04-25" "1994-12-01" "1998-12-21" ...
## $ new_arranged_dates:List of 10
## ..$ : Date, format: "1991-09-28" "1993-03-04" "2005-02-15" ...
## ..$ : Date, format: "1990-08-19" "1991-10-01" "1992-12-15" ...
## ..$ : Date, format: "1998-04-20" "2000-10-15" "2006-12-16"
## ..$ : Date, format: "1990-01-21" "1990-12-29" "1992-06-09" ...
## ..$ : Date, format: "1992-02-12" "1992-07-06" "1993-04-30" ...
## ..$ : Date, format: "1991-07-30" "1993-03-01" "1996-05-01" ...
## ..$ : Date, format: "1990-12-05" "1993-08-23" "1994-12-09" ...
## ..$ : Date, format: "2007-04-15"
## ..$ : Date, format: "1998-07-16"
## ..$ : Date, format: "1994-12-01" "1998-12-21" "2004-04-25" ...
## - attr(*, "vars")= chr
To extract the earliest and latest dates, map min and max:
s %>% mutate(earliest = map(e, min),
latest = map(e, max)) %>%
unnest(earliest, latest, .drop = FALSE)
## # A tibble: 10 × 5
## b e new_arranged_dates earliest latest
## <int> <list> <list> <date> <date>
## 1 5 <date [5]> <date [5]> 1991-09-28 2007-07-19
## 2 16 <date [16]> <date [16]> 1990-08-19 2007-10-08
## 3 3 <date [3]> <date [3]> 1998-04-20 2006-12-16
## 4 14 <date [14]> <date [14]> 1990-01-21 2006-06-11
## 5 16 <date [16]> <date [16]> 1992-02-12 2008-12-18
## 6 5 <date [5]> <date [5]> 1991-07-30 2007-10-23
## 7 14 <date [14]> <date [14]> 1990-12-05 2009-04-11
## 8 1 <date [1]> <date [1]> 2007-04-15 2007-04-15
## 9 1 <date [1]> <date [1]> 1998-07-16 1998-07-16
## 10 5 <date [5]> <date [5]> 1994-12-01 2008-01-10
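There is no map_date variant that simplifies to Date automatically, so you have to use unnest to simplify. .drop = FALSE specifies that the other list columns are kept.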
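The unnest call above uses the tidyr interface from the time of this answer. As far as I know, tidyr 1.0.0 and later expect the columns in a cols argument and keep the other list columns by default, so .drop is no longer needed. A minimal sketch under that assumption, using a small made-up tibble (toy and its dates are hypothetical, not the data shown above):

```r
library(dplyr)
library(purrr)
library(tidyr)

# Hypothetical toy data: a list column of unsorted dates
toy <- tibble(
  id = 1:2,
  e = list(
    as.Date(c("2001-05-03", "1995-01-20", "2008-11-30")),
    as.Date(c("1999-07-07", "1992-02-14"))
  )
)

res <- toy %>%
  mutate(
    new_arranged_dates = map(e, sort),  # sorted copy of each date vector
    earliest = map(e, min),             # list of length-1 Date vectors
    latest = map(e, max)
  ) %>%
  # Assumes tidyr >= 1.0.0: columns to unnest are named via `cols`
  unnest(cols = c(earliest, latest))
```

In this sketch earliest and latest end up as plain Date columns, while e and new_arranged_dates stay list columns.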
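Answer 1 (score: 0)
So the basic mistake here was that arrange expects a data frame and will not order a vector. Coercing each list element being looped over to a data_frame solved the problem, but it took me some time to figure out that the resulting coerced data_frame column is also named "." (a dot).
This works: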
library(dplyr)
library(purrr) # provides map()
s <- tibble(b = as.integer(runif(
n = 10, min = 0, max = 20
)))
s <-
s %>% mutate(e = map(b, ~ sample(seq(
as.Date('1990/01/01'), as.Date('2010/01/01'), by = "day"
), size = .)))
# wrap each date vector in a one-column data frame so that arrange() can sort it
s <- s %>% mutate(arranged = map(e, ~ arrange(data_frame(.), .)))
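The point about arrange and vectors can be checked on a single vector. A minimal sketch with made-up dates (d is hypothetical); it assumes a current dplyr, where arrange has no method for bare vectors, and it uses tibble() because data_frame() is deprecated in recent tibble releases:

```r
library(dplyr)

# Hypothetical unsorted date vector
d <- as.Date(c("2001-05-03", "1995-01-20", "2008-11-30"))

# arrange() on a bare vector fails; capture the error instead of stopping
vec_result <- tryCatch(arrange(d), error = function(err) "error")

# Wrapped in a one-column data frame, arrange() can sort by that column
df_sorted <- arrange(tibble(d = d), d)

# sort() handles the vector directly and gives the same ordering
sorted <- sort(d)
```

This is why map(e, sort) or map(e, ~ .[order(.)]) needs no data frame wrapper, while arrange does.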
Tip: creating a new function that contains a browser() statement and calling it from map helped me a lot, and may help others too.
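A rough sketch of that debugging approach. The helper name inspect_element and the sample dates are made up for illustration; the interactive() guard is my addition so the script still runs non-interactively:

```r
library(purrr)

# Hypothetical helper: pauses in the debugger once per list element
inspect_element <- function(x) {
  if (interactive()) browser()  # inspect x here, e.g. class(x) or str(x)
  sort(x)
}

# Made-up list of unsorted date vectors
dates <- list(
  as.Date(c("2001-05-03", "1995-01-20")),
  as.Date(c("1999-07-07", "1992-02-14"))
)

out <- map(dates, inspect_element)
```

Inside the browser prompt, class(x) shows that each element is a plain Date vector rather than a data frame, which is the mismatch that arrange runs into.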