How do I apply a function to a list column and return another list column with dplyr and purrr?

Asked: 2016-10-19 18:05:02

Tags: r dplyr purrr

I want to take the first 5 values of each vector in a list column and return them as a new list column in the data frame.

structure(list(sample_num = 1:6, vector = list(c(0, 1, 1, 0, 
1, 2, 0, 0, 3, 0), c(0, 0, 1, 2, 0, 0, 4, 10, 12, 1), c(1, 33, 
4, 4, 2, 2, 6, 9, 14, 2), c(0, 0, 1, 0, 1, 0, 1, 5, 3, 0), c(0, 
1, 1, 0, 0, 0, 1, 4, 3, 0), c(0, 0, 1, 0, 0, 0, 1, 1, 1, 0))), class = c("tbl_df", 
"tbl", "data.frame"), row.names = c(NA, -6L), .Names = c("sample_num", 
"vector"))

> test
# A tibble: 6 × 2
  sample_num     vector
       <int>     <list>
1          1 <dbl [10]>
2          2 <dbl [10]>
3          3 <dbl [10]>
4          4 <dbl [10]>
5          5 <dbl [10]>
6          6 <dbl [10]>

I tried using lmap, but I got an error message:

> test  %>% lmap(.$vector,.f = function(x) x[1:5])
Error in .f(.x[i], ...) : 
  unused argument (list(c(0, 1, 1, 0, 1, 2, 0, 0, 3, 0), c(0, 0, 1, 2, 0, 0, 4, 10, 12, 1), c(1, 33, 4, 4, 2, 2, 6, 9, 14, 2), c(0, 0, 1, 0, 1, 0, 1, 5, 3, 0), c(0, 1, 1, 0, 0, 0, 1, 4, 3, 0), c(0, 0, 1, 0, 0, 0, 1, 1, 1, 0)))

Thanks!

2 Answers:

Answer 0: (score: 3)

Is this what you are trying to do?

test <- structure(list(sample_num = 1:6, vector = list(c(0, 1, 1, 0, 
1, 2, 0, 0, 3, 0), c(0, 0, 1, 2, 0, 0, 4, 10, 12, 1), c(1, 33, 
4, 4, 2, 2, 6, 9, 14, 2), c(0, 0, 1, 0, 1, 0, 1, 5, 3, 0), c(0, 
1, 1, 0, 0, 0, 1, 4, 3, 0), c(0, 0, 1, 0, 0, 0, 1, 1, 1, 0))), class = c("tbl_df", 
"tbl", "data.frame"), row.names = c(NA, -6L), .Names = c("sample_num", 
"vector"))

test$new = lapply(test$vector, function(x) {x[1:5]})
test

# A tibble: 6 × 3
  sample_num     vector       new
       <int>     <list>    <list>
1          1 <dbl [10]> <dbl [5]>
2          2 <dbl [10]> <dbl [5]>
3          3 <dbl [10]> <dbl [5]>
4          4 <dbl [10]> <dbl [5]>
5          5 <dbl [10]> <dbl [5]>
6          6 <dbl [10]> <dbl [5]>


test$vector[3]
[[1]]
 [1]  1 33  4  4  2  2  6  9 14  2

test$new[3]
[[1]]
[1]  1 33  4  4  2

If you want to use dplyr syntax, first define a function that wraps its result in a list:

f = function(x) {
    return(list(x[1:5]))
    }

Then apply it row by row to the column vector:

test = test %>%
    rowwise() %>%
    mutate(new_dplyr = f(vector))

test
# A tibble: 6 × 3
  sample_num     vector new_dplyr
       <int>     <list>    <list>
1          1 <dbl [10]> <dbl [5]>
2          2 <dbl [10]> <dbl [5]>
3          3 <dbl [10]> <dbl [5]>
4          4 <dbl [10]> <dbl [5]>
5          5 <dbl [10]> <dbl [5]>
6          6 <dbl [10]> <dbl [5]>

test$vector[3]
[[1]]
 [1]  1 33  4  4  2  2  6  9 14  2

test$new_dplyr[3]
[[1]]
[1]  1 33  4  4  2
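
A possible variant that is not part of the original answer: the same column can be built inside mutate() without rowwise() by letting purrr::map() iterate over the list column. As far as I can tell, the lmap() call in the question fails because the pipe still passes the whole tibble as the first argument, so `.$vector` ends up as an extra argument forwarded to `.f`. The sketch below rebuilds the question's data with tibble() for brevity; the names `result` and `new_purrr` are made up for illustration.

```r
library(dplyr)
library(purrr)

# Same values as the dput() output in the question
test <- tibble(
  sample_num = 1:6,
  vector = list(
    c(0, 1, 1, 0, 1, 2, 0, 0, 3, 0),
    c(0, 0, 1, 2, 0, 0, 4, 10, 12, 1),
    c(1, 33, 4, 4, 2, 2, 6, 9, 14, 2),
    c(0, 0, 1, 0, 1, 0, 1, 5, 3, 0),
    c(0, 1, 1, 0, 0, 0, 1, 4, 3, 0),
    c(0, 0, 1, 0, 0, 0, 1, 1, 1, 0)
  )
)

# map() returns a list with one element per row, so its result can be
# stored directly as a list column; no rowwise() or list() wrapper needed
result <- test %>%
  mutate(new_purrr = map(vector, ~ .x[1:5]))

result$new_purrr[[3]]
# [1]  1 33  4  4  2
```

Because map() always returns a list, the helper function does not need to wrap its value in list() the way f does above.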

Answer 1: (score: 2)

This uses a compact call. I first tried passing '[' in quotes, as I would with lapply or sapply, but it needed backticks to work:

> test$new <- map(test$vector,.f = `[`,  1:5)
> test
# A tibble: 6 × 3
  sample_num     vector       new
       <int>     <list>    <list>
1          1 <dbl [10]> <dbl [5]>
2          2 <dbl [10]> <dbl [5]>
3          3 <dbl [10]> <dbl [5]>
4          4 <dbl [10]> <dbl [5]>
5          5 <dbl [10]> <dbl [5]>
6          6 <dbl [10]> <dbl [5]>
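
One caveat worth sketching, assuming some vectors could be shorter than 5 (which is not the case in the question's data): indexing with `1:5` pads short vectors with NA, whereas head() returns only what is there. The list `short` and the names `padded` and `trimmed` are hypothetical, chosen just for this illustration.

```r
library(purrr)

# Hypothetical input: the first vector has fewer than 5 elements
short <- list(c(1, 2, 3), c(4, 5, 6, 7, 8, 9))

# `[` with 1:5 pads the short vector with NA up to length 5
padded <- map(short, `[`, 1:5)

# head() keeps at most 5 elements and never pads
trimmed <- map(short, head, 5)

padded[[1]]
trimmed[[1]]
```

Which behaviour is preferable depends on whether later steps expect every element of the new column to have exactly 5 values.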