How do I extract a single element from a data frame using magrittr?

Time: 2017-07-31 12:55:06

Tags: r dplyr purrr magrittr

This is probably a simple question, but I cannot figure out the answer. Consider this simple data frame:

library(dplyr)
library(purrr)
library(magrittr)
# tibble() replaces the deprecated data_frame(); dplyr re-exports it
dataframe <- tibble(id = c(1, 2, 3, 4),
                    text = c("this is a this", "this is another", "hello", "what???"))

> dataframe
# A tibble: 4 x 2
     id            text
  <dbl>           <chr>
1     1  this is a this
2     2 this is another
3     3           hello
4     4         what???

Here I want to write a pipe expression that extracts the element in row 4 of the text column: what???

I tried:

dataframe %>% pull(text)[[4]]

but it does not work. What can I do here?

4 answers:

Answer 0 (score: 3)

You can try:

dataframe %>%
  filter(row_number() == 4) %>%
  pull(text)
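
For context, a likely reason the attempt in the question fails: the right-hand side pull(text)[[4]] is itself a call to `[[`, so the pipe places dataframe as the first argument of `[[` rather than of pull(). The sketch below rebuilds the question's data with tibble() (my substitution for data_frame()) and shows that moving the subsetting into its own pipe step works. The names failed and result are only illustrative.

```r
library(dplyr)
library(magrittr)

# Same data as in the question, built with tibble() (an assumption on my part)
dataframe <- tibble(id = c(1, 2, 3, 4),
                    text = c("this is a this", "this is another", "hello", "what???"))

# The original attempt: the pipe feeds dataframe into `[[`, not into pull(),
# so this raises an error, which try() captures here.
failed <- inherits(try(dataframe %>% pull(text)[[4]], silent = TRUE), "try-error")

# Pull the column first, then subset the resulting vector in a separate step.
result <- dataframe %>% pull(text) %>% .[[4]]
print(result)
```

Wrapping the expression in braces, as in dataframe %>% {pull(., text)[[4]]}, should be another way to keep everything in one step.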

Answer 1 (score: 3)

This works:

dataframe %>% select(text) %>% unlist() %>% .[4]

Edit:

It does not matter much here, but there are faster options (from Moody's answer):

microbenchmark(
  dataframe %$% text[4],
  dataframe %>% {.$text[4]},
  dataframe %>% .[[4,"text"]],
  dataframe %>% `[[`(4,"text"),
  dataframe %>% extract2(4,"text"),
  dataframe %$% text %>% extract(4),
  dataframe %>% extract2("text") %>% extract(4),
  dataframe %>% use_series(text) %>% extract(4),
  dataframe %>% pull(text) %>% .[4], # @andrey-kolyadin in the comments
  dataframe %>% select(text) %>% unlist() %>% .[4], # @stackTon's solution
  dataframe %>% filter(row_number() == 4) %>% pull(text) # Aramis7d's solution
  )

Unit: microseconds
                                                  expr      min        lq       mean    median        uq      max neval
                                 dataframe %$% text[4]   49.014   58.0065   74.18069   66.8210   76.5185  256.353   100
                       dataframe %>% {     .$text[4] }   92.739  102.7880  119.06888  112.6615  124.1220  290.205   100
                          dataframe %>% .[[4, "text"]]   65.235   70.5240   90.02727   79.5155   92.9155  344.507   100
                             dataframe %>% 4[["text"]]   69.466   76.8710   93.45829   85.6865  101.0250  224.618   100
                     dataframe %>% extract2(4, "text")   68.761   77.4005   90.49983   82.6890   99.6150  166.789   100
                     dataframe %$% text %>% extract(4)   81.455   87.6255  108.64541   99.9675  116.3640  332.519   100
         dataframe %>% extract2("text") %>% extract(4)   98.733  106.8440  120.75439  114.6010  125.3560  256.000   100
         dataframe %>% use_series(text) %>% extract(4)  137.521  147.3940  165.11001  156.7390  172.0780  409.741   100
                     dataframe %>% pull(text) %>% .[4] 1984.177 2042.0055 2189.99915 2076.0335 2172.6505 5512.815   100
      dataframe %>% select(text) %>% unlist() %>% .[4] 3241.256 3362.9095 3644.73124 3425.4990 3567.9555 8855.978   100
dataframe %>% filter(row_number() == 4) %>% pull(text) 3542.039 3635.4820 3941.44085 3767.7140 3980.3415 8704.705   100

My preference (not in the list above):

dataframe %>% .$text %>% .[4]

Its mean time is 162 microseconds.

Answer 2 (score: 1)

For a magrittr-only solution, you want:

dataframe %>% magrittr::use_series(text) %>% magrittr::extract(4)
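
As a minimal sketch of what these aliases do: use_series() stands in for `$` and extract() for `[`, so the pipeline should give the same value as plain base subsetting. The data is rebuilt with tibble() here only so the snippet is self-contained, and via_alias and via_base are illustrative names.

```r
library(dplyr)     # only used for tibble() in this sketch
library(magrittr)

dataframe <- tibble(id = c(1, 2, 3, 4),
                    text = c("this is a this", "this is another", "hello", "what???"))

# Pipe-friendly aliases: use_series(text) is `$`, extract(4) is `[`
via_alias <- dataframe %>% magrittr::use_series(text) %>% magrittr::extract(4)

# The same extraction written with the base operators directly
via_base <- dataframe$text[4]

print(identical(via_alias, via_base))
```

Writing magrittr:: explicitly is optional once the package is attached; it mainly guards against another attached package defining a function with the same name.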

Answer 3 (score: 1)

A few short possibilities:

dataframe %$% text[4]
dataframe %>% {.$text[4]}
dataframe %>% .[[4,"text"]]
dataframe %>% `[[`(4,"text")
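
To illustrate the first line: `%$%` is magrittr's exposition operator, which makes the columns of the left-hand side visible by name on the right-hand side, much like with(). A small sketch, with the data rebuilt via tibble() for self-containment and with illustrative variable names:

```r
library(dplyr)     # only used for tibble() in this sketch
library(magrittr)

dataframe <- tibble(id = c(1, 2, 3, 4),
                    text = c("this is a this", "this is another", "hello", "what???"))

# Exposition pipe: text refers to the column of dataframe
exposed <- dataframe %$% text[4]

# Roughly the same idea written with base R's with()
with_base <- with(dataframe, text[4])

# Matrix-style indexing by row number and column name also works
by_index <- dataframe %>% .[[4, "text"]]

print(exposed)
```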

Or, if you want to use only magrittr aliases:

dataframe %>% extract2(4,"text")
dataframe %$% text %>% extract(4)
dataframe %>% extract2("text") %>% extract(4)
dataframe %>% use_series(text) %>% extract(4) # @Brian's solution

The other proposed solutions are not pure magrittr (they use dplyr):

dataframe %>% pull(text) %>% .[4] # @andrey-kolyadin in the comments
dataframe %>% select(text) %>% unlist() %>% .[4] # @stackTon's solution
dataframe %>% filter(row_number() == 4) %>% pull(text) # Aramis7d's solution