R - Run time of lines of code in dplyr

Date: 2019-04-05 20:06:05

Tags: r performance time dplyr

When I want to estimate the run time of R code, I use the function system.time().

library(dplyr)
library(tidyr)  # provides replace_na()

system.time({
    Titanic %>%
        as.data.frame() %>%
        mutate(Dataset = 1) %>%
        bind_rows(as.data.frame(Titanic)) %>%
        mutate_all(funs(replace_na(., NA))) %>% 
        filter(Dataset != 1)
})

#    user  system elapsed 
#    0.02    0.00    0.02

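Question: Is there a way to get the run time of each operation in the pipeline (mutate, then bind_rows, then filter, etc.) without running the steps one by one or writing several system.time() calls?

In this example it is not useful, but I sometimes receive a long script with a long run time, and I would like to identify which operations are the slowest.

I did some research but found nothing useful.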

3 answers:

Answer 0: (score: 4)

You can use the package profvis:

library(tidyverse)    
library(profvis)

profvis({
  Titanic %>%
    as.data.frame() %>%
    mutate(Dataset = 1) %>%
    bind_rows(as.data.frame(Titanic)) %>%
    mutate_all(funs(replace_na(., NA))) %>% 
    filter(Dataset != 1)
})

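profvis is built on R's sampling profiler, Rprof(), so the same data can be collected without the interactive viewer. The sketch below is only an illustration under stated assumptions: slow_step() is a hypothetical stand-in for an expensive pipeline stage, and the 10 ms interval is an arbitrary choice. A sampling profiler only sees code that is running when a sample is taken, so a pipeline that finishes in about 0.02 s, like the Titanic example, may produce few or no samples.

```r
# Hypothetical slow stage, used only so the profiler has something to sample
slow_step <- function(n_iter = 300) {
  for (i in seq_len(n_iter)) {
    invisible(sort(runif(1e5)))
  }
  n_iter
}

prof_file <- tempfile(fileext = ".out")

Rprof(prof_file, interval = 0.01)  # start sampling every 10 ms
slow_step()
Rprof(NULL)                        # stop sampling

# Summarise the samples: by.total includes time spent in callees
prof <- summaryRprof(prof_file)
head(prof$by.total)

unlink(prof_file)
```

The timings reported depend on the machine, so no expected output is shown here.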

Answer 1: (score: 2)

Here is an option that works for me (I edited the NA replacement, since funs is deprecated)... admittedly, it is long-winded:

library(dplyr)
library(magrittr)
library(tictoc)

Titanic %T>%
  {tic("as.data.frame")} %>%
  as.data.frame() %T>%
  {toc(); tic("mutate")} %>%
  mutate(Dataset = 1) %T>%
  {toc(); tic("bind.rows")} %>%
  bind_rows(as.data.frame(Titanic)) %T>%
  {toc(); tic("replace.na")} %>%
  replace(is.na(.), 0) %T>% 
  {toc(); tic("filter")} %>%
  filter(Dataset != 1) %T>%
  {toc(); tic("head")} %>%
  head() %T>%
  {toc()}

as.data.frame: 0 sec elapsed
mutate: 0 sec elapsed
bind.rows: 0 sec elapsed
replace.na: 0 sec elapsed
filter: 0 sec elapsed
head: 0 sec elapsed
  Class    Sex   Age Survived Freq Dataset
1   1st   Male Child       No    0       0
2   2nd   Male Child       No    0       0
3   3rd   Male Child       No   35       0
4  Crew   Male Child       No    0       0
5   1st Female Child       No    0       0
6   2nd Female Child       No    0       0
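If interleaving tic() and toc() calls feels too repetitive, the same idea can be sketched in base R by describing the pipeline as a named list of functions and timing each one with system.time(). This is a minimal sketch, not part of the answer above: time_steps() and the step names are hypothetical, and the steps use base R functions so the block has no package dependencies. dplyr verbs could be wrapped in the same way.

```r
# Hypothetical helper: apply each function in `steps` to the running result
# and record the elapsed time of every step.
time_steps <- function(x, steps) {
  elapsed <- numeric(length(steps))
  names(elapsed) <- names(steps)
  for (i in seq_along(steps)) {
    # The assignment inside system.time() updates x in this function's frame
    elapsed[i] <- system.time(x <- steps[[i]](x))[["elapsed"]]
  }
  list(result = x, elapsed = elapsed)
}

# Illustrative steps on the Titanic table (base R only)
steps <- list(
  as.data.frame = function(d) as.data.frame(d),
  add_dataset   = function(d) transform(d, Dataset = 1),
  keep_crew     = function(d) d[d$Class == "Crew", ]
)

out <- time_steps(Titanic, steps)
out$elapsed       # one elapsed time (in seconds) per named step
nrow(out$result)  # rows remaining after the last step
```

Because each step receives an already computed value, the time recorded for a step does not include the work of the steps before it.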

Answer 2: (score: 1)

You might be interested in the %L>% pipe from my package pipes:

# devtools::install_github("moodymudskipper/pipes")
library(dplyr)
library(tidyr)  # provides replace_na()
library(pipes)
Titanic %L>%
  as.data.frame() %L>%
  mutate(Dataset = 1) %L>%
  bind_rows(as.data.frame(Titanic)) %L>%
  mutate_all(list(~replace_na(., NA))) %L>% 
  filter(Dataset != 1)

# as.data.frame(.)   ~  0.03 sec
# mutate(., Dataset = 1)   ~  0 sec
# bind_rows(., as.data.frame(Titanic))   ~  0 sec
# mutate_all(., list(~replace_na(., NA)))   ~  0 sec
# filter(., Dataset != 1)   ~  0.03 sec
# [1] Class    Sex      Age      Survived Freq     Dataset 
# <0 rows> (or 0-length row.names)