When I want to estimate the running time of R code, I use the function system.time().
library(dplyr)
library(tidyr)  # provides replace_na()

system.time({
  Titanic %>%
    as.data.frame() %>%
    mutate(Dataset = 1) %>%
    bind_rows(as.data.frame(Titanic)) %>%
    mutate_all(funs(replace_na(., NA))) %>%
    filter(Dataset != 1)
})
#    user  system elapsed
#    0.02    0.00    0.02
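Question:
Is there a way to get the running time of each operation, that is, of each step between the pipes (mutate, then bind_rows, then filter, etc.), without running the steps one by one or writing several system.time() calls?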
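In this example it is not useful, but sometimes I receive a long script that takes a long time to run, and I would like to identify which operations are the slowest.
I did some research but found nothing useful.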
Answer 0 (score: 4)
You can use the package profvis:
library(tidyverse)
library(profvis)

profvis({
  Titanic %>%
    as.data.frame() %>%
    mutate(Dataset = 1) %>%
    bind_rows(as.data.frame(Titanic)) %>%
    mutate_all(funs(replace_na(., NA))) %>%
    filter(Dataset != 1)
})
Answer 1 (score: 2)
Here is an option that worked for me (I changed the NA replacement step because funs() is deprecated). Admittedly, it is long:
library(dplyr)
library(magrittr)
library(tictoc)

Titanic %T>%
  {tic("as.data.frame")} %>%
  as.data.frame() %T>%
  {toc(); tic("mutate")} %>%
  mutate(Dataset = 1) %T>%
  {toc(); tic("bind.rows")} %>%
  bind_rows(as.data.frame(Titanic)) %T>%
  {toc(); tic("replace.na")} %>%
  replace(is.na(.), 0) %T>%
  {toc(); tic("filter")} %>%
  filter(Dataset != 1) %T>%
  {toc(); tic("head")} %>%
  head() %T>%
  {toc()}
as.data.frame: 0 sec elapsed
mutate: 0 sec elapsed
bind.rows: 0 sec elapsed
replace.na: 0 sec elapsed
filter: 0 sec elapsed
head: 0 sec elapsed
  Class    Sex   Age Survived Freq Dataset
1   1st   Male Child       No    0       0
2   2nd   Male Child       No    0       0
3   3rd   Male Child       No   35       0
4  Crew   Male Child       No    0       0
5   1st Female Child       No    0       0
6   2nd Female Child       No    0       0
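The same idea can be written more compactly with a small wrapper instead of paired tic()/toc() calls. The following is only a minimal sketch: timed() is a hypothetical helper (it is not part of dplyr, magrittr, or tictoc), and it assumes each step can be expressed as a function whose first argument is the data. It forces its input before starting the clock, because recent versions of magrittr evaluate the left-hand side of the pipe lazily, and the upstream steps would otherwise be counted in the current step's time.

```r
library(dplyr)

# Hypothetical helper: runs one pipeline step, reports its elapsed time,
# and returns the step's result so the pipeline can continue.
timed <- function(.data, label, .f, ...) {
  force(.data)  # evaluate the upstream steps before the timer starts
  elapsed <- system.time(result <- .f(.data, ...))[["elapsed"]]
  message(sprintf("%s: %.3f sec elapsed", label, elapsed))
  result
}

result <- as.data.frame(Titanic) %>%
  timed("mutate", mutate, Dataset = 1) %>%
  timed("bind_rows", bind_rows, as.data.frame(Titanic)) %>%
  timed("filter", filter, Dataset != 1)
```

Each call prints one line per step through message(), so the timings go to the console without altering the data that flows through the pipe. On a data set this small the reported times will be close to zero; the approach is only informative for steps that take a measurable amount of time.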
Answer 2 (score: 1)
You might be interested in the %L>% pipe from my package pipes:
# devtools::install_github("moodymudskipper/pipes")
library(dplyr)
library(tidyr)  # provides replace_na()
library(pipes)

Titanic %L>%
  as.data.frame() %L>%
  mutate(Dataset = 1) %L>%
  bind_rows(as.data.frame(Titanic)) %L>%
  mutate_all(list(~replace_na(., NA))) %L>%
  filter(Dataset != 1)
# as.data.frame(.)   ~  0.03 sec
# mutate(., Dataset = 1)   ~  0 sec
# bind_rows(., as.data.frame(Titanic))   ~  0 sec
# mutate_all(., list(~replace_na(., NA)))   ~  0 sec
# filter(., Dataset != 1)   ~  0.03 sec
# [1] Class    Sex      Age      Survived Freq     Dataset
# <0 rows> (or 0-length row.names)