In drake

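Time: 2019-12-18 21:33:54

Tags: r drake-r-package

I really like using the code_to_plan function when building drake plans. I also use target(..., format = "fst") for large files. However, I am struggling to combine these two workflows. For example, suppose I have this _drake.R file: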

# Data --------------------------------------------------------------------

data_plan = code_to_plan("code/01-data/data.R")
join_plan = code_to_plan("code/01-data/merging.R")


# Cleaning ----------------------------------------------------------------

cleaning_plan = code_to_plan("code/02-cleaning/remove_na.R")


# Model -------------------------------------------------------------------

model_plan = code_to_plan("code/03-model/model.R")


# Combine Plans
dplan = bind_plans(
  data_plan,
  join_plan,
  cleaning_plan,
  model_plan
  )

config <- drake_config(dplan)

This works well when I run it with r_make(r_args = list(show = TRUE)).

As far as I understand, target() can only be used inside drake_plan. If I try something like this:

dplan2 <- drake_plan(full_plan = target(dplan, format = "fst"))
config <- drake_config(dplan2)

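I get this error from r_make:

target full_plan
Error in fst::write_fst(x = value$value, path = tmp) :
  Unknown type found in column.
In addition: Warning message:
You selected fst format for target full_plan, so drake will convert it from class c("drake_plan", "tbl_df", "tbl", "data.frame") to a plain data frame.

Error:
 -> in process 18712

See .Last.error.trace for a stack trace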
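One plausible reading of this error, offered as an assumption rather than a confirmed diagnosis: the target being saved here is the plan object itself, and its command column is a list of unevaluated R expressions, which is not a column type fst can serialize. The sketch below only inspects that column; small_data() is a placeholder that is never actually called.

```r
library(drake)

# A one-row plan; small_data() is a placeholder and is not evaluated here.
plan <- drake_plan(x = small_data())

# The command column is a list column holding language objects,
# not an atomic vector such as character or numeric.
is.list(plan$command)
is.language(plan$command[[1]])
```

If that reading is right, the format belongs on the individual targets inside the plan, not on a target that wraps the whole plan.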

So ultimately my question is: where do I specify special data formats for targets when using code_to_plan?

Edit

Using @landau's helpful suggestion, I defined this function:

add_target_format <- function(plan) {

  # Get a list of named commands.
  commands <- plan$command
  names(commands) <- plan$target

  # Turn it into a good plan.
  do.call(drake_plan, commands)

}

With it, this works as expected:

dplan = bind_plans(
  data_plan,
  join_plan,
  cleaning_plan,
  model_plan
  ) %>%
  add_target_format()

1 Answer:

Answer 0 (score: 1)

It is possible, but not convenient. Here is a workaround.

writeLines(
  c(
    "x <- small_data()",
    "y <- target(large_data(), format = \"fst\")"
  ),
  "script.R"
)

cat(readLines("script.R"), sep = "\n")
#> x <- small_data()
#> y <- target(large_data(), format = "fst")

library(drake)

# Produces a plan, but does not process target().
bad_plan <- code_to_plan("script.R")
bad_plan
#> # A tibble: 2 x 2
#>   target command                             
#>   <chr>  <expr>                              
#> 1 x      small_data()                        
#> 2 y      target(large_data(), format = "fst")

# Get a list of named commands.
commands <- bad_plan$command
names(commands) <- bad_plan$target

# Turn it into a good plan.
good_plan <- do.call(drake_plan, commands)
good_plan
#> # A tibble: 2 x 3
#>   target command      format
#>   <chr>  <expr>       <chr> 
#> 1 x      small_data() <NA>  
#> 2 y      large_data() fst

Created on 2019-12-18 by the reprex package (v0.3.0)
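To see why the do.call() step helps, here is a minimal base-R sketch that does not use drake at all. The helper capture_args() is hypothetical and merely stands in for any function that captures its arguments unevaluated, as drake_plan does. The point is that do.call() splices each stored expression into the call as a named argument, so the receiving function sees target(...) as if it had been typed by hand.

```r
# Hypothetical stand-in for a function that captures its arguments
# without evaluating them.
capture_args <- function(...) {
  as.list(match.call(expand.dots = FALSE)$`...`)
}

# Named, unevaluated commands, shaped like the list built from bad_plan above.
commands <- list(
  x = quote(small_data()),
  y = quote(target(large_data(), format = "fst"))
)

# do.call() builds capture_args(x = small_data(), y = target(...)).
captured <- do.call(capture_args, commands)

names(captured)
identical(captured$y, commands$y)
```

Nothing here calls small_data() or large_data(); the expressions are only passed along, which is what lets drake_plan pick up the target() wrapper and its format argument.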