是否有一种简单的方法来获取给定包的R包依赖项列表(所有递归依赖项),而无需安装包及其依赖项?类似于portupgrade或apt中的虚假安装。
答案 0 :(得分:31)
您可以使用available.packages
功能的结果。例如,要查看ggplot2
依赖的内容:
pack <- available.packages()
pack["ggplot2","Depends"]
给出了:
[1] "R (>= 2.14), stats, methods"
请注意,根据您希望实现的目标,您可能还需要检查Imports
字段。
答案 1 :(得分:7)
我没有安装R,我需要找出哪些R软件包依赖于我公司要求使用的R软件包列表。
我编写了一个bash脚本,它迭代文件中的R Packages列表,并递归发现依赖项。
该脚本使用名为 rinput_orig.txt 的文件作为输入(以下示例)。该脚本将在其工作时创建名为 rinput.txt 的文件。
该脚本将创建以下文件:
Bash脚本:
#!/bin/bash
# CLEANUP
rm routput.txt
rm rdepsfound.txt
rm r404.txt
# COPY ORIGINAL INPUT TO WORKING INPUT
cp rinput_orig.txt rinput.txt
IFS=","
while read PACKAGE; do
echo Processing $PACKAGE...
PACKAGEURL="http://cran.r-project.org/web/packages/${PACKAGE}/index.html"
if [ `curl -o /dev/null --silent --head --write-out '%{http_code}\n' ${PACKAGEURL}` != 404 ]; then
# GET LICENSE INFO OF PACKAGE
LICENSEINFO=$(curl ${PACKAGEURL} 2>/dev/null | grep -A1 "License:" | grep -v "License:" | gawk 'match($0, /<a href=".*">(.*)<\/a>/, a) {print a[0]}' | sed "s/|/,/g" | sed "s/+/,/g")
for x in ${LICENSEINFO[*]}
do
# SAVE LICENSE
LICENSE=$(echo ${x} | gawk 'match($0, /<a href=".*">(.*)<\/a>/, a) {print a[1]}')
break
done
# WRITE PACKAGE AND LICENSE TO OUTPUT FILE
echo $PACKAGE $LICENSE $PACKAGEURL >> routput.txt
# GET DEPENDENCIES OF PACKAGE
DEPS=$(curl ${PACKAGEURL} 2>/dev/null | grep -A1 "Depends:" | grep -v "Depends:" | gawk 'match($0, /<a href=".*">(.*)<\/a>/, a) {print a[0]}')
for x in ${DEPS[*]}
do
FOUNDDEP=$(echo "${x}" | gawk 'match($0, /<a href=".*">(.*)<\/a>/, a) {print a[1]}' | sed "s/<\/span>//g")
if [ "$FOUNDDEP" != "" ]; then
echo Found dependency $FOUNDDEP for $PACKAGE...
grep $FOUNDDEP rinput.txt > /dev/null
if [ "$?" = "0" ]; then
echo $FOUNDDEP already exists in package list...
else
echo Adding $FOUNDDEP to package list...
# SAVE FOUND DEPENDENCY BACK TO INPUT LIST
echo $FOUNDDEP >> rinput.txt
# SAVE FOUND DEPENDENCY TO DEPENDENCY LIST FOR EASY VIEWING OF ALL FOUND DEPENDENCIES
echo $FOUNDDEP is a dependency of $PACKAGE >> rdepsfound.txt
fi
fi
done
else
echo Skipping $PACKAGE because 404 was received...
echo $PACKAGE $PACKAGEURL >> r404.txt
fi
done < rinput.txt
echo -e "\nRESULT:"
sort -u routput.txt
示例rinput_orig.txt:
shiny
rmarkdown
xtable
RODBC
RJDBC
XLConnect
openxlsx
xlsx
Rcpp
运行脚本时的控制台输出示例:
Processing shiny...
Processing rmarkdown...
Processing xtable...
Processing RODBC...
Processing RJDBC...
Found dependency DBI for RJDBC...
Adding DBI to package list...
Found dependency rJava for RJDBC...
Adding rJava to package list...
Processing XLConnect...
Found dependency XLConnectJars for XLConnect...
Adding XLConnectJars to package list...
Processing openxlsx...
Processing xlsx...
Found dependency rJava for xlsx...
rJava already exists in package list...
Found dependency xlsxjars for xlsx...
Adding xlsxjars to package list...
Processing Rcpp...
Processing DBI...
Processing rJava...
Processing XLConnectJars...
Processing xlsxjars...
Found dependency rJava for xlsxjars...
rJava already exists in package list...
示例rdepsfound.txt:
DBI is a dependency of RJDBC
rJava is a dependency of RJDBC
XLConnectJars is a dependency of XLConnect
xlsxjars is a dependency of xlsx
示例routput.txt:
shiny GPL-3 http://cran.r-project.org/web/packages/shiny/index.html
rmarkdown GPL-3 http://cran.r-project.org/web/packages/rmarkdown/index.html
xtable GPL-2 http://cran.r-project.org/web/packages/xtable/index.html
RODBC GPL-2 http://cran.r-project.org/web/packages/RODBC/index.html
RJDBC GPL-2 http://cran.r-project.org/web/packages/RJDBC/index.html
XLConnect GPL-3 http://cran.r-project.org/web/packages/XLConnect/index.html
openxlsx GPL-3 http://cran.r-project.org/web/packages/openxlsx/index.html
xlsx GPL-3 http://cran.r-project.org/web/packages/xlsx/index.html
Rcpp GPL-2 http://cran.r-project.org/web/packages/Rcpp/index.html
DBI LGPL-2 http://cran.r-project.org/web/packages/DBI/index.html
rJava GPL-2 http://cran.r-project.org/web/packages/rJava/index.html
XLConnectJars GPL-3 http://cran.r-project.org/web/packages/XLConnectJars/index.html
xlsxjars GPL-3 http://cran.r-project.org/web/packages/xlsxjars/index.html
我希望这有助于某人!
答案 2 :(得分:2)
我针对 packrat
和 tools
测试了自己的解决方案。
您可以找出方法之间的明显差异。
tools::package_dependencies
看起来为旧 R 版本(直到 4.1.0 和 recursive = TRUE
)提供了过多的内容,并且不是有效的解决方案。
R 4.1.0 NEWS
"Function tools::package_dependencies() (in package tools) can now use different dependency types for direct and recursive dependencies."
packrat:::packrat:::recursivePackageDependencies
使用 available.packages
,因此它基于最新的远程包,而不是本地包。
默认情况下,我的功能是跳过基础包,如果您也想附加它们,请更改 base
参数。
在 R 4.1.0 下测试:
get_deps <- function(package, fields = c("Depends", "Imports", "LinkingTo"), base = FALSE) {
stopifnot((length(package) == 1) && is.character(package))
stopifnot(all(fields %in% c("Depends", "Imports", "Suggests", "LinkingTo")))
stopifnot(is.logical(base))
stopifnot(package %in% rownames(utils::installed.packages(lib.loc = lib.loc)))
paks_global <- NULL
deps <- function(pak, fileds) {
pks <- packageDescription(pak)
res <- NULL
for (f in fileds) {
ff <- pks[[f]]
if (!is.null(ff)) {
res <- c(
res,
vapply(
strsplit(trimws(strsplit(ff, ",")[[1]]), "[ \n\\(]"),
function(x) x[1],
character(1)
)
)
}
}
if (is.null(res)) {
return(NULL)
}
for (r in res) {
if (r != "R" && !r %in% paks_global) {
paks_global <<- c(r, paks_global)
deps(r, fields)
}
}
}
deps(package, fields)
setdiff(unique(paks_global), c(
package,
"R",
if (!base) {
c(
"stats",
"graphics",
"grDevices",
"utils",
"datasets",
"methods",
"base",
"tools"
)
} else {
NULL
}
))
}
own = get_deps("shiny", fields = c("Depends", "Imports"))
packrat = packrat:::recursivePackageDependencies("shiny", lib.loc = .libPaths(), fields = c("Depends", "Imports"))
tools = tools::package_dependencies("shiny", which = c("Depends", "Imports"), recursive = TRUE)[[1]]
setdiff(own, packrat)
#> character(0)
setdiff(packrat, own)
#> character(0)
setdiff(own, tools)
#> character(0)
setdiff(tools, own)
#> [1] "methods" "utils" "grDevices" "tools" "stats" "graphics"
setdiff(packrat, tools)
#> character(0)
setdiff(tools, packrat)
#> [1] "methods" "utils" "grDevices" "tools" "stats" "graphics"
own
#> [1] "lifecycle" "ellipsis" "cachem" "jquerylib" "rappdirs"
#> [6] "fs" "sass" "bslib" "glue" "commonmark"
#> [11] "withr" "fastmap" "crayon" "sourcetools" "base64enc"
#> [16] "htmltools" "digest" "xtable" "jsonlite" "mime"
#> [21] "magrittr" "rlang" "later" "promises" "R6"
#> [26] "Rcpp" "httpuv"
packrat
#> [1] "R6" "Rcpp" "base64enc" "bslib" "cachem"
#> [6] "commonmark" "crayon" "digest" "ellipsis" "fastmap"
#> [11] "fs" "glue" "htmltools" "httpuv" "jquerylib"
#> [16] "jsonlite" "later" "lifecycle" "magrittr" "mime"
#> [21] "promises" "rappdirs" "rlang" "sass" "sourcetools"
#> [26] "withr" "xtable"
tools
#> [1] "methods" "utils" "grDevices" "httpuv" "mime"
#> [6] "jsonlite" "xtable" "digest" "htmltools" "R6"
#> [11] "sourcetools" "later" "promises" "tools" "crayon"
#> [16] "rlang" "fastmap" "withr" "commonmark" "glue"
#> [21] "bslib" "cachem" "ellipsis" "lifecycle" "sass"
#> [26] "jquerylib" "magrittr" "base64enc" "Rcpp" "stats"
#> [31] "graphics" "fs" "rappdirs"
microbenchmark::microbenchmark(get_deps("shiny", fields = c("Depends", "Imports")),
packrat:::recursivePackageDependencies("shiny", lib.loc = .libPaths(), fields = c("Depends", "Imports")),
tools = tools::package_dependencies("shiny", which = c("Depends", "Imports"), recursive = TRUE)[[1]],
times = 5
)
#> Warning in microbenchmark::microbenchmark(get_deps("shiny", fields =
#> c("Depends", : less accurate nanosecond times to avoid potential integer
#> overflows
#> Unit: milliseconds
#> expr
#> get_deps("shiny", fields = c("Depends", "Imports"))
#> packrat:::recursivePackageDependencies("shiny", lib.loc = .libPaths(), fields = c("Depends", "Imports"))
#> tools
#> min lq mean median uq max neval
#> 5.316552 5.607365 6.054568 5.674359 6.633308 7.041258 5
#> 18.767340 19.387588 21.739127 21.581457 23.526169 25.433079 5
#> 411.589734 449.179354 458.526354 465.431262 468.440211 497.991207 5
由 reprex package (v0.3.0) 于 2021 年 6 月 25 日创建
证明在旧 R 版本下 tools
解决方案是错误的。在 R 3.6.3 下测试。
paks <- tools::package_dependencies("shiny", which = c("Depends", "Imports"), recursive = TRUE)[[1]]
"lifecycle" %in% paks
#> [1] TRUE
any(c(paks, "shiny") %in% tools::dependsOnPkgs("lifecycle"))
#> [1] FALSE
由 reprex package (v0.3.0) 于 2021 年 6 月 25 日创建
答案 3 :(得分:1)
另一个简洁的解决方案是来自库recursivePackageDependencies
的内部函数packrat
。但是,程序包必须安装在计算机上的某些库中。优点是它也适用于自制的非CRAN包。例如:
packrat:::recursivePackageDependencies("ggplot2",lib.loc = .libPaths()[1])
,并提供:
[1] "R6" "RColorBrewer" "Rcpp" "colorspace" "dichromat" "digest" "gtable"
[8] "labeling" "lazyeval" "magrittr" "munsell" "plyr" "reshape2" "rlang"
[15] "scales" "stringi" "stringr" "tibble" "viridisLite"
答案 4 :(得分:1)
令我惊讶的是,没有人提到tools::package_dependencies()
这是最简单的解决方案,并且有一个recursive
自变量(公认的解决方案不提供)。
简单示例查看CRAN上前200个程序包的递归依赖性:
library(tidyverse)
avail_pks <- available.packages()
deps <- tools::package_dependencies(packages = avail_pks[1:200, "Package"], recursive = TRUE)
tibble(Package=names(deps),
data=map(deps, as_tibble)) %>%
unnest(data)
#> # A tibble: 7,125 x 2
#> Package value
#> <chr> <chr>
#> 1 A3 xtable
#> 2 A3 pbapply
#> 3 A3 parallel
#> 4 A3 stats
#> 5 A3 utils
#> 6 aaSEA DT
#> 7 aaSEA networkD3
#> 8 aaSEA shiny
#> 9 aaSEA shinydashboard
#> 10 aaSEA magrittr
#> # … with 7,115 more rows
由https://www.php.net/manual/en/language.oop5.variance.php(v0.3.0)于2020-12-04创建
答案 5 :(得分:0)
试试这个:tools::package_dependencies(recursive = TRUE)$package_name
举个例子——这里是 dplyr 的依赖:
tools::package_dependencies(recursive = TRUE)$dplyr
[1] "ellipsis" "generics" "glue" "lifecycle" "magrittr" "methods"
[7] "R6" "rlang" "tibble" "tidyselect" "utils" "vctrs"
[13] "cli" "crayon" "fansi" "pillar" "pkgconfig" "purrr"
[19] "digest" "assertthat" "grDevices" "utf8" "tools"