Question

I have a DOI for an article and am wondering whether there is an R function that can download the PDF from that DOI, so that the user does not have to download it manually.

Answer 1

You can use httr to see where a DOI points by building the doi.org URL and fetching the headers:

library(httr)
headers = HEAD("http://doi.org/10.7150/ijms.11309")
headers$url
# [1] "http://www.medsci.org/v12p0264.htm"

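In this case the PDF appears to sit at the same location as that page, but with a .pdf extension. That will not be true for every journal, though.

So for this journal, the PDF is at: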

sub("\\.htm$", ".pdf", headers$url)  # escape the dot so it matches a literal "."
# [1] "http://www.medsci.org/v12p0264.pdf"

So I can do:

# mode = "wb" keeps the binary PDF intact on Windows
download.file(sub("\\.htm$", ".pdf", headers$url), "paper.pdf", mode = "wb")

to get the PDF.
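The steps above could be wrapped into one function, roughly as sketched below. The helper names (doi_to_url, guess_pdf_url, is_pdf_file, download_pdf_from_doi) are made up for illustration and are not part of httr. The sketch still assumes the journal serves the PDF next to the landing page with a .pdf extension, so it only adds a check that the downloaded file really starts with the "%PDF" signature.

```r
# Requires the httr package and network access for the download step.

# Build the doi.org resolver URL for a DOI (hypothetical helper).
doi_to_url <- function(doi) {
  paste0("https://doi.org/", doi)
}

# Guess the PDF URL by swapping a trailing ".htm" or ".html" for ".pdf".
# Assumption: only valid for journals that follow this naming pattern.
guess_pdf_url <- function(landing_url) {
  sub("\\.html?$", ".pdf", landing_url)
}

# TRUE if the file starts with the "%PDF" magic bytes.
is_pdf_file <- function(path) {
  magic <- readBin(path, what = "raw", n = 4)
  identical(magic, charToRaw("%PDF"))
}

# Resolve the DOI, guess the PDF URL, download it, and sanity-check the result.
download_pdf_from_doi <- function(doi, destfile = "paper.pdf") {
  landing <- httr::HEAD(doi_to_url(doi))$url
  pdf_url <- guess_pdf_url(landing)
  download.file(pdf_url, destfile, mode = "wb")
  if (!is_pdf_file(destfile)) {
    warning("Downloaded file does not look like a PDF: ", pdf_url)
  }
  invisible(destfile)
}
```

Calling download_pdf_from_doi("10.7150/ijms.11309") should then behave like the manual steps for this journal. For publishers with a different URL layout, the warning is the hint that the guess was wrong.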

Answer 2

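A partial answer:

This is actually a hard problem, and it has little to do with R... can you say how to get from a DOI to a PDF in any language or setup?

The best I could find is: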

Crosscite

You can use curl (or something similar) to query the Crossref content negotiation system. That can return a citation for your DOI. Getting from there to the PDF is hard... but at least you get a URL that you can scrape for a PDF link, if you want to go down that road.

This is the method JabRef uses to turn a DOI into a citation.

Tools like Mendeley and Zotero have written parsers that go from a web page to a PDF. But I don't think there is a ready-made, off-the-shelf way to do this.
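For what it's worth, the content negotiation query could look something like this from R with httr. This is only a sketch: cite_doi is a made-up helper name, and it assumes the DOI's registration agency supports the requested media type (application/x-bibtex is one of the commonly supported ones).

```r
# Requires the httr package and network access.

# Ask the DOI resolver for a citation instead of the landing page by
# sending an Accept header (content negotiation). Hypothetical helper.
cite_doi <- function(doi, accept = "application/x-bibtex") {
  resp <- httr::GET(
    paste0("https://doi.org/", doi),
    httr::add_headers(Accept = accept)
  )
  httr::stop_for_status(resp)  # fail loudly on 4xx/5xx responses
  httr::content(resp, as = "text", encoding = "UTF-8")
}
```

Something like cat(cite_doi("10.7150/ijms.11309")) would then print the BibTeX entry, if the request succeeds. It gives you the citation and a URL, not the PDF itself.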

download an article using DOI in R
