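Consider a file on the internet, like this one (note the s in https): https://evs.nci.nih.gov/ftp1/CDISC/SDTM/SDTM%20Terminology.xls
How can sheet 2 of the file be read into R?
The following code approximates what is needed (but fails):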
url1<-'https://evs.nci.nih.gov/ftp1/CDISC/SDTM/SDTM%20Terminology.xls'
p1f <- tempfile()
download.file(url1, p1f, mode="wb")
p1<-read_excel(path = p1f, sheet = 2)
Answer 0 (score: 9)
This works on Windows:
library(readxl)
library(httr)
packageVersion("readxl")
# [1] ‘0.1.1’
GET(url1, write_disk(tf <- tempfile(fileext = ".xls")))
df <- read_excel(tf, 2L)
str(df)
# Classes ‘tbl_df’, ‘tbl’ and 'data.frame': 20131 obs. of 8 variables:
# $ Code : chr "C115388" "C115800" "C115801" "C115802" ...
# $ Codelist Code : chr NA "C115388" "C115388" "C115388" ...
# $ Codelist Extensible (Yes/No): chr "No" NA NA NA ...
# $ Codelist Name : chr "6 Minute Walk Functional Test Test Code" "6 Minute Walk Functional Test Test Code" "6 Minute Walk Functional Test Test Code" "6 Minute Walk Functional Test Test Code" ...
# $ CDISC Submission Value : chr "SIXMW1TC" "SIXMW101" "SIXMW102" "SIXMW103" ...
# $ CDISC Synonym(s) : chr "6 Minute Walk Functional Test Test Code" "SIXMW1-Distance at 1 Minute" "SIXMW1-Distance at 2 Minutes" "SIXMW1-Distance at 3 Minutes" ...
# $ CDISC Definition : chr "6 Minute Walk Test test code." "6 Minute Walk Test - Distance at 1 minute." "6 Minute Walk Test - Distance at 2 minutes." "6 Minute Walk Test - Distance at 3 minutes." ...
# $ NCI Preferred Term : chr "CDISC Functional Test 6MWT Test Code Terminology" "6MWT - Distance at 1 Minute" "6MWT - Distance at 2 Minutes" "6MWT - Distance at 3 Minutes" ...
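Answer 1 (score: 3)
From this issue on GitHub (#278):
Some functionality for supporting more general inputs will be pulled out of readr, at which point readxl will be able to take advantage of it.
So we should be able to pass URLs directly to read_excel() in the (hopefully near) future.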
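Answer 2 (score: 0)
When I execute the first 3 lines, I get 3 files in my temp folder, and the one without a file extension is named filed3a2827f129. If I add the extension `.xls` to that file, it can be opened with OpenOffice.org Calc, and the viewer panel displays the contents of sheet 2.
So I wondered whether pasting that file path would get read_excel to open it. It would not open the file under its original name, but it does open the renamed one: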
> p1<-read_excel( path ="/private/var/folders/yq/m3j1jqtj6hq6s5mq_v0jn3s80000gn/T/RtmpxfaZRt/filed3a2827f129.xls", sheet = 2)
DEFINEDNAME: 21 00 00 01 0b 00 00 00 02 00 00 00 00 00 00 0d 3b 00 00 00 00 a3 4e 00 00 07 00
DEFINEDNAME: 21 00 00 01 0b 00 00 00 02 00 00 00 00 00 00 0d 3b 00 00 00 00 a3 4e 00 00 07 00
DEFINEDNAME: 21 00 00 01 0b 00 00 00 02 00 00 00 00 00 00 0d 3b 00 00 00 00 a3 4e 00 00 07 00
DEFINEDNAME: 21 00 00 01 0b 00 00 00 02 00 00 00 00 00 00 0d 3b 00 00 00 00 a3 4e 00 00 07 00
> str(p1)
Classes ‘tbl_df’, ‘tbl’ and 'data.frame': 20131 obs. of 8 variables:
$ Code : chr "C115388" "C115800" "C115801" "C115802" ...
$ Codelist Code : chr NA "C115388" "C115388" "C115388" ...
$ Codelist Extensible (Yes/No): chr "No" NA NA NA ...
$ Codelist Name : chr "6 Minute Walk Functional Test Test Code" "6 Minute Walk Functional Test Test Code" "6 Minute Walk Functional Test Test Code" "6 Minute Walk Functional Test Test Code" ...
$ CDISC Submission Value : chr "SIXMW1TC" "SIXMW101" "SIXMW102" "SIXMW103" ...
$ CDISC Synonym(s) : chr "6 Minute Walk Functional Test Test Code" "SIXMW1-Distance at 1 Minute" "SIXMW1-Distance at 2 Minutes" "SIXMW1-Distance at 3 Minutes" ...
$ CDISC Definition : chr "6 Minute Walk Test test code." "6 Minute Walk Test - Distance at 1 minute." "6 Minute Walk Test - Distance at 2 minutes." "6 Minute Walk Test - Distance at 3 minutes." ...
$ NCI Preferred Term : chr "CDISC Functional Test 6MWT Test Code Terminology" "6MWT - Distance at 1 Minute" "6MWT - Distance at 2 Minutes" "6MWT - Distance at 3 Minutes" ...
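The result above suggests that the question's code fails only because `tempfile()` produces a path with no extension, so giving the temp file a `.xls` suffix before downloading should be enough. Here is a minimal sketch of that idea; `temp_path_for` and `read_excel_url` are hypothetical helper names introduced for illustration, not part of readxl, and the sketch assumes the URL ends in the file's real extension (no query string).

```r
# Hypothetical helper: build a temp path that keeps the extension found in the URL.
temp_path_for <- function(url) {
  ext <- tools::file_ext(url)
  tempfile(fileext = if (nzchar(ext)) paste0(".", ext) else "")
}

# Hypothetical wrapper: download to that path, then read the requested sheet.
read_excel_url <- function(url, sheet = 1) {
  path <- temp_path_for(url)
  on.exit(unlink(path), add = TRUE)      # remove the temp file when done
  download.file(url, path, mode = "wb")  # "wb" keeps the binary file intact on Windows
  readxl::read_excel(path, sheet = sheet)
}

# Usage (needs network access and the readxl package):
# p1 <- read_excel_url("https://evs.nci.nih.gov/ftp1/CDISC/SDTM/SDTM%20Terminology.xls", sheet = 2)
```

This is the same approach as `tempfile(fileext = ".xls")` in the first answer, only with the extension taken from the URL instead of being hard-coded.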
Answer 3 (score: 0)
Use the rio R package. Here is a reprex:
library(tidyverse)
library(rio)
url <- 'https://evs.nci.nih.gov/ftp1/CDISC/SDTM/SDTM%20Terminology.xls'
rio::import(file = url,which = 2) %>%
glimpse()
#>
#> Rows: 30,995
#> Columns: 8
#> $ Code <chr> "C141663", "C141706", "C141707"...
#> $ `Codelist Code` <chr> NA, "C141663", "C141663", "C141...
#> $ `Codelist Extensible (Yes/No)` <chr> "No", NA, NA, NA, "No", NA, NA,...
#> $ `Codelist Name` <chr> "4 Stair Ascend Functional Test...
#> $ `CDISC Submission Value` <chr> "A4STR1TC", "A4STR101", "A4STR1...
#> $ `CDISC Synonym(s)` <chr> "4 Stair Ascend Functional Test...
#> $ `CDISC Definition` <chr> "4 Stair Ascend test code.", "4...
#> $ `NCI Preferred Term` <chr> "CDISC Functional Test 4 Stair ...
Answer 4 (score: 0)
A simpler solution is to use the openxlsx package. Here is an example that you can adapt to your needs:
library(openxlsx)
df = read.xlsx("https://archive.ics.uci.edu/ml/machine-learning-databases/00242/ENB2012_data.xlsx",sheet=1)
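One caveat worth checking before adapting this: as far as I know, openxlsx reads only the .xlsx format, so the .xls file from the question would still need readxl or rio. A minimal sketch of that check follows; `is_xlsx_url` is a hypothetical helper name, not part of openxlsx, and it only inspects the URL's extension.

```r
# Hypothetical helper (not part of openxlsx): TRUE when the URL ends in .xlsx.
is_xlsx_url <- function(url) {
  tolower(tools::file_ext(url)) == "xlsx"
}

# The example file in this answer is .xlsx; the file from the question is .xls.
is_xlsx_url("https://archive.ics.uci.edu/ml/machine-learning-databases/00242/ENB2012_data.xlsx")
is_xlsx_url("https://evs.nci.nih.gov/ftp1/CDISC/SDTM/SDTM%20Terminology.xls")
```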