Question

I am trying to read all of the EU Weekly Oil Bulletin data files from this source, http://ec.europa.eu/energy/observatory/reports, specifically all the xls files with "raw_data" in the file name.

library(rvest)
library(readxl)
library(tidyverse)

url <- "http://ec.europa.eu/energy/observatory/reports/"
files <- read_html(url) %>% html_nodes("a") %>% .[grepl("raw",.)] %>% html_attr("href")

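However, read_excel does not parse all columns of the Excel files correctly and returns only the first (date) column. See the following example for file 161.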

t <- tempfile(fileext = ".xls")
download.file(paste0(url, files[161]), t, mode="wb")
data <- read_excel(t)
unlink(t)

This just returns

A tibble: 126 x 1
`Prices in force on`
<dttm>              
1 2018-03-19 00:00:00 
2 2018-03-19 00:00:00
....

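I know I could download all the xls files and convert them to .xlsx or .csv with excelcnv.exe, but that is relatively slow, and a pure R solution would be nice. Any idea how to read all the information from the Excel files? Many thanks!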
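
For anyone trying to reproduce this, here is a rough diagnostic sketch rather than a confirmed fix. It reads every sheet of the workbook as plain text without treating the first row as a header, to show whether the missing columns are present at all. The helper name inspect_sheets is hypothetical (it is not part of readxl), and the idea that the data might sit on another sheet or below a non-standard header row is only an assumption.

```r
library(readxl)

# Hypothetical helper (not a readxl function): read each sheet of a workbook
# as raw text, with no header row, so nothing is dropped by type guessing.
inspect_sheets <- function(path) {
  sheets <- excel_sheets(path)  # names of all sheets in the file
  out <- lapply(sheets, function(s) {
    read_excel(path, sheet = s, col_names = FALSE, col_types = "text")
  })
  names(out) <- sheets
  out
}

# Usage, assuming `t` still points at the downloaded file
# (i.e. this runs before unlink(t)):
# raw <- inspect_sheets(t)
# sapply(raw, dim)  # rows and columns found on each sheet
```

If the dimensions still show a single column for every sheet, the problem is likely in how the file itself is parsed rather than in header or type detection.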

readxl cannot read all columns of an Excel 1997-2003 workbook

0 answers: