Question

My application reads xls and xlsx files using the read_excel function of the readxl package.

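The order and exact number of columns are not known in advance when reading an xls or xlsx file. There are 15 predefined columns, of which 10 are mandatory and the remaining 5 are optional. The file will therefore always have at least 10 columns and at most 15.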

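I need to specify col_types for the 10 mandatory columns. The only approach I can think of is to specify col_types by column name, because I know the file contains all 10 mandatory columns, but they appear in an arbitrary order.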

I tried to find a way to do this but could not.

Can anyone help me find a way to assign col_types by column name?

Answer 1

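I solved this with the workaround below, although it is not the best way to handle the problem. It reads the Excel file twice, which will hurt performance if the file holds a large amount of data.

First read: build the column data types vector. Read the file to retrieve the column information (column names, number of columns, and their types) and build the column_data_types vector, which holds one data type for each column in the file.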

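library(readxl)

# Reading the .xlsx file (first pass, used only for the column names)
site_data_columns <- read_excel(paste(File$datapath, ".xlsx", sep = ""))

site_data_column_names <- colnames(site_data_columns)

# Pre-allocate one type per column so the loop can assign by index
column_data_types <- character(length(site_data_column_names))

for (i in seq_along(site_data_column_names)) {
    if (site_data_column_names[i] == "date") {
        # "date" is a column name
        column_data_types[i] <- "date"
    } else if (site_data_column_names[i] == "result") {
        # "result" is a column name
        column_data_types[i] <- "numeric"
    } else {
        # Every other column is read as text
        column_data_types[i] <- "text"
    }
}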

Second read: read the file contents. Read the Excel file again, passing the column_data_types vector as the col_types argument.

# Reading the .xlsx file (second pass, with explicit column types)
site_data <- read_excel(paste(File$datapath, ".xlsx", sep = ""), col_types = column_data_types)
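
The two passes above can probably be made cheaper by reading only the header row on the first pass and replacing the loop with a name lookup. The following is a minimal sketch, not a tested replacement: `known_types`, `col_types_for`, and `read_excel_typed` are hypothetical names introduced here, and it assumes that calling `read_excel` with `n_max = 0` returns a zero-row result that still carries the column names.

```r
# Hypothetical lookup table: known column name -> readxl col_type.
# Only "date" and "result" come from the answer above; extend as needed.
known_types <- c(date = "date", result = "numeric")

# Map column names to col_types in file order.
# Names missing from the lookup fall back to `default`.
col_types_for <- function(column_names, lookup = known_types, default = "text") {
  types <- unname(lookup[column_names])  # NA for names not in the lookup
  types[is.na(types)] <- default
  types
}

# Hypothetical wrapper: read the header only, then read the data with types.
# Assumes n_max = 0 yields the column names without loading any data rows.
read_excel_typed <- function(path, ...) {
  header <- readxl::read_excel(path, n_max = 0)
  readxl::read_excel(path, col_types = col_types_for(colnames(header)), ...)
}

# The mapping follows whatever order the columns appear in.
col_types_for(c("site", "result", "date"))
```

With this sketch the call would look like `site_data <- read_excel_typed(paste(File$datapath, ".xlsx", sep = ""))`. The file is still opened twice, but the first pass no longer parses the data rows, so the cost on large files should be much smaller.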
