Question

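I am trying to import xls data into R with the readxl package. The xls file has 18 columns and 472 rows, and the first 7 rows contain descriptive text that needs to be skipped. For EDA I only want columns 1, 3, and 6:9 out of the 18. They have mixed types, including dates, numbers, and text.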

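readxl does not seem to be able to import non-contiguous columns directly. My plan is to read the whole sheet first with skip = 7 and then use select as the next step. The problem is that readxl guesses the date columns as numeric by default. Is there a way in readxl to specify col_types by column name?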

Reproducible code using the example xlsx, for demonstration:

library(readxl)
xlsx_example <- readxl_example("datasets.xlsx")
# read the entire table
read_excel(xlsx_example)
# select specific column to name - following code does not work
read_excel(xlsx_example, col_types=col (Sepal.Length = "numeric"))

Answer 1

As far as I know, you cannot specify col_types by column name. You can, however, read only specific columns. For example,

read_excel(xlsx_example, col_types=c("numeric", "skip", "numeric", "numeric", "skip"))

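will import columns 1, 3, and 4 and skip columns 2 and 5. You could do the same for all 18 columns, but I think it becomes a bit hard to keep track of which column is imported as which type.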
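One way to ease that bookkeeping is to build the positional col_types vector from the column names. The following is only a sketch under some assumptions: it reads the header row first with n_max = 0, and the names header, wanted, and df are hypothetical, introduced here for illustration. For a sheet with leading descriptive rows, the same skip value would presumably have to be passed to both read_excel() calls.

```r
library(readxl)

path <- readxl_example("datasets.xlsx")

# Read only the header to learn the column names (n_max = 0 reads no data rows).
header <- names(read_excel(path, n_max = 0))

# Hypothetical name-to-type mapping; any column not listed here is skipped.
wanted <- c(Sepal.Length = "numeric", Species = "text")

# Fail early if a requested name is not present in the sheet.
stopifnot(all(names(wanted) %in% header))

# Build the positional col_types vector that read_excel() expects.
col_types <- rep("skip", length(header))
col_types[match(names(wanted), header)] <- unname(wanted)

df <- read_excel(path, col_types = col_types)
```

This reads the file twice, which should be cheap for a sheet of a few hundred rows, and it keeps the type of each column next to its name rather than its position.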

Another option is to read all columns in as text with col_types = "text" and then select and convert variables by name. For example:

library(tidyverse)
library(readxl)
xlsx_example <- readxl_example("datasets.xlsx")
df <- read_excel(xlsx_example, col_types = "text")
df %>% 
  select(Sepal.Length, Petal.Length) %>% 
  mutate(Sepal.Length = as.numeric(Sepal.Length))
#> # A tibble: 150 x 2
#>    Sepal.Length Petal.Length
#>           <dbl>        <chr>
#>  1          5.1          1.4
#>  2          4.9          1.4
#>  3          4.7          1.3
#>  4          4.6          1.5
#>  5          5.0          1.4
#>  6          5.4          1.7
#>  7          4.6          1.4
#>  8          5.0          1.5
#>  9          4.4          1.4
#> 10          4.9          1.5
#> # ... with 140 more rows
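For the date columns mentioned in the question, a date cell read as text typically arrives as an Excel serial number stored in a string. The sketch below converts such values in base R. It assumes the workbook uses the 1900 date system, for which the usual origin is "1899-12-30"; the serials vector is made-up sample input, not data from the asker's file.

```r
# Made-up serial numbers, as they might appear after reading dates as text.
serials <- c("43101", "43132")

# Convert text to numeric, then count days from the Excel 1900-system origin.
dates <- as.Date(as.numeric(serials), origin = "1899-12-30")

print(dates)
```

In a pipeline like the one above, the same conversion could go inside mutate() for each date column.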
