How do I import a character column as a factor?

Time: 2018-09-27 17:41:49

Tags: r rstudio data-cleaning

I have the following dataset in a .txt file:

1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,Action
0,0,0,2,0,0,0,2,0,0,0,0,0,0,0,0,Up
2,0,0,0,2,0,0,0,0,0,0,2,0,0,0,0,Left
4,0,0,2,0,0,0,0,0,2,0,0,0,0,0,0,Left
4,2,0,2,0,2,0,0,0,0,0,0,0,0,0,0,Up
4,4,0,0,2,0,0,0,0,0,0,0,0,0,0,2,Up
8,0,0,0,2,0,0,0,2,0,0,0,2,0,0,0,Left

When I load the dataset into RStudio, I want the last column, Action, to be imported as a factor. Instead, it is read as character.

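I can force the import dialog to treat it as a factor, but then it prompts me with:

"Please insert a comma separated list of factors" (you can see it in the image below). I don't understand what it wants. Should I enter the levels of the factor, or a list of values that will replace the entire Action column?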

How can I import the dataset so that the last column is a factor?

enter image description here

Running str(dt) gives me:

Classes ‘tbl_df’, ‘tbl’ and 'data.frame':   2979 obs. of  17 variables:
 $ 1     : int  0 2 4 4 4 8 8 8 8 8 ...
 $ 2     : int  0 0 0 2 4 0 0 0 2 4 ...
 $ 3     : int  0 0 0 0 0 0 0 0 0 2 ...
 $ 4     : int  2 0 2 2 0 0 0 2 2 0 ...
 $ 5     : int  0 2 0 0 2 2 4 4 4 4 ...
 $ 6     : int  0 0 0 2 0 0 0 0 0 2 ...
 $ 7     : int  0 0 0 0 0 0 0 0 2 0 ...
 $ 8     : int  2 0 0 0 0 0 0 0 0 0 ...
 $ 9     : int  0 0 0 0 0 2 2 2 2 2 ...
 $ 10    : int  0 0 2 0 0 0 0 2 0 0 ...
 $ 11    : int  0 0 0 0 0 0 0 0 0 0 ...
 $ 12    : int  0 2 0 0 0 0 0 0 0 0 ...
 $ 13    : int  0 0 0 0 0 2 0 0 0 0 ...
 $ 14    : int  0 0 0 0 0 0 0 0 0 0 ...
 $ 15    : int  0 0 0 0 0 0 0 0 0 0 ...
 $ 16    : int  0 0 0 0 2 0 2 0 0 0 ...
 $ Action: chr  "Up" "Left" "Left" "Up" ...
 - attr(*, "spec")=List of 2
  ..$ cols   :List of 17
  .. ..$ 1     : list()
  .. .. ..- attr(*, "class")= chr  "collector_integer" "collector"
  .. ..$ 2     : list()
  .. .. ..- attr(*, "class")= chr  "collector_integer" "collector"
  .. ..$ 3     : list()
  .. .. ..- attr(*, "class")= chr  "collector_integer" "collector"
  .. ..$ 4     : list()
  .. .. ..- attr(*, "class")= chr  "collector_integer" "collector"
  .. ..$ 5     : list()
  .. .. ..- attr(*, "class")= chr  "collector_integer" "collector"
  .. ..$ 6     : list()
  .. .. ..- attr(*, "class")= chr  "collector_integer" "collector"
  .. ..$ 7     : list()
  .. .. ..- attr(*, "class")= chr  "collector_integer" "collector"
  .. ..$ 8     : list()
  .. .. ..- attr(*, "class")= chr  "collector_integer" "collector"
  .. ..$ 9     : list()
  .. .. ..- attr(*, "class")= chr  "collector_integer" "collector"
  .. ..$ 10    : list()
  .. .. ..- attr(*, "class")= chr  "collector_integer" "collector"
  .. ..$ 11    : list()
  .. .. ..- attr(*, "class")= chr  "collector_integer" "collector"
  .. ..$ 12    : list()
  .. .. ..- attr(*, "class")= chr  "collector_integer" "collector"
  .. ..$ 13    : list()
  .. .. ..- attr(*, "class")= chr  "collector_integer" "collector"
  .. ..$ 14    : list()
  .. .. ..- attr(*, "class")= chr  "collector_integer" "collector"
  .. ..$ 15    : list()
  .. .. ..- attr(*, "class")= chr  "collector_integer" "collector"
  .. ..$ 16    : list()
  .. .. ..- attr(*, "class")= chr  "collector_integer" "collector"
  .. ..$ Action: list()
  .. .. ..- attr(*, "class")= chr  "collector_character" "collector"
  ..$ default: list()
  .. ..- attr(*, "class")= chr  "collector_guess" "collector"
  ..- attr(*, "class")= chr "col_spec"

2 Answers:

Answer 0 (score: 2)

We can specify col_types, either for specific columns or for all of them:

library(readr)
# `file` stands for the path to the .txt file; the argument is named col_types
read_csv(file, col_types = cols(Action = col_factor(levels = c("Up", "Left"))))
# A tibble: 6 x 17
#    `1`   `2`   `3`   `4`   `5`   `6`   `7`   `8`   `9`  `10`  `11`  `12`  `13`  `14`  `15`  `16` Action
#  <int> <int> <int> <int> <int> <int> <int> <int> <int> <int> <int> <int> <int> <int> <int> <int> <fct> 
#1     0     0     0     2     0     0     0     2     0     0     0     0     0     0     0     0 Up    
#2     2     0     0     0     2     0     0     0     0     0     0     2     0     0     0     0 Left  
#3     4     0     0     2     0     0     0     0     0     2     0     0     0     0     0     0 Left  
#4     4     2     0     2     0     2     0     0     0     0     0     0     0     0     0     0 Up    
#5     4     4     0     0     2     0     0     0     0     0     0     0     0     0     0     2 Up    
#6     8     0     0     0     2     0     0     0     2     0     0     0     2     0     0     0 Left  
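
On the levels question: the levels are the set of distinct values the column may take ("Up" and "Left" in the excerpt, plus any other actions that occur in the full file), not a replacement for the column's contents. The following is a minimal sketch, assuming a readr version in which col_factor() accepts levels = NULL and infers the levels from the data. The temporary file and the name dt are hypothetical stand-ins for the real file and data frame.

```r
library(readr)

# Hypothetical stand-in for the .txt file: the header plus the first three rows.
path <- tempfile(fileext = ".txt")
writeLines(c(
  "1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,Action",
  "0,0,0,2,0,0,0,2,0,0,0,0,0,0,0,0,Up",
  "2,0,0,0,2,0,0,0,0,0,0,2,0,0,0,0,Left",
  "4,0,0,2,0,0,0,0,0,2,0,0,0,0,0,0,Left"
), path)

# levels = NULL asks readr to take the levels from the values it finds,
# in order of first appearance, so they need not be typed out by hand.
dt <- read_csv(path, col_types = cols(Action = col_factor(levels = NULL)))

levels(dt$Action)  # the distinct values found in the column
class(dt$Action)   # "factor"
```

If the levels are listed explicitly instead, any value in the file that is not in the list is read as NA with a parsing warning, so the list should cover every action that occurs in the data.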

Answer 1 (score: 1)

With base R, this should also work: read.table("FileName.txt", sep = ",", header = TRUE, stringsAsFactors = TRUE)
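
A self-contained sketch of the base R route follows. It uses the text argument of read.csv in place of the real file, which is an assumption made for illustration, and the names csv_text, df and df2 are hypothetical. Since R 4.0.0 the default is stringsAsFactors = FALSE, so the argument has to be set explicitly. check.names = FALSE keeps the numeric column names as written instead of prefixing them with X.

```r
# Hypothetical stand-in for FileName.txt: the header plus the first three rows.
csv_text <- "1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,Action
0,0,0,2,0,0,0,2,0,0,0,0,0,0,0,0,Up
2,0,0,0,2,0,0,0,0,0,0,2,0,0,0,0,Left
4,0,0,2,0,0,0,0,0,2,0,0,0,0,0,0,Left"

# Read every string column as a factor at import time.
df <- read.csv(text = csv_text, stringsAsFactors = TRUE, check.names = FALSE)
class(df$Action)   # "factor"
levels(df$Action)  # base R sorts the levels alphabetically

# Alternative: read as character, then convert only the column of interest.
df2 <- read.csv(text = csv_text, stringsAsFactors = FALSE, check.names = FALSE)
df2$Action <- factor(df2$Action)
```

The second form also applies to a data frame that has already been imported by other means, since factor() only needs the character column.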