How to convert all non-numeric columns to factor (no numeric columns)

Asked: 2019-03-29 14:50:51

Tags: select cluster-computing

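  1. I don't know how to convert columns using select() and mutate().

  2. I don't know how to tell which columns are numeric and which are non-numeric.

  3. I don't understand select(...) versus select(-...). Why would I use select(-...)?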
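On the third point, a minimal sketch may help. In dplyr, `select(cols)` keeps the named columns and `select(-cols)` keeps everything except them, so the two calls split a table into complementary pieces. The toy tibble and its column names below are hypothetical, not taken from the credit card file.

```r
library(dplyr)

# Hypothetical toy data, used only to illustrate the two forms of select()
toy <- tibble(
  id     = 1:3,
  grp    = c("a", "b", "a"),
  amount = c(10.5, 20, 7.25)
)

kept <- toy %>% select(grp)    # keeps only the grp column
rest <- toy %>% select(-grp)   # keeps every column except grp

names(kept)   # "grp"
names(rest)   # "id" "amount"
```

This is why a "divide, process & merge" approach uses both forms: one call pulls out the columns to convert, the other pulls out the remainder, and binding them back together restores every column exactly once.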

I am trying to work out which columns are non-numeric so that I can pass them to select().
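One way to check column types, sketched here on hypothetical toy data, is to apply `is.numeric()` to every column. The same predicate can also be handed to `mutate_if()` so that the conversion happens without a separate `select()` step. The names `toy`, `is_num`, `non_numeric_cols`, and `converted` are illustrative only.

```r
library(dplyr)

# Hypothetical toy data with mixed column types
toy <- tibble(
  id     = 1:3,
  grp    = c("a", "b", "a"),
  flag   = c(TRUE, FALSE, TRUE),
  amount = c(10.5, 20, 7.25)
)

# Named logical vector: TRUE where the column is numeric
is_num <- sapply(toy, is.numeric)

# Names of the columns that are not numeric
non_numeric_cols <- names(toy)[!is_num]
non_numeric_cols   # "grp" "flag"

# Convert every non-numeric column to factor in one step
converted <- toy %>% mutate_if(Negate(is.numeric), as.factor)
```

Note that this test looks only at the storage type. A column of numeric codes such as 1/2 counts as numeric even if it represents categories.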

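library(dplyr)     # select(), mutate_all(), glimpse()
library(magrittr)  # %>% and the tee operator %T>%

# prefix.dir and categorical are not shown here
data.raw <- paste0(prefix.dir, "UCI_Credit_Card.csv") %>% readr::read_csv()

if (categorical == "nominal") {

  # convert all non-numeric columns to factor (no numeric columns)
  ## Method 1: divide, process & merge
  colfac <- data.raw %>%
    select(...) %>%    # placeholder: the columns to convert
    mutate_all(as.factor) %T>% print

  dataset <- data.raw %>%
    select(-...) %>%   # placeholder: the same columns, dropped
    cbind(colfac)
}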

> data.raw %>% glimpse
Observations: 30,000
Variables: 24
$ LIMIT_BAL                  <dbl> 20000, 120000, 90000, 50000, 50000, 50000, 500...
$ SEX                        <dbl> 2, 2, 2, 2, 1, 1, 1, 2, 2, 1, 2, 2, 2, 1, 1, 2...
$ EDUCATION                  <dbl> 2, 2, 2, 2, 2, 1, 1, 2, 3, 3, 3, 1, 2, 2, 1, 3...
$ MARRIAGE                   <dbl> 1, 2, 2, 1, 1, 2, 2, 2, 1, 2, 2, 2, 2, 2, 2, 3...
$ AGE                        <dbl> 24, 26, 34, 37, 57, 37, 29, 23, 28, 35, 34, 51...
$ PAY_0                      <dbl> 2, -1, 0, 0, -1, 0, 0, 0, 0, -2, 0, -1, -1, 1,...
$ PAY_2                      <dbl> 2, 2, 0, 0, 0, 0, 0, -1, 0, -2, 0, -1, 0, 2, 0...
$ PAY_3                      <dbl> -1, 0, 0, 0, -1, 0, 0, -1, 2, -2, 2, -1, -1, 2...
$ PAY_4                      <dbl> -1, 0, 0, 0, 0, 0, 0, 0, 0, -2, 0, -1, -1, 0, ...
$ PAY_5                      <dbl> -2, 0, 0, 0, 0, 0, 0, 0, 0, -1, 0, -1, -1, 0, ...
$ PAY_6                      <dbl> -2, 2, 0, 0, 0, 0, 0, -1, 0, -1, -1, 2, -1, 2,...
$ BILL_AMT1                  <dbl> 3913, 2682, 29239, 46990, 8617, 64400, 367965,...
$ BILL_AMT2                  <dbl> 3102, 1725, 14027, 48233, 5670, 57069, 412023,...
$ BILL_AMT3                  <dbl> 689, 2682, 13559, 49291, 35835, 57608, 445007,...
$ BILL_AMT4                  <dbl> 0, 3272, 14331, 28314, 20940, 19394, 542653, 2...
$ BILL_AMT5                  <dbl> 0, 3455, 14948, 28959, 19146, 19619, 483003, -...
$ BILL_AMT6                  <dbl> 0, 3261, 15549, 29547, 19131, 20024, 473944, 5...
$ PAY_AMT1                   <dbl> 0, 0, 1518, 2000, 2000, 2500, 55000, 380, 3329...
$ PAY_AMT2                   <dbl> 689, 1000, 1500, 2019, 36681, 1815, 40000, 601...
$ PAY_AMT3                   <dbl> 0, 1000, 1000, 1200, 10000, 657, 38000, 0, 432...
$ PAY_AMT4                   <dbl> 0, 1000, 1000, 1100, 9000, 1000, 20239, 581, 1...
$ PAY_AMT5                   <dbl> 0, 0, 1000, 1069, 689, 1000, 13750, 1687, 1000...
$ PAY_AMT6                   <dbl> 0, 2000, 5000, 1000, 679, 800, 13770, 1542, 10...
$ default.payment.next.month <dbl> 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0...
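
The glimpse output above shows all 24 columns as `<dbl>`, so a type test such as `is.numeric()` would find no non-numeric columns to convert. A possible workaround, sketched below, is to name the coded columns explicitly. Which columns count as categorical is an assumption here (`SEX`, `EDUCATION`, `MARRIAGE` are picked purely for illustration), and `credit` is a small stand-in built from the first rows shown above rather than the real file.

```r
library(dplyr)

# Stand-in with a few of the columns and the first rows shown by glimpse()
credit <- tibble(
  LIMIT_BAL = c(20000, 120000, 90000),
  SEX       = c(2, 2, 2),
  EDUCATION = c(2, 2, 2),
  MARRIAGE  = c(1, 2, 2),
  AGE       = c(24, 26, 34)
)

# Every column is numeric, so nothing is detected as non-numeric
n_non_numeric <- sum(!sapply(credit, is.numeric))

# Assumption: these coded columns are the categorical ones
cat_cols <- c("SEX", "EDUCATION", "MARRIAGE")

# Method 1 from the question, with the placeholders filled in
colfac <- credit %>%
  select(one_of(cat_cols)) %>%
  mutate_all(as.factor)

dataset <- credit %>%
  select(-one_of(cat_cols)) %>%
  cbind(colfac)

# Single-step alternative that keeps the original column order
dataset2 <- credit %>% mutate_at(cat_cols, as.factor)
```

With Method 1 the factor columns end up at the right-hand side of `dataset`, while `mutate_at()` converts them in place. Newer dplyr versions also offer `mutate(across(...))` for the same purpose.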

0 answers:

No answers yet