How to recode only selected columns in R

Time: 2017-03-13 15:16:54

Tags: r dataframe

I have a data frame with the following column names and values:

|     ss1         |     ss2      |      ss3        |          
|Strongly Agree   |Disagree      |Agree            |
|Agree            |Agree         |Disagree         |
|Strongly Disagree|Agree         |Disagree         |
|Disagree         |Strongly Agree|Strongly Disagree|

I am looking for a way to recode only columns ss1 and ss3, in this way:

Strongly Agree - 1
Agree - 2
Disagree - 3
Strongly Disagree - 4

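Column ss2, however, should be recoded in reverse, meaning Strongly Disagree - 1, Disagree - 2, Agree - 3 and Strongly Agree - 4.

So far I have tried the following code: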

If((names(df=="ss1")) |(names(df=="ss3"))) {
   lapply(df, 
     FUN = function(x) recode(x, 
        "'Strongly Disagree'=4; 
         'Disagree'=3; 
         'Agree'=2; 
         'Strongly Agree'=1; 
         'No Opinion'=''"))}

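I know that the statement as written can only recode all columns. Is there a way to restrict the recoding to the columns whose names match the IF expression?

Is there also a way to use a logical 'OR' in my IF condition?

The reason I want to keep the IF condition is that I want to match the column names first and then give the recode rules.

The output should look like this: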

|     ss1         |     ss2      |      ss3        |          
|1                |2             |2                |
|2                |3             |3                |
|4                |3             |3                |
|3                |4             |4                |

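I apologize if the question is a bit unclear.

2 Answers:

Answer 0: (score: 2)

Here is how to do it with dplyr. To recode columns, use mutate_at together with recode (as shown). You need two separate mutate_at calls because ss1 and ss3 use one ordering and ss2 uses the reverse.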

library(dplyr)
df1  <- read.table(text="ss1              ss2            ss3
'Strongly Agree'   Disagree      Agree
Agree            Agree         Disagree
'Strongly Disagree' Agree         Disagree
Disagree         'Strongly Agree' 'Strongly Disagree'", header=TRUE, stringsAsFactors=FALSE)

df1 %>%
mutate_at(.cols= vars(ss1,ss3),
 .funs = funs(recode(., 'Strongly Disagree' = 4, 'Disagree' = 3, 'Agree' = 2,
 'Strongly Agree' = 1, .default = NA_real_)) ) %>%
mutate_at(.cols= vars(ss2),
 .funs = funs(recode(., 'Strongly Disagree' = 1, 'Disagree' = 2, 'Agree' = 3,
 'Strongly Agree' = 4, .default = NA_real_)) )
  ss1 ss2 ss3
1   1   2   2
2   2   3   3
3   4   3   3
4   3   4   4
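
In newer dplyr releases `funs()` is deprecated and `mutate_at()` is superseded by `across()`. The following is a minimal sketch of the same two-step recode, assuming dplyr 1.0.0 or later. It rebuilds the sample data with `data.frame()` so that it runs on its own, and `res` is just an illustrative name.

```r
library(dplyr)

# Sample data from the question
df1 <- data.frame(
  ss1 = c("Strongly Agree", "Agree", "Strongly Disagree", "Disagree"),
  ss2 = c("Disagree", "Agree", "Agree", "Strongly Agree"),
  ss3 = c("Agree", "Disagree", "Disagree", "Strongly Disagree"),
  stringsAsFactors = FALSE
)

res <- df1 %>%
  # ss1 and ss3: Strongly Agree = 1 ... Strongly Disagree = 4
  mutate(across(c(ss1, ss3),
                ~ recode(.x, "Strongly Agree" = 1, "Agree" = 2,
                         "Disagree" = 3, "Strongly Disagree" = 4,
                         .default = NA_real_))) %>%
  # ss2: reversed scale
  mutate(across(ss2,
                ~ recode(.x, "Strongly Disagree" = 1, "Disagree" = 2,
                         "Agree" = 3, "Strongly Agree" = 4,
                         .default = NA_real_)))

print(res)
```

Any value not listed in the mapping (for example 'No Opinion') falls through to `.default` and becomes NA.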

Answer 1: (score: 2)

A quick solution using data.table:

library(data.table)

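# function to reclassify columns
# (unmatched values become NA; returning "" here would coerce the whole result to character)
  myfun = function(x)  { ifelse(x=='Strongly Disagree', 4,
                       ifelse(x=='Disagree', 3,
                       ifelse(x=='Agree', 2,
                       ifelse(x=='Strongly Agree', 1, NA_real_)))) }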

# indicate which columns should be transformed
  cols <- c('ss1', 'ss3')

# Reclassify columns
  setDT(df1)[, (cols) := lapply (.SD, myfun), .SDcols=cols]
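
The snippet above handles only ss1 and ss3. As a sketch of how the reversed ss2 scale could be covered with the same `.SDcols` pattern, the example below swaps the nested `ifelse` for a named lookup vector and flips it with `5 - value`, which works only because the scale runs from 1 to 4. The names `dt`, `scale_map`, `fwd_cols` and `rev_cols` are introduced here for illustration and are not part of the original answer.

```r
library(data.table)

# Sample data from the question
dt <- data.table(
  ss1 = c("Strongly Agree", "Agree", "Strongly Disagree", "Disagree"),
  ss2 = c("Disagree", "Agree", "Agree", "Strongly Agree"),
  ss3 = c("Agree", "Disagree", "Disagree", "Strongly Disagree")
)

# Named lookup vector: label -> score (unknown labels yield NA)
scale_map <- c("Strongly Agree" = 1, "Agree" = 2,
               "Disagree" = 3, "Strongly Disagree" = 4)

fwd_cols <- c("ss1", "ss3")  # normal direction
rev_cols <- "ss2"            # reversed direction

# Recode in place, touching only the listed columns
dt[, (fwd_cols) := lapply(.SD, function(x) unname(scale_map[x])), .SDcols = fwd_cols]
dt[, (rev_cols) := lapply(.SD, function(x) 5 - unname(scale_map[x])), .SDcols = rev_cols]

print(dt)
```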

Or use a join, as suggested by @Frank:

library(data.table)
setDT(df1)

cols <- c('ss1', 'ss3')
recDT = data.table(
  old = c('Strongly Disagree', 'Disagree', 'Agree', 'Strongly Agree'), 
  new = 4:1)

for (col in cols) df1[recDT, on=setNames("old", col), paste0(col, "_new") := i.new]
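
On the `if` in the question: `if` tests a single condition, so it is not well suited to picking columns. A minimal base R sketch, assuming the same sample data, builds a logical vector over the column names with `|` and recodes only the selected columns. The names `df`, `scale_map` and `target` are illustrative.

```r
# Sample data from the question
df <- data.frame(
  ss1 = c("Strongly Agree", "Agree", "Strongly Disagree", "Disagree"),
  ss2 = c("Disagree", "Agree", "Agree", "Strongly Agree"),
  ss3 = c("Agree", "Disagree", "Disagree", "Strongly Disagree"),
  stringsAsFactors = FALSE
)

# Named lookup vector: label -> score (unknown labels yield NA)
scale_map <- c("Strongly Agree" = 1, "Agree" = 2,
               "Disagree" = 3, "Strongly Disagree" = 4)

# Logical OR over column names; same result as names(df) %in% c("ss1", "ss3")
target <- names(df) == "ss1" | names(df) == "ss3"

# Recode only the matched columns
df[target] <- lapply(df[target], function(x) unname(scale_map[x]))

# ss2 uses the reversed 1-to-4 scale
df["ss2"] <- lapply(df["ss2"], function(x) 5 - unname(scale_map[x]))

print(df)
```

Note that `|` is the vectorized OR needed here; `||` is meant for single values.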