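How do I reclassify a dataframe column?

Asked: 2015-04-24 19:48:18

Tags: r dataframe multiple-columns

I am reclassifying the values in one data frame column and adding the results to another column. The script below tries to apply a reclassification function to a column and write the output to another column in the same data frame.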

a = c(1,2,3,4,5,6,7)

x = data.frame(a)

# Reclassify values in x$a
reclass = function(x){
  # 1 - Spruce/Fir          = 1
  # 2 - Lodgepole Pine      = 1
  # 3 - Ponderosa Pine      = 1
  # 4 - Cottonwood/Willow   = 0
  # 5 - Aspen               = 0
  # 6 - Douglas-fir         = 1
  # 7 - Krummholz           = 1
  if(x == 1) return(1)
  if(x == 2) return(1)
  if(x == 3) return(1)
  if(x == 4) return(0)
  if(x == 5) return(0)
  if(x == 6) return(1)
  if(x == 7) return(1)
}

# Add a new column
x$b = 0

# Apply function on new column
b = lapply(x$b, reclass(x$a))

Error message:

> b = lapply(x$b, reclass(x$a))
Error in match.fun(FUN) : 
  'reclass(x$a)' is not a function, character or symbol
In addition: Warning message:
In if (x == 1) return(1) :
  the condition has length > 1 and only the first element will be used

The expected output should look like this:

a = c(1,2,3,4,5,6,7)
b = c(1,1,1,0,0,1,1)
x = data.frame(a, b)

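I have read a seemingly similar question (Reclassify select columns in Data Table), although it appears to be about changing the actual class of the columns (e.g. numeric).

How can I take the values from one column of a data frame, apply my reclassification function, and write the results to a new column?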

2 answers:

Answer 0 (score: 6)

Here, I would simply do this:

coniferTypes <- c(1,2,3,6,7)
x$b <- as.integer(x$a %in% coniferTypes)
x
#   a b
# 1 1 1
# 2 2 1
# 3 3 1
# 4 4 0
# 5 5 0
# 6 6 1
# 7 7 1
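
This works because `%in%` is vectorized: it returns one TRUE/FALSE per element of `x$a`, and `as.integer()` converts those to 1/0. If the mapping ever needed more than two output classes, one possible extension is a lookup vector indexed by the original code. The sketch below is only an illustration under that assumption; `lookup` is a hypothetical name that does not appear in the question, and the approach assumes `a` contains only the integer codes 1 to 7.

```r
# Rebuild the example data from the question
x <- data.frame(a = c(1, 2, 3, 4, 5, 6, 7))

# Hypothetical lookup vector: position i holds the new class for cover type i
# (1 = Spruce/Fir ... 7 = Krummholz, as listed in the question)
lookup <- c(1, 1, 1, 0, 0, 1, 1)

# Index the lookup by the values in x$a; this is vectorized, so no loop is needed
x$b <- lookup[x$a]
```

Changing an entry of `lookup` is then enough to change the mapping for that cover type.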

Answer 1 (score: 3)

You can do this:

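library(dplyr)
# Add column b: 1 for cover types 1, 2, 3, 6, 7 and 0 otherwise;
# assign the result back so the new column is kept
x <- mutate(x, b = ifelse(a %in% c(1, 2, 3, 6, 7), 1, 0))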
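
For reference, the original `lapply(x$b, reclass(x$a))` call fails for two reasons. First, `reclass(x$a)` is evaluated immediately, so `lapply` receives its return value instead of a function. Second, `if` expects a single value, while `x$a` has length 7. If you want to keep a scalar function like the one in the question, a minimal sketch of a fix is to pass the function itself and iterate over `x$a`. The `reclass` below is a condensed rewrite, and the `NA` fallback for unknown codes is an assumption that is not in the original.

```r
# Rebuild the example data from the question
x <- data.frame(a = c(1, 2, 3, 4, 5, 6, 7))

# Condensed version of the question's reclass(); it handles ONE value at a time
reclass <- function(v) {
  if (v %in% c(1, 2, 3, 6, 7)) return(1)
  if (v %in% c(4, 5)) return(0)
  NA_real_  # assumed fallback for codes outside 1..7
}

# Pass the function itself (no parentheses) and iterate over the source column;
# vapply() checks that every call returns exactly one numeric value
x$b <- vapply(x$a, reclass, numeric(1))
```

The vectorized `%in%` versions in the answers above are shorter and avoid the element-by-element loop, so this form is mainly useful when the per-value logic cannot easily be vectorized.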