Replace values in a data frame column

Time: 2014-09-18 13:22:20

Tags: r dataframe vectorization

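Given a large data frame with a column that takes the following distinct values

(ONE, TWO, THREE, FOUR, FIVE, SIX, SEVEN, EIGHT)

I want to replace some of the values. For example, every occurrence of 'ONE' should be replaced by '1', and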

'FOUR' -> '2SQUARED'
'FIVE' -> '5'
'EIGHT' -> '2CUBED'

All other values should remain unchanged.

IF/ELSE would run forever. How can I apply a vectorized solution? Is match() a reliable way to do this?

3 answers:

Answer 0: (score: 0)

Try this with base R:

data = structure(list(vals = structure(c(4L, 8L, 7L, 3L, 2L, 6L, 5L, 
1L), .Label = c("EIGHT", "FIVE", "FOUR", "ONE", "SEVEN", "SIX", 
"THREE", "TWO"), class = "factor")), .Names = "vals", class = "data.frame", row.names = c(NA, 
-8L))

initial = c('ONE', 'FOUR', 'FIVE', 'EIGHT')
final = c('1','2SQUARED', '5', '2CUBED')

myfn = function(ddf, init, fin){
    refdf = data.frame(init,fin)
    ddf$new = refdf[match(ddf$vals, init), 'fin']
    ddf$new = as.character(ddf$new)
    ndx = which(is.na(ddf$new))
    ddf$new[ndx]= as.character(ddf$vals[ndx])
    ddf
}

myfn(data, initial, final)

   vals      new
1   ONE        1
2   TWO      TWO
3 THREE    THREE
4  FOUR 2SQUARED
5  FIVE        5
6   SIX      SIX
7 SEVEN    SEVEN
8 EIGHT   2CUBED
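A more compact variant of the same match() idea, as a rough sketch only: it assumes the column is stored as character rather than factor, and the `lookup`, `idx` and `hit` names are made up for illustration. It overwrites the column in place instead of adding a `new` column.

```r
# Sample column with the same values as above, stored as character (an assumption).
data <- data.frame(
  vals = c("ONE", "TWO", "THREE", "FOUR", "FIVE", "SIX", "SEVEN", "EIGHT"),
  stringsAsFactors = FALSE
)

# Hypothetical lookup table: names are the old values, elements the replacements.
lookup <- c(ONE = "1", FOUR = "2SQUARED", FIVE = "5", EIGHT = "2CUBED")

# For each row, match() gives the position in names(lookup), or NA if absent.
idx <- match(data$vals, names(lookup))

# Overwrite only the rows that matched; all other rows keep their value.
hit <- !is.na(idx)
data$vals[hit] <- unname(lookup[idx[hit]])

data$vals
```

Because match() is evaluated once over the whole column, no per-row if/else is needed, and values missing from the lookup simply come back as NA and are skipped.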

Answer 1: (score: 0)

Using @rnso's dataset

library(plyr)
transform(data, vals = mapvalues(vals, 
          c('ONE', 'FOUR', 'FIVE', 'EIGHT'),
          c('1','2SQUARED', '5', '2CUBED'))) 
#       vals
# 1        1
# 2      TWO
# 3    THREE
# 4 2SQUARED
# 5        5
# 6      SIX
# 7    SEVEN
# 8   2CUBED

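Answer 2: (score: 0)

Your column is probably a factor. Give this a try. Using rnso's data, I'd suggest first creating two vectors, one with the values to change and one with the values to change them to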

from <- c("FOUR", "FIVE", "EIGHT")
to <- c("2SQUARED", "5", "2CUBED")

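Then replace the factor levels with

# assign to data$vals directly: an assignment inside with() only changes a temporary copy
levels(data$vals)[match(from, levels(data$vals))] <- to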

This gives

data
#       vals
# 1      ONE
# 2      TWO
# 3    THREE
# 4 2SQUARED
# 5        5
# 6      SIX
# 7    SEVEN
# 8   2CUBED
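For what it's worth, here is a sketch of the same level-renaming idea covering all four replacements from the question, with a guard for values that are not levels of the factor. The data frame is rebuilt here so the snippet runs on its own, and the `pos` and `found` names and the extra "NINE" entry are only illustrative assumptions.

```r
# Rebuild the sample column as a factor (same values as rnso's data).
data <- data.frame(
  vals = factor(c("ONE", "TWO", "THREE", "FOUR", "FIVE", "SIX", "SEVEN", "EIGHT"))
)

from <- c("ONE", "FOUR", "FIVE", "EIGHT", "NINE")  # "NINE" is deliberately not a level
to   <- c("1", "2SQUARED", "5", "2CUBED", "9")

# Position of each old value among the factor's levels (NA when it is not a level).
pos <- match(from, levels(data$vals))
found <- !is.na(pos)

# Rename only the levels that exist; the per-row integer codes are left as they are.
levels(data$vals)[pos[found]] <- to[found]

as.character(data$vals)
```

Dropping the NA positions before assigning avoids an error when `from` contains a value that never occurs in the column.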