Replacing values in a column of an R data frame

Date: 2014-12-29 07:03:32

Tags: r if-statement

I want to replace certain values in an R data frame (data1) as part of a data-cleaning task.

The data frame data1 has n columns. In one of them, Article_Description, I want to apply the replacements below. How can this be done in R?

if data1$Article_Description in ('snova glide 4m','SNOVA Glide 4M','SNova Glide 4 M') then data1$Article_Description='SNOVA Glide 4M';
if data1$Article_Description in ('aSTAR Ride 4M','astar ride 4m') then data1$Article_Description='astar ride 4m';
if data1$Article_Description in ('CC Fresh M','cc fresh m') then data1$Article_Description='CC Fresh M';
if data1$Article_Description in ('cc ride m','CC Ride M') then data1$Article_Description='CC Ride M';
if data1$Article_Description in ('astar solution 2m','aSTAR Solution 2M') then data1$Article_Description='astar solution 2m';
if data1$Article_Description in ('astar salvation 3m','aSTAR Salvation 3M') then data1$Article_Description='astar salvation 3m';
if data1$Article_Description in ('cc chill m','CC Chill M') then data1$Article_Description='CC Chill M';

2 answers:

Answer 0 (score: 0)

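Two problems: 1) you need to use %in% rather than in, and 2) the if function is not vectorized, so passing it a whole vector will not give you a useful result. Use ifelse, or use logical indexing with `[<-`.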

I'll do the first couple, since the pattern should then be clear (and I bore easily):

# Select the rows whose description is one of the variants, then overwrite them
data1[ data1$Article_Description %in% c('snova glide 4m','SNOVA Glide 4M','SNova Glide 4 M'), 
      "Article_Description"] <- 'SNOVA Glide 4M'

data1[ data1$Article_Description %in% c('aSTAR Ride 4M','astar ride 4m'), 
      "Article_Description"] <- 'astar ride 4m'

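The ifelse route mentioned above is not shown in this answer, so here is a minimal sketch of it for the same two groups. The toy data frame is an assumption made up for illustration; only the column name and the variant spellings come from the question.

```r
# Toy data frame standing in for data1 (illustrative values only)
data1 <- data.frame(
  Article_Description = c("snova glide 4m", "SNova Glide 4 M", "aSTAR Ride 4M",
                          "astar ride 4m", "CC Fresh M"),
  stringsAsFactors = FALSE
)

x <- data1$Article_Description

# ifelse() is vectorized: it tests every element at once. Elements matching
# the first group get its canonical spelling, the rest fall through to the
# second test, and anything unmatched is returned unchanged.
data1$Article_Description <- ifelse(
  x %in% c("snova glide 4m", "SNOVA Glide 4M", "SNova Glide 4 M"), "SNOVA Glide 4M",
  ifelse(x %in% c("aSTAR Ride 4M", "astar ride 4m"), "astar ride 4m", x)
)

print(data1$Article_Description)
```

Nesting ifelse calls gets hard to read beyond a few groups, which is one reason the logical-indexing form may be preferable for all seven replacements.
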
Answer 1 (score: 0)

You could try this:

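# Normalize: lowercase, then drop the space between a digit and a letter ("4 m" -> "4m")
v1 <- sub('(?<=\\d) (?=[a-z])', '', tolower(data1[,1]), perl=TRUE)
# Sorted unique normalized values, one per product
lvls <- levels(factor(v1))

# Named lookup vector: the first three levels (astar ...) stay lowercase,
# the remaining four get their canonical spelling; index it by v1
data1$NewArticle_Description <- setNames(c(lvls[1:3], 'CC Chill M', 
   'CC Fresh M', 'CC Ride M', 'SNOVA Glide 4M') ,lvls)[v1]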

 head(data1)
 #  Article_Description        Val NewArticle_Description
 #1          cc fresh m  0.1528656              CC Fresh M
 #2   aSTAR Solution 2M  0.4666355       astar solution 2m
 #3     SNova Glide 4 M -1.3486217          SNOVA Glide 4M
 #4          cc chill m -0.3713309              CC Chill M
 #5      SNOVA Glide 4M  2.0481950          SNOVA Glide 4M
 #6          CC Chill M -1.0303537              CC Chill M
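
To make the mechanism easier to follow, here is a small sketch of the same normalize-then-look-up idea on a four-element vector. The names x, key, lookup and cleaned are hypothetical, and the lookup table is written out by hand rather than built from levels(), which is a deliberate simplification and not what the answer's code does.

```r
# Toy input (variant spellings taken from the question)
x <- c("cc fresh m", "SNova Glide 4 M", "aSTAR Solution 2M", "CC Chill M")

# Step 1: normalize. Lowercase, then drop the space between a digit and a
# following letter, so "SNova Glide 4 M" becomes "snova glide 4m".
key <- sub("(?<=\\d) (?=[a-z])", "", tolower(x), perl = TRUE)

# Step 2: a named character vector acts as a lookup table.
# Names are the normalized keys, values are the canonical spellings.
lookup <- c(
  "cc fresh m"        = "CC Fresh M",
  "snova glide 4m"    = "SNOVA Glide 4M",
  "astar solution 2m" = "astar solution 2m",
  "cc chill m"        = "CC Chill M"
)

# Step 3: indexing by name returns one canonical value per input element.
cleaned <- unname(lookup[key])
print(cleaned)
```

Writing the keys explicitly avoids depending on the sorted order of the factor levels, at the cost of a little more typing.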

Data

set.seed(25)
data1 <- data.frame(Article_Description= sample(c('snova glide 4m',
'SNOVA Glide 4M','SNova Glide 4 M', 'aSTAR Ride 4M','astar ride 4m', 
'CC Fresh M','cc fresh m','cc ride m','CC Ride M',  'astar solution 2m',
'aSTAR Solution 2M', 'astar salvation 3m','aSTAR Salvation 3M', 'cc chill m',
'CC Chill M'), 100, replace=TRUE), Val=rnorm(100), stringsAsFactors=FALSE)