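I have a problem replacing a value with another value when a condition is met. I use my own function, data_manip, so that I can assign or add further conditions whenever I need to.
However, when I use data_manip, it changes every value in the group to the specified value, even though the other values in that group do not meet the condition.
Here is what I tried: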
library(dplyr)

df <- data.frame(percent = c(0.6, 0.7, 1, 0.5, 0.5, 1, 0.4, 0.6, 1),
                 type = rep(c("good", "bad", "ugly"), each = 3),
                 smoke = rep(c("Visky", "Wine", "Wine"), 3),
                 sex = rep(c("male", "male", "female"), 3))
> df
percent type smoke sex
1 0.6 good Visky male
2 0.7 good Wine male
3 1.0 good Wine female
4 0.5 bad Visky male
5 0.5 bad Wine male
6 1.0 bad Wine female
7 0.4 ugly Visky male
8 0.6 ugly Wine male
9 1.0 ugly Wine female
data_manip <- function(x, gr) {
  if (grepl('goo|ug', gr) && x < 1) {
    x[x == 0.6] <- 1
  } else {
    x
  }
}

df %>%
  group_by(type) %>%
  mutate(percent_new = data_manip(percent, type))
which gives
# A tibble: 9 x 5
# Groups: type [3]
percent type smoke sex percent_new
<dbl> <fctr> <fctr> <fctr> <dbl>
1 0.6 good Visky male 1.0
2 0.7 good Wine male 1.0
3 1.0 good Wine female 1.0
4 0.5 bad Visky male 0.5
5 0.5 bad Wine male 0.5
6 1.0 bad Wine female 1.0
7 0.4 ugly Visky male 1.0
8 0.6 ugly Wine male 1.0
9 1.0 ugly Wine female 1.0
I want to keep the original percent value wherever the condition does not apply.
Expected output:
# A tibble: 9 x 5
# Groups: type [3]
percent type smoke sex percent_new
<dbl> <fctr> <fctr> <fctr> <dbl>
1 0.6 good Visky male 1.0
2 0.7 good Wine male 0.7
3 1.0 good Wine female 1.0
4 0.5 bad Visky male 0.5
5 0.5 bad Wine male 0.5
6 1.0 bad Wine female 1.0
7 0.4 ugly Visky male 0.4
8 0.6 ugly Wine male 1.0
9 1.0 ugly Wine female 1.0
Answer 0 (score: 2)
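Your current data_manip function is not vectorized: it uses if (cond) { ... } else { ... }, which tests a single value only. Given a vector, older versions of R use just the first element, and recent versions of R raise an error instead. A vectorized version of the function looks like this: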
data_manip <- function(x, gr) {
  ifelse(grepl('goo|ug', gr) & x == 0.6, 1, x)
}
and it gives the expected result:
> df%>%
+ group_by(type)%>%
+ mutate(percent_new=data_manip(percent,type))
# A tibble: 9 x 5
# Groups: type [3]
percent type smoke sex percent_new
<dbl> <fctr> <fctr> <fctr> <dbl>
1 0.6 good Visky male 1.0
2 0.7 good Wine male 0.7
3 1.0 good Wine female 1.0
4 0.5 bad Visky male 0.5
5 0.5 bad Wine male 0.5
6 1.0 bad Wine female 1.0
7 0.4 ugly Visky male 0.4
8 0.6 ugly Wine male 1.0
9 1.0 ugly Wine female 1.0
Use ifelse for vectorized conditional checks.
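There is a second reason the original function overwrote whole groups: an if block returns the value of its last expression, and an assignment such as x[x == 0.6] <- 1 evaluates to its right-hand side (1), not to the modified x. That single 1 is then recycled across the group. The sketch below uses base R only to show this, followed by an alternative fix based on logical indexing. The name data_manip_idx is hypothetical and not from the original post.

```r
# An assignment evaluates to its right-hand side, not to the modified object.
x <- c(0.6, 0.7, 1)
res <- (x[x == 0.6] <- 1)
res

# Hypothetical alternative to the ifelse() version: build an element-wise
# logical mask, replace only the matching elements, and return x explicitly.
data_manip_idx <- function(x, gr) {
  hit <- grepl("goo|ug", gr) & x == 0.6
  x[hit] <- 1
  x
}

# Same values as the percent and type columns in the question.
percent <- c(0.6, 0.7, 1, 0.5, 0.5, 1, 0.4, 0.6, 1)
type <- rep(c("good", "bad", "ugly"), each = 3)
out <- data_manip_idx(percent, type)
out
```

Because the mask is computed element by element, this version gives the same result with or without group_by(type).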
Answer 1 (score: 2)
This looks like a problem where case_when is useful.
Try this:
library(tidyverse)

df %>%
  mutate(new_percentage = case_when(
    type == "good" & percent == 0.6 ~ 1,
    type == "ugly" & percent == 0.6 ~ 1,
    TRUE ~ percent  # otherwise keep the original value
  ))
which gives:
# A tibble: 9 x 5
percent type smoke sex new_percentage
<dbl> <fctr> <fctr> <fctr> <dbl>
1 0.6 good Visky male 1.0
2 0.7 good Wine male 0.7
3 1.0 good Wine female 1.0
4 0.5 bad Visky male 0.5
5 0.5 bad Wine male 0.5
6 1.0 bad Wine female 1.0
7 0.4 ugly Visky male 0.4
8 0.6 ugly Wine male 1.0
9 1.0 ugly Wine female 1.0
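The two replacement branches differ only in the type value, so they could be merged with %in%. Below is a minimal sketch in base R, with the relevant columns of df rebuilt so the snippet runs on its own. The helper variable target is an illustrative name, not from the original post.

```r
# Rebuild the two columns that the condition depends on.
df <- data.frame(
  percent = c(0.6, 0.7, 1, 0.5, 0.5, 1, 0.4, 0.6, 1),
  type = rep(c("good", "bad", "ugly"), each = 3)
)

# TRUE only for rows whose type is "good" or "ugly" and whose percent is 0.6.
target <- df$type %in% c("good", "ugly") & df$percent == 0.6

# Replace the matching rows with 1 and keep the original value elsewhere.
df$new_percentage <- ifelse(target, 1, df$percent)
df
```

Adding another type later only means extending the vector passed to %in%.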