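I'm working with a data frame and want to modify the values in one column based on the values in another column. Here is my reproducible code: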
# four items
items <- c("coke", "tea", "shampoo","aspirin")
# scores for each item
score <- as.numeric(c(65,30,45,20))
# making a data frame of the two vectors created
df <- as.data.frame(cbind(items,score))
# score for coke is 65 and for tea it is 30. I want to
# double score for tea OR coke if the score is below 50
ifelse(df$score[df$items %in% c("coke", "tea")] < 50, df$score*2, df$score)
# the above returns NULL values with a warning
# the expression df$score[df$items %in% c("coke", "tea")] does pull the coke and tea scores
df$score[df$items %in% c("coke", "tea")]
Thanks very much for your help.
Answer 0 (score: 1)
This should solve the problem:
items <- c("coke", "tea", "shampoo","aspirin")
# scores for each item
score <- as.numeric(c(65,30,45,20))
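Build the data frame with data.frame(items, score) instead of as.data.frame(cbind(items, score)). cbind() first coerces both vectors into a single character matrix, so score ends up as a factor (or, from R 4.0.0 on, a character column) rather than a number.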
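The difference can be checked directly. The following is a minimal sketch on the same sample data; the names m, via_matrix and direct are introduced here only for illustration.

```r
items <- c("coke", "tea", "shampoo", "aspirin")
score <- c(65, 30, 45, 20)

# cbind() builds one matrix, and a matrix holds a single type,
# so the numbers are coerced to strings
m <- cbind(items, score)
typeof(m)                      # "character"

# converting that matrix keeps the non-numeric column
via_matrix <- as.data.frame(m)
is.numeric(via_matrix$score)   # FALSE

# data.frame() keeps each vector's own type
direct <- data.frame(items, score)
is.numeric(direct$score)       # TRUE
```

Calling str(df) on your own data frame is a quick way to spot this kind of silent type change.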
# making a data frame of the two vectors created
df <- data.frame(items, score)
df
    items score
1    coke    65
2     tea    30
3 shampoo    45
4 aspirin    20
# score for coke is 65 and for tea it is 30. I want to
# double score for tea OR coke if the score is below 50
df$score[df$items %in% c("coke", "tea")] = ifelse(df$score[df$items %in% c("coke", "tea")] < 50, df$score*2, df$score)
df
    items score
1    coke    65
2     tea    60
3 shampoo    45
4 aspirin    20
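Note that this only works here because coke and tea happen to be the first two rows: the test inside ifelse() covers just the selected rows, while df$score*2 and df$score cover the whole column, so the positions line up by coincidence. The method breaks, for example, once your items contain duplicate entries.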
# New data with an added entry for item = coke and score = 15:
items <- c("coke", "tea", "shampoo","aspirin","coke")
# scores for each item
score <- c(65,30,45,20,15)
# making a data frame of the two vectors created
df <- data.frame(items, score)
# using the method from above, the last entry gets converted to a value of 90
# instead of 30
df$score[df$items %in% c("coke", "tea")] = ifelse(df$score[df$items %in% c("coke", "tea")] < 50, df$score*2, df$score)
df
    items score
1    coke    65
2     tea    60
3 shampoo    45
4 aspirin    20
5    coke    90
So if there is any chance of duplicate entries, use this approach instead:
df <- data.frame(items, score)
df$score[df$items %in% c("coke", "tea") & df$score < 50] <- 2* df$score[df$items %in% c("coke", "tea") & df$score < 50]
df
    items score
1    coke    65
2     tea    60
3 shampoo    45
4 aspirin    20
5    coke    30
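For comparison, ifelse() can also be kept, provided the test and both value vectors all span the whole column so that they line up row by row. This is a minimal sketch on the data with the duplicate entry; the name target is introduced here for illustration.

```r
items <- c("coke", "tea", "shampoo", "aspirin", "coke")
score <- c(65, 30, 45, 20, 15)
df <- data.frame(items, score)

# The test has one element per row, just like df$score,
# so every position refers to the same row in all three arguments
target <- df$items %in% c("coke", "tea") & df$score < 50
df$score <- ifelse(target, df$score * 2, df$score)
df$score   # 65 60 45 20 30
```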
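Answer 1 (score: 0)
You don't need an if statement for this. You can combine two logical conditions.
Condition 1: df$items %in% c("coke", "tea")
Condition 2: df$score < 50
By filtering the data frame on these two conditions, you can multiply the matching scores. AND is &, OR is |.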
df$score[df$items %in% c("coke", "tea") & df$score < 50] <- 2 * df$score[df$items %in% c("coke", "tea") & df$score < 50]
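A quick way to see how & and | differ is to look at which rows each combination selects on the sample data. This is only a sketch; the names is_drink and is_low are introduced here for illustration.

```r
df <- data.frame(items = c("coke", "tea", "shampoo", "aspirin"),
                 score = c(65, 30, 45, 20),
                 stringsAsFactors = FALSE)

is_drink <- df$items %in% c("coke", "tea")   # TRUE TRUE FALSE FALSE
is_low   <- df$score < 50                    # FALSE TRUE TRUE TRUE

# & keeps rows where both conditions hold: only tea
and_items <- df$items[is_drink & is_low]

# | keeps rows where at least one holds: here, every row
or_items <- df$items[is_drink | is_low]
```

Since the question asks to double the score only for coke or tea rows that are also below 50, the & combination is the one that matches it.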
Answer 2 (score: 0)
items <- c("coke", "tea", "shampoo","aspirin")
score <- as.numeric(c(65,30,45,20))
Calling data.frame() as follows avoids converting the score column to a factor.
df <- data.frame(items=items,score=score)
You don't need an if statement. You can simply extract the values you are interested in using two logical conditions:
df[df$score<50 & df$items %in% c("coke", "tea"), "score"] <- 2 * df[df$score<50 & df$items %in% c("coke", "tea"), "score"]
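df$score<50 & df$items %in% c("coke", "tea") selects the rows that meet both conditions, i.e. the item is coke or tea and the score is below 50.
"score" selects only the score column.
The expression on the right-hand side of <- extracts the same values and multiplies them by 2.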
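Because the same condition appears on both sides of the assignment, it can be stored once and reused. This is a sketch of that variation; the name rows is an assumption made for illustration.

```r
df <- data.frame(items = c("coke", "tea", "shampoo", "aspirin"),
                 score = c(65, 30, 45, 20))

# Compute the row selection once, before any score changes
rows <- df$score < 50 & df$items %in% c("coke", "tea")

# Use the same selection on both sides of the assignment
df[rows, "score"] <- 2 * df[rows, "score"]
df$score   # 65 60 45 20
```

Storing the selection avoids typing the condition twice and keeps the two sides from drifting apart if it is edited later.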
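Answer 3 (score: 0)
The syntax of your if statement isn't quite right; it looks like you are trying to call it the way you would in MS Excel. Unfortunately, it doesn't work that way.
I suggest taking an introductory R course (many are available online for free), for example:
https://campus.datacamp.com/courses/free-introduction-to-r/chapter-1-intro-to-basics-1?ex=1
As for your question, here is a solution (if I have understood the question correctly).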
item <- c("coke", "tea", "shampoo", "aspirin")
score <- as.numeric(c(65, 30, 45, 20))
df <- data.frame(item, score)
# loop over the rows and double the score where both conditions hold
for (i in seq_len(nrow(df))) {
  if ((df$item[i] == "coke" | df$item[i] == "tea") & df$score[i] < 50) {
    df$score[i] <- df$score[i] * 2
  }
}
View(df)
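Note that if you now view the updated data frame ("df"), only the score for the item "tea" has been doubled, because it is the only row that meets both criteria (i.e. item = coke OR tea, AND its score is below 50).
Hope this helps, and good luck.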