Question

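Working in a data frame, I want to change the values in one column based on the values in another column. Here is my reproducible code: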

# four items
items <- c("coke", "tea", "shampoo","aspirin")

# scores for each item
score <- as.numeric(c(65,30,45,20))

# making a data frame of the two vectors created
df <- as.data.frame(cbind(items,score))

# score for coke is 65 and for tea it is 30.  I want to
# double score for tea OR coke if the score is below 50

ifelse(df$score[df$items %in% c("coke", "tea")] < 50, df$score*2, df$score)

#the above return NULL values with warning

#the statement df$score[df$items %in% c("coke", "tea")] does pull coke and tea scores

df$score[df$items %in% c("coke", "tea")]

Thank you very much for your help.

Answer 1

This should solve the problem:

items <- c("coke", "tea", "shampoo","aspirin")

# scores for each item
score <- as.numeric(c(65,30,45,20))

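Try using data.frame() instead of as.data.frame(cbind(...)). cbind() coerces both vectors into a single character matrix, so the scores stop being numeric (they end up as factors in R versions before 4.0.0).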

# making a data frame of the two vectors created
df <- data.frame(items, score)

df
    items score
1    coke    65
2     tea    30
3 shampoo    45
4 aspirin    20
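
To see why the constructor matters, the following sketch compares the column types produced by the two calls. The names df_bad and df_good are introduced here for illustration only; the exact class of the coerced column depends on the R version (factor before 4.0.0, character afterwards), so the comment only states what it is not.

```r
items <- c("coke", "tea", "shampoo", "aspirin")
score <- c(65, 30, 45, 20)

# cbind() builds a character matrix, so score is no longer numeric
df_bad <- as.data.frame(cbind(items, score))
class(df_bad$score)   # not "numeric"

# data.frame() keeps each vector's own type
df_good <- data.frame(items, score)
class(df_good$score)  # "numeric"
```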


# score for coke is 65 and for tea it is 30.  I want to
# double score for tea OR coke if the score is below 50

df$score[df$items %in% c("coke", "tea")] = ifelse(df$score[df$items %in% c("coke", "tea")] < 50, df$score*2, df$score)

df
    items score
1    coke    65
2     tea    60
3 shampoo    45
4 aspirin    20

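However, this method does not work if your items end up containing duplicate entries. ifelse() picks its results by position from the full-length df$score, not from the subset being tested, so it only gives the right answer here because coke and tea happen to be the first two rows.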

# New data with an added entry for item = coke and score = 15:
items <- c("coke", "tea", "shampoo","aspirin","coke")
# scores for each item
score <- c(65,30,45,20,15)

# making a data frame of the two vectors created
df <- data.frame(items, score)


# using the method from above, the last entry gets converted to a value of 90
# instead of 30
df$score[df$items %in% c("coke", "tea")] = ifelse(df$score[df$items %in% c("coke", "tea")] < 50, df$score*2, df$score)

df
    items score
1    coke    65
2     tea    60
3 shampoo    45
4 aspirin    20
5    coke    90

So if there is any chance of duplicate entries, use this method instead:

df <- data.frame(items, score)

df$score[df$items %in% c("coke", "tea") & df$score < 50] <- 2* df$score[df$items %in% c("coke", "tea") & df$score < 50]

df
    items score
1    coke    65
2     tea    60
3 shampoo    45
4 aspirin    20
5    coke    30
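
If you would rather keep ifelse(), one option is to evaluate the condition over the whole column, so that the test, yes and no arguments all have the same length. A minimal sketch on the same five-row data (the name target is introduced here for illustration and is not part of the original answer):

```r
items <- c("coke", "tea", "shampoo", "aspirin", "coke")
score <- c(65, 30, 45, 20, 15)
df <- data.frame(items, score)

# TRUE only for coke/tea rows whose score is below 50
target <- df$items %in% c("coke", "tea") & df$score < 50

# test, yes and no are all nrow(df) long, so positions line up
df$score <- ifelse(target, df$score * 2, df$score)

df$score
```

Rows that do not match the condition simply keep their original value through the no argument.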

Answer 2

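Your problem does not need an if statement. You can combine two logical statements.

Logical 1: df$items %in% c("coke", "tea")

Logical 2: df$score < 50

By filtering the data frame with both logical statements combined, you can multiply just the matching scores. AND is written &, and OR is written |.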

df$score[df$items %in% c("coke", "tea") & df$score < 50] <- 2* df$score[df$items %in% c("coke", "tea") & df$score < 50]
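
The choice between & and | matters here, because the question asks for rows that are coke or tea and also have a score below 50. A small sketch on the question's data shows which rows each combination selects (the names is_drink and is_low are introduced here for illustration):

```r
items <- c("coke", "tea", "shampoo", "aspirin")
score <- c(65, 30, 45, 20)
df <- data.frame(items, score)

is_drink <- df$items %in% c("coke", "tea")  # TRUE TRUE FALSE FALSE
is_low   <- df$score < 50                   # FALSE TRUE TRUE TRUE

df$items[is_drink & is_low]  # only tea satisfies both conditions
df$items[is_drink | is_low]  # all four rows satisfy at least one
```

With |, every row in this data would be doubled, so & is the operator that matches the stated goal.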

Answer 3

items <- c("coke", "tea", "shampoo","aspirin")
score <- as.numeric(c(65,30,45,20))

If you call data.frame() as follows, you avoid converting the score column to a factor.

df <- data.frame(items=items,score=score)

You don't need an if statement. You can simply extract the values you are interested in based on two logical statements:

df[df$score<50 & df$items %in% c("coke", "tea"), "score"] <- 2 * df[df$score<50 & df$items %in% c("coke", "tea"), "score"]

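df$score<50 & df$items %in% c("coke", "tea") selects the rows that meet both conditions, i.e. the item is coke or tea and the score is below 50.
"score" selects only the score column.
The expression on the right-hand side of <- extracts the same values and multiplies them by 2.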
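
Because the same condition appears on both sides of the assignment, it can be stored once and reused. A sketch of that variation (the name rows is introduced here and is not part of the original answer):

```r
items <- c("coke", "tea", "shampoo", "aspirin")
score <- c(65, 30, 45, 20)
df <- data.frame(items = items, score = score)

# logical vector marking the rows to update
rows <- df$score < 50 & df$items %in% c("coke", "tea")

# double the score only in those rows
df[rows, "score"] <- 2 * df[rows, "score"]

df
```

Computing the condition once also guarantees that the left and right sides refer to exactly the same rows.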

Answer 4

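The syntax of your ifelse statement is not quite right; it looks like you are trying to call it the way you would use IF in MS Excel. Unfortunately, it does not work that way.

I suggest you take an introduction to R course (many are available online for free), for example: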

https://campus.datacamp.com/courses/free-introduction-to-r/chapter-1-intro-to-basics-1?ex=1

As for your question, here is a solution (if I have understood your question correctly).

item <- c("coke", "tea", "shampoo", "aspirin")
score <- as.numeric(c(65, 30, 45, 20))

df <- data.frame(item, score)

for (i in seq_len(nrow(df))) {
  if (df$item[i] %in% c("coke", "tea") && df$score[i] < 50) {
    df$score[i] <- df$score[i] * 2
  }
}

View(df)

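Note that if you now view the updated data frame ("df"), only the score for the item "tea" has been doubled, because it is the only row that meets both criteria (i.e. item = coke OR tea; AND its associated score is below 50).

Hope this helps, and good luck.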
