Question

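I have a data frame, and I want to keep the rows (fruits) where any of the price columns is greater than a given value.

Here is a reproducible example that you can copy and paste directly into R: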

fruit = c("apple","orange","banana","berry") #1st col
ID = c(123,3453,4563,3235) #2nd col
price1 = c(3,5,10,20) #3rd col
price2 = c(5,7,9,2) #4th col
price3 = c(4,1,11,8) #5th col

df = as.data.frame(cbind(fruit,ID,price1,price2,price3)) #combine into a dataframe

price_threshold = 10 #define a price

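I only want the fruits for which any price is greater than 10, which in this case are banana and berry.

My expected output is the following two rows: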

banana 4563 10  9  11
berry  3235 20  2   8

I tried something like this:

output = df[which(df[,3:5] > price_threshold),]

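But it did not work.

This is close to this post, but here I want to check any of the values in the last three columns, not just a single column.

Any suggestions?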

Answer 1

A readable one-line solution:

df[pmax(df$price1, df$price2, df$price3) > 10, ]
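
To see why this works, here is a minimal sketch, assuming df is built with data.frame() so that the price columns are numeric (the `row_max` name is only illustrative). pmax() returns the element-wise maximum of its arguments, which here is the largest price in each row, and a row has some price above the threshold exactly when its largest price is above it.

```r
# Assumption: numeric price columns (df built with data.frame(), not cbind())
df <- data.frame(
  fruit  = c("apple", "orange", "banana", "berry"),
  ID     = c(123, 3453, 4563, 3235),
  price1 = c(3, 5, 10, 20),
  price2 = c(5, 7, 9, 2),
  price3 = c(4, 1, 11, 8)
)

# Largest price in each row
row_max <- pmax(df$price1, df$price2, df$price3)
row_max
# [1]  5  7 11 20

# Keep the rows whose largest price exceeds the threshold
df[row_max > 10, ]
```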

Answer 2

First, it is better to initialize the data.frame as

df = data.frame(fruit,ID,price1,price2,price3)

so that the variables are not converted to factors. You can then get the expected result with:

df[rowSums(df[,3:5] > price_threshold)>0,]

Result:

   fruit   ID price1 price2 price3
3 banana 4563     10      9     11
4  berry 3235     20      2      8

Hope this helps!
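
The intermediate steps can be sketched as follows, assuming numeric price columns (the `above` and `n_above` names are only illustrative). Comparing several columns at once gives a logical matrix, and which() on a matrix returns linear (column-major) positions rather than row numbers, which is a second reason the attempt in the question cannot work as written. rowSums() instead counts the TRUE cells per row.

```r
# Assumption: numeric price columns (df built with data.frame(), not cbind())
df <- data.frame(
  fruit  = c("apple", "orange", "banana", "berry"),
  ID     = c(123, 3453, 4563, 3235),
  price1 = c(3, 5, 10, 20),
  price2 = c(5, 7, 9, 2),
  price3 = c(4, 1, 11, 8)
)
price_threshold <- 10

# Logical matrix: one cell per (row, price column)
above <- df[, 3:5] > price_threshold

# Linear positions in the 4 x 3 matrix, not row numbers of df
which(above)
# [1]  4 11

# Number of prices above the threshold in each row (TRUE counts as 1)
n_above <- rowSums(above)

# Keep rows with at least one price above the threshold
df[n_above > 0, ]
```

If the price columns may contain missing values, `rowSums(above, na.rm = TRUE) > 0` avoids NA in the row index.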

Answer 3

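All columns of the data frame df are factors, because cbind() first coerces everything into a character matrix; that is why the comparison does not work. There is no need to use cbind():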

fruit = c("apple","orange","banana","berry") #1st col
ID = c(123,3453,4563,3235) #2nd col
price1 = c(3,5,10,20) #3rd col
price2 = c(5,7,9,2) #4th col
price3 = c(4,1,11,8) #5th col

df <- data.frame(fruit, ID, price1, price2, price3)
df
#    fruit   ID price1 price2 price3
# 1  apple  123      3      5      4
# 2 orange 3453      5      7      1
# 3 banana 4563     10      9     11
# 4  berry 3235     20      2      8

df[which(df[,3] > price_threshold | df[,4] > price_threshold | df[,5] > price_threshold),]
#    fruit   ID price1 price2 price3
# 3 banana 4563     10      9     11
# 4  berry 3235     20      2      8
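
The coercion can be checked directly. This is a small sketch with illustrative names (`m`, `bad`, `good`, `recovered`); the columns of `bad` are factors in R versions before 4.0.0 and character vectors from 4.0.0 onward, and in neither case does comparing them with a number give a meaningful result.

```r
fruit  <- c("apple", "orange", "banana", "berry")
ID     <- c(123, 3453, 4563, 3235)
price1 <- c(3, 5, 10, 20)

# cbind() builds a matrix, and a matrix holds a single type,
# so mixing text and numbers turns everything into character
m <- cbind(fruit, ID, price1)
is.character(m)
# [1] TRUE

# Converting that matrix keeps the prices non-numeric
bad <- as.data.frame(m)
is.numeric(bad$price1)
# [1] FALSE

# data.frame() keeps each column's own type
good <- data.frame(fruit, ID, price1)
is.numeric(good$price1)
# [1] TRUE

# If the data frame already exists, the numbers can be recovered;
# as.character() first makes this safe for factors as well
recovered <- as.numeric(as.character(bad$price1))
```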

Answer 4

A programmatic solution inspired by Sam:

price_cols <- grep("^price", names(df), value = TRUE)
price_cols
# [1] "price1" "price2" "price3"

df[do.call(pmax, df[price_cols]) > price_threshold, ]
#    fruit   ID price1 price2 price3
# 3 banana 4563     10      9     11
# 4  berry 3235     20      2      8
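
The same idea can be wrapped in a small helper. This is only a sketch: `filter_any_above` is a hypothetical name, not an existing function, and it assumes the selected columns are numeric. do.call() passes each selected column to pmax() as a separate argument, so the number of price columns does not need to be known in advance.

```r
# Hypothetical helper: keep rows where any of `cols` exceeds `threshold`
filter_any_above <- function(data, cols, threshold) {
  # Row-wise maximum over the selected columns, ignoring missing values
  args <- c(unname(as.list(data[cols])), list(na.rm = TRUE))
  row_max <- do.call(pmax, args)
  # Rows where every selected value is NA have an NA maximum; drop them
  data[!is.na(row_max) & row_max > threshold, , drop = FALSE]
}

df <- data.frame(
  fruit  = c("apple", "orange", "banana", "berry"),
  ID     = c(123, 3453, 4563, 3235),
  price1 = c(3, 5, 10, 20),
  price2 = c(5, 7, 9, 2),
  price3 = c(4, 1, 11, 8)
)
price_cols <- grep("^price", names(df), value = TRUE)

result <- filter_any_above(df, price_cols, threshold = 10)
result
```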

Return the entire row if any value in a specific set of columns meets a certain condition
