Question

Can we do this without dplyr? I want to select the rows whose row mean is greater than the overall mean of the data frame.

I tried the following call, but it does not work.

tf12 <- apply(tf11, 2, function(x) filter(rowMeans(x) > mean(x)))

It produces the following error.

Error in rowMeans(x) : 'x' must be an array of at least two dimensions

Answer 1

We can unlist the data frame to compute the mean of all its values, and then compare that with rowMeans:

tf11[rowMeans(tf11) > mean(unlist(tf11)), ]

If the data frame contains NA values, use na.rm = TRUE in both mean and rowMeans.
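
As a minimal sketch of the NA case, assuming a small made-up data frame (df_na and its values are hypothetical and not from the question), the same subsetting with na.rm = TRUE could look like this:

```r
# Hypothetical data frame with one missing value, for illustration only
df_na <- data.frame(a = c(1, 2, NA, 4), b = c(5, 6, 7, 8))

# Overall mean of all non-missing values
overall <- mean(unlist(df_na), na.rm = TRUE)

# Per-row means, skipping missing values within each row
row_avg <- rowMeans(df_na, na.rm = TRUE)

# Keep the rows whose mean exceeds the overall mean (rows 3 and 4 here)
result <- df_na[row_avg > overall, ]
```

Without na.rm = TRUE, mean(unlist(df_na)) is NA, so the comparison yields NA for every row instead of TRUE or FALSE.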

Consider an example:

df <- data.frame(a = 1:10, b = 11:20)
df[rowMeans(df) > mean(unlist(df)), ]

#    a  b
#6   6 16
#7   7 17
#8   8 18
#9   9 19
#10 10 20
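
As for the error in the question, a likely explanation is that apply with MARGIN = 2 passes each column to the function as a plain vector, which has no dimensions for rowMeans to work on. The sketch below reuses the example df to illustrate this; has_no_dim and keep are names introduced here for illustration only.

```r
df <- data.frame(a = 1:10, b = 11:20)

# With MARGIN = 2, apply() hands each column to the function as a vector,
# so dim(x) is NULL inside the function and rowMeans(x) cannot be used there
has_no_dim <- apply(df, 2, function(x) is.null(dim(x)))

# rowMeans() already works on the whole data frame, so apply() is not needed
keep <- rowMeans(df) > mean(unlist(df))
df[keep, ]
```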