I have the following data frame called "test", containing gene names and variances:
genes variance
RERE 0.27742
DLEC1 0.630556
RERE 0.45678
... ...
I would like to create a new data frame with the maximum variance for each gene:
genes variance
RERE 0.45678
DLEC1 0.630556
... ...
I tried:
aggregate(test$variance, by = list(test$genes), max)
But I get this error:
Error in Summary.factor(13308L, na.rm = FALSE): 'max' not meaningful for factors
I would really appreciate any suggestions. Thanks!
Answer 0: (score: 2)
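As pointed out in the comments, it seems that your "variance" column, which you expect to be numeric, is actually a factor.
You can convert it within the aggregate command itself, but it would probably be wiser to first figure out why the column was turned into a factor.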
aggregate(test$variance, by = list(test$genes), max)
# Error in Summary.factor(3L, na.rm = FALSE) :
# ‘max’ not meaningful for factors
aggregate(as.numeric(as.character(test$variance)),
by = list(test$genes), max)
# Group.1 x
# 1 DLEC1 0.630556
# 2 RERE 0.456780
# Sample data used in the example above:
test <- structure(
list(genes = c("RERE", "DLEC1", "RERE"),
variance = structure(
c(1L, 3L, 2L), .Label = c("0.27742", "0.45678", "0.630556"),
class = "factor")), .Names = c("genes", "variance"),
row.names = c(NA, -3L), class = "data.frame")
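To show why the conversion goes through as.character() rather than calling as.numeric() directly, here is a minimal sketch. It rebuilds the sample data with data.frame() (my own reconstruction, not the asker's real data) and uses the formula interface of aggregate() as an alternative to the by = list(...) call, so that the result keeps the original column names. One common cause of such a factor column, though I can't confirm it is what happened here, is reading the file with stringsAsFactors = TRUE (the default before R 4.0.0) when the column contains a stray non-numeric entry.

```r
# Assumed reconstruction of the sample data, with variance stored as a factor.
test <- data.frame(
  genes = c("RERE", "DLEC1", "RERE"),
  variance = factor(c("0.27742", "0.630556", "0.45678")),
  stringsAsFactors = FALSE
)

# Inspect the column types; variance shows up as a Factor, not num.
str(test)

# as.numeric() on a factor returns the internal level codes, not the values.
codes <- as.numeric(test$variance)

# Going through character first recovers the numbers that are printed.
values <- as.numeric(as.character(test$variance))

# Fix the column once, then aggregate with the formula interface.
test$variance <- values
result <- aggregate(variance ~ genes, data = test, FUN = max)
print(result)
```

Converting the column once up front keeps the aggregate() call simple, and any entries that cannot be parsed as numbers become NA with a warning, which helps locate whatever caused the column to be read as a factor.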