This seems like it shouldn't be too hard, but I'm struggling with it. Say, for example, I have the following data frame:
set.seed(99)
data <- data.frame(Names=rep(c('A','B'),5),
First = rnorm(10),
Second = rnorm(10),
Third = rnorm(10))
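What I want is to divide the whole data frame by the column means of the A rows. Those means can be computed with:
a.mean <- sapply(data[data$Names == 'A', 2:4], mean)
But when I try to divide the whole data frame by this vector, I don't get the right values: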
normalized.data <- data[2:4]/a.mean
normalized.data$Names <- data$Names
sapply(data[normalized.data$Names == 'A', 2:4], mean)
First Second Third
0.2578018 -0.5864073 0.1156760
What I want is for the normalized means of A to equal 1. Is there a way to do this?
Answer 0: (score: 4)
set.seed(99)
data <- data.frame(Names=rep(c('A','B'),5),
First = rnorm(10),
Second = rnorm(10),
Third = rnorm(10))
a.mean <- sapply(data[data$Names == 'A', 2:4], mean)
data[,2:4] <- sweep(data[,2:4],MARGIN=2,a.mean,"/")
(norm.mean <- sapply(data[data$Names == 'A', 2:4], mean))
## First Second Third
## 1 1 1
Depending on your application, it may be easier to make the Names column the row names:
# data.frame() rejects duplicate row names, so repeated labels need a matrix
data <- matrix(rnorm(30), ncol = 3,
dimnames = list(rep(c('A','B'),5),
c('First','Second','Third')))
I also like the readability of subset(data, Names == 'A') (although it is not recommended for programmatic use: see https://github.com/hadley/devtools/wiki/Evaluation)
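For comparison, here is a small sketch of the two ways of selecting the A rows, using the data frame from the question. The names by.subset and by.index are only illustrative, not part of the original answer.

```r
set.seed(99)
data <- data.frame(Names = rep(c('A','B'), 5),
                   First = rnorm(10),
                   Second = rnorm(10),
                   Third = rnorm(10))

# subset() evaluates Names inside the data frame (non-standard evaluation):
# easy to read at the console, but fragile inside functions
by.subset <- subset(data, Names == 'A')

# Explicit logical indexing: more verbose, but safe for programmatic use
by.index <- data[data$Names == 'A', ]

# Check that both approaches select the same rows
all.equal(by.subset, by.index)
```

Both forms should pick out the same five rows; the bracket form is usually the safer choice inside a function.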
Answer 1: (score: 2)
set.seed(99)
data <- data.frame(Names=rep(c('A','B'),5),
First = rnorm(10),
Second = rnorm(10),
Third = rnorm(10))
a.mean <- colMeans(data[data$Names == 'A', 2:4])
normalized.data <- as.data.frame(t(t(data[,2:4])/a.mean))
normalized.data$Names <- data$Names
colMeans(normalized.data[normalized.data$Names == 'A', 1:3])
#First Second Third
#1 1 1
Answer 2: (score: 2)
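Oh, never mind: you're not dividing the way you think you are. Dividing a matrix by a vector of values does not divide each column by the corresponding value. R recycles the vector down the columns, in column-major order, so the divisor changes from row to row instead of from column to column.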
Rgames> foo
[,1] [,2] [,3]
[1,] 5 3 7
[2,] 5 3 7
[3,] 5 3 7
[4,] 5 3 7
[5,] 5 3 7
Rgames> foo/c(1,2,3)
[,1] [,2] [,3]
[1,] 5.000000 1.0 3.500000
[2,] 2.500000 3.0 2.333333
[3,] 1.666667 1.5 7.000000
[4,] 5.000000 1.0 3.500000
[5,] 2.500000 3.0 2.333333
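The recycling behaviour shown above, and two ways around it, can be sketched as follows. This is a minimal example that rebuilds the foo matrix from the output above; the names divisor, wrong, by.sweep and by.transpose are illustrative only.

```r
foo <- matrix(rep(c(5, 3, 7), each = 5), nrow = 5)  # five rows of 5 3 7
divisor <- c(1, 2, 3)

# Plain division recycles divisor down each column (column-major order),
# so element [2,1] is divided by 2 rather than by 1
wrong <- foo / divisor

# Column-wise division, option 1: sweep() applies divisor across columns
by.sweep <- sweep(foo, MARGIN = 2, STATS = divisor, FUN = "/")

# Option 2: transpose so that the recycling lines up with the original
# columns, divide, then transpose back
by.transpose <- t(t(foo) / divisor)

# Both options should agree
all.equal(by.sweep, by.transpose)
```

With either option every entry of column 1 is divided by 1, column 2 by 2 and column 3 by 3, which is what the question was after.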