Looping over a data frame in R

Date: 2015-08-31 11:07:15

Tags: r

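I have a data frame and I am trying to compute the median for each group separately. When I split the data frame into two groups and compute the median for each one, I get NA results.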

The data:

x1          x2          x3          x4          x5          x6          x7          y1          y2          y3          y4          y5          y6          y7          y8
9.488404158 9.470895414 9.282433728 9.366707445 9.955383045 9.640816474 9.606262272 9.329651027 9.434541611 9.473922432 9.311412966 9.3154885   9.434977488 9.470895414 9.764258059
8.630629966 8.55831075  8.788391003 8.576231135 8.671587906 8.842979993 8.861958856 8.58330436  8.603596508 8.570129609 8.59798922  8.572686772 8.679751791 8.663950953 8.432875347
9.354748885 9.367668838 9.259952558 9.421538213 9.554635162 9.603744578 9.452197983 9.284228877 9.404607878 9.317737979 9.343115301 9.310644266 9.27227486  9.360337823 9.44706281
9.944863964 9.950427516 10.19101759 10.07350804 10.03269879 10.1307908  10.03487287 9.74609383  9.886379007 9.775472567 10.036596   9.544738458 9.699611598 9.911962567 9.625804277

Code:

  rowN <- nrow(AT1)
  MD1 <- vector(length = rowN)
  MD2 <- vector(length = rowN)

  MD1[1:rowN] <- NA
  MD2[1:rowN] <- NA

  x <- AT1[, c(2, 3, 4, 5, 6, 7, 8)]
  write.csv(x, "x.csv", row.names = TRUE)
  x <- as.matrix(x)
  for (i in 2:rowN) {
    MD1[i] <- median(x[i, ])
  }
  write.csv(MD1, "MD1.csv", row.names = TRUE)

  y <- AT1[, c(9, 10, 11, 12, 13, 14, 15, 16)]
  write.csv(y, "y.csv", row.names = TRUE)
  y <- as.matrix(y)
  for (j in 2:rowN) {
    MD2[j] <- median(y[j, ])
  }
  write.csv(MD2, "MD2.csv", row.names = TRUE)

1 answer:

Answer 0 (score: 3)

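It would be better to show a reproducible example. Based on the loop code, it looks to me like the OP wants the median of each row. Assuming the medians are computed separately for columns 2:8 and 9:16, we can convert the 'data.frame' to a 'matrix' (as.matrix) and use rowMedians from library(matrixStats).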

x1 <- as.matrix(AT1[2:8 ])
x2 <- as.matrix(AT1[9:16])

library(matrixStats)
rowMedians(x1, na.rm=TRUE)
#[1] -0.09411013 -0.08554095  0.11953107 -0.26869311  0.33224445

rowMedians(x2, na.rm=TRUE)
#[1]  0.10557881 -0.74135403 -0.05876725  0.69230776 -0.21402339
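
If installing matrixStats is not an option, the same row-wise medians can probably be obtained in base R with apply(). The following is only a sketch: it rebuilds AT1 from simulated data in the same way as the example data of this answer (not the OP's real data), and the names med_x and med_y are made up for illustration.

```r
# Stand-in for AT1: one label column followed by 15 numeric columns
# (simulated data, assumed to mirror the layout of the OP's data frame)
set.seed(24)
m1 <- matrix(rnorm(5 * 15), ncol = 15)
AT1 <- data.frame(col1 = LETTERS[1:5], m1)

# Row-wise median of each column group, base R only.
# MARGIN = 1 applies median() to every row; na.rm = TRUE skips missing values.
med_x <- apply(as.matrix(AT1[2:8]), 1, median, na.rm = TRUE)
med_y <- apply(as.matrix(AT1[9:16]), 1, median, na.rm = TRUE)

print(med_x)
print(med_y)
```

apply() is usually slower than rowMedians on large matrices, but it avoids the extra dependency and covers every row, including the first.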

Data

set.seed(24)
m1 <- matrix(rnorm(5*15), ncol=15)
AT1 <- data.frame(col1= LETTERS[1:5], m1)
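
As for where the NA values in the question may come from: the OP's loops run over 2:rowN, so the first element of MD1 and MD2 is never assigned and keeps its initial NA. In addition, median() returns NA for any row containing a missing value unless na.rm = TRUE is given. The sketch below illustrates both points on the simulated data above; the names md_from_2 and md_all are hypothetical, and whether the OP's real data also contains missing values is not known.

```r
# Simulated stand-in for AT1 (same construction as the Data section)
set.seed(24)
m1 <- matrix(rnorm(5 * 15), ncol = 15)
AT1 <- data.frame(col1 = LETTERS[1:5], m1)

x <- as.matrix(AT1[2:8])
rowN <- nrow(x)

# Loop as written in the question: it starts at 2, so element 1 is never filled
md_from_2 <- rep(NA_real_, rowN)
for (i in 2:rowN) {
  md_from_2[i] <- median(x[i, ])
}

# Same loop over every row; na.rm = TRUE also guards against missing values
md_all <- rep(NA_real_, rowN)
for (i in seq_len(rowN)) {
  md_all[i] <- median(x[i, ], na.rm = TRUE)
}

is.na(md_from_2[1])  # TRUE: row 1 was skipped
anyNA(md_all)        # FALSE: every row has a median
```

seq_len(rowN) is used instead of 1:rowN so that the loop body is simply skipped when the data frame has zero rows.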