Extracting rows based on a variable

Time: 2015-04-24 08:39:36

Tags: r dataframe row extract

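I am trying to extract the first record from each chunk (layer) of my data. In each chunk I want to extract the first occurrence of a negative value (Mag) together with the corresponding time. I then want to compare those times across the chunks to find the minimum and the maximum. (This is the first part.)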

I got to a certain point but then got stuck. Any help, including shortening the code, would be appreciated. Thanks!

# to make sample data
data_neg<-seq(-0.98,-1,length=300)
data_pos<-seq(0.98,1,length=300)
time<-seq(1,54,length=600)

# binding those neg and pos numbers together
tot_num<- data.frame(c(rep(time, times=4)),c(rep(cbind(data_pos,data_neg),times=4)))    
colnames(tot_num)=c("time","Mag")

# split data into chunks
n <- 1:4  
dfchunk<- split(tot_num, factor(sort(rank(row.names(tot_num))%%n)))
ext_fsw<-lapply(dfchunk[],function(x)with(x,x[Mag<0,,drop=TRUE])) 
# here I want to extract the first appearance of a negative value of Mag in each chunk, together with the corresponding time.

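As the second part of my question: following @zx8754's suggestion, I tried to read my real data in a loop, select the first occurrence of a negative value, and plot the results. But I realized that my real data produce NA values like the ones below (I read 11 data files from my folder, as you can see from the code further down...)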

   X1      X2
1 27.45 -0.0111
2 43.29 -0.9746
3 32.49 -0.9807
4 28.08 -0.0538
5 28.44 -0.0669
 X1      X2
1 28.71 -0.0834
2 43.29 -0.9736
3 32.49 -0.9521
4 29.16 -0.0032
5 29.70 -0.0469
 X1      X2
1 30.06 -0.0112
2 43.29 -0.9724
3 35.37 -0.0448
4 33.03 -0.0308
5 31.59 -0.0055
 X1      X2
1 35.19 -0.0476
2 43.29 -0.9712
3 39.42 -0.0171
4 40.50 -0.0143
5 36.18 -0.0395
 X1      X2
1    NA      NA
2    NA      NA
3    NA      NA
4 50.85 -0.0371
5    NA      NA
     X1      X2
1    NA      NA
2    NA      NA
3    NA      NA
4    NA      NA
5    NA      NA
     X1      X2
1    NA      NA
2    NA      NA
3    NA      NA
4    NA      NA
5    NA      NA
     X1     X2
1    NA     NA
2    NA     NA
3 49.77 -3e-04
4    NA     NA
5    NA     NA
     X1      X2
1    NA      NA
2    NA      NA
3    NA      NA
4 43.02 -0.0465
5 45.99 -0.9793
     X1      X2
1    NA      NA
2 37.98 -0.0005
3 45.18 -0.9784
4    NA      NA
5 45.09 -0.0551
     X1      X2
1    NA      NA
2    NA      NA
3 36.90 -0.0148
4 46.17 -0.9813
5    NA      NA

Here is the loop that reads my data:

data.list <- dir(pattern = "*.avgm",full.names = FALSE) # creates the list of all the .avgm files in the directory

a<-1:length(data.list)
for(k in 1:length(data.list)){
data1_stt<- read.table(data.list[k],colClasses="numeric",skip=0,   fill=FALSE, sep = "", quote="\"'", dec=".", as.is = TRUE, strip.white=FALSE)
StrL1<-data1_stt[,10]
time<-data1_stt[,1]*10^-3
tot_num<- data.frame(time,StrL1)
colnames(tot_num)=c("time","Mag")
n <- 5  # split data into chunks
dfchunk<- split(tot_num, factor(sort(rank(row.names(tot_num))%%n)))
ext_fsw<-lapply(dfchunk,function(x)x[which(x$Mag<0)[1],])# which gives the indices where the condition is TRUE; [1] takes the first one, which is passed to x as the row index.
x.n <- data.frame(matrix(unlist(ext_fsw),nrow=5, byrow=T))
print(x.n)
curr<-rep(c(8,7,6,5,4,3.6,3.8,4.2,4.4,4.6,4.8),each=5)
plot(curr,x.n,pch = 20) 
}

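In short, the second step of my task is to read all of my data and plot the results against each curr value, but I have not managed to do that. Unfortunately, I cannot provide a reproducible example here. Because of the NA values in the data, the vectors I am trying to plot end up with different lengths.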
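One possible way around the length mismatch, sketched here under assumptions rather than tested on the real files: tag every per-file result with that file's curr value, stack the results, and drop the all-NA rows before plotting once at the end. The list res below just retypes the first, fifth and eighth printed blocks from above; the mapping of those blocks to curr values 8, 4 and 4.2 assumes the files are read in the same order as the curr vector, and the names res, all_res and ok are made up for illustration.

```r
# Three of the per-file results printed above; X1 = time, X2 = Mag
res <- list(
  data.frame(X1 = c(27.45, 43.29, 32.49, 28.08, 28.44),
             X2 = c(-0.0111, -0.9746, -0.9807, -0.0538, -0.0669)),
  data.frame(X1 = c(NA, NA, NA, 50.85, NA),
             X2 = c(NA, NA, NA, -0.0371, NA)),
  data.frame(X1 = c(NA, NA, 49.77, NA, NA),
             X2 = c(NA, NA, -3e-04, NA, NA))
)
# Assumption: one curr value per file, in the order the files were read
curr <- c(8, 4, 4.2)

# Attach each file's curr value to all of its rows, then stack the files
all_res <- do.call(rbind, Map(function(d, cv) cbind(curr = cv, d), res, curr))

# Keep only chunks where a negative Mag was actually found
ok <- all_res[complete.cases(all_res), ]

# x and y now have the same length, so plot() no longer complains
plot(ok$curr, ok$X1, pch = 20,
     xlab = "curr", ylab = "time of first negative Mag")
```

In the real loop, the same idea would mean storing x.n for each k in a list (instead of plotting inside the loop) and building the combined data frame after the loop has finished.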

1 Answer:

Answer 0 (score: 1)

Try this:

ext_fsw<-lapply(dfchunk,function(x)
  x[which(x$Mag<0)[1],]
  )

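which gives the indices where the condition is TRUE; [1] then takes the first of them, and that index is passed to x as the row number. If a chunk contains no negative value, the index is NA and the result is a row of NAs.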
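As a rough end-to-end illustration on the sample data from the question, the sketch below applies the same which(...)[1] idea and then compares the extracted times. It assumes the chunks are four consecutive blocks of 600 rows, which is a simpler split than the one in the question; chunk_id, first_neg_df and time_range are names introduced here for illustration only.

```r
# Sample data as in the question: 300 positive then 300 negative values, repeated 4 times
data_neg <- seq(-0.98, -1, length.out = 300)
data_pos <- seq(0.98, 1, length.out = 300)
time <- seq(1, 54, length.out = 600)
tot_num <- data.frame(time = rep(time, times = 4),
                      Mag  = rep(c(data_pos, data_neg), times = 4))

# Assumption: each chunk is one consecutive block of 600 rows
chunk_id <- rep(1:4, each = 600)
dfchunk <- split(tot_num, chunk_id)

# First row with a negative Mag in each chunk (a row of NAs if there is none)
ext_fsw <- lapply(dfchunk, function(x) x[which(x$Mag < 0)[1], ])

# One row per chunk, easier to compare than a list
first_neg_df <- do.call(rbind, ext_fsw)

# Smallest and largest "first negative" time across the chunks
time_range <- range(first_neg_df$time, na.rm = TRUE)
```

Because every chunk in the sample data is identical, the minimum and maximum coincide here; with real data, time_range[1] and time_range[2] would generally differ, and na.rm = TRUE skips chunks that have no negative value.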