Count the number of times a value changes in an array

Time: 2016-06-16 14:45:20

Tags: arrays count

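I am relatively new to R and am working with animal behaviour data. I am trying to determine the number of times an individual animal changes its behaviour within a given time frame (a session, in this case).

My dummy data set is as follows: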

session = c(1,1,1,1,1,1,1,1,1,1,2,2,2,2,2,2,2,2,2,2)
activity = c("V","F","D","F","F","W","V","R","R","S","V","U","W","V","V","V","R","R","R","R")
df = data.frame(session,activity)

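I want to count the number of times the activity changes within each session. For example, it would be 8 in session 1 and 5 in session 2. I have tried other suggestions from the internet that use rle(), but I would like to know how to code this, because in most cases rle() ends up summarising the different modalities in a given array.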

2 answers:

Answer 0: (score: 0)

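I learned from a friend that I have to apply rle() to the "activity" column of my df, after making sure this column is character rather than a factor: df$activity=as.character(df$activity). I then apply the function only to the rows of a single session, for example session 1: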

# rle() applied to the activity column (column 2) for the rows of session 1
res <- rle(df[which(df$session == 1), 2])
# number of runs of identical consecutive activities (8 here); subtract 1 for the number of transitions between them
length(res$lengths)

To apply it to a large data set, I can put it in a loop:

df[,2] <- as.character(df[,2])  # treat activity (column 2) as character, not a factor
ls.session <- unique(df$session)
nb.session <- length(ls.session)
# result data.frame, one row per session, filled in by the loop
new.df <- data.frame(session = ls.session, nb.change = rep(0, nb.session))
for (i in seq_len(nb.session)) {
  res.rle.sess.i <- rle(df[which(df$session == ls.session[i]), 2])
  # number of runs of identical consecutive activities in session i
  new.df[i, "nb.change"] <- length(res.rle.sess.i$lengths)
}
new.df
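
The same per-session count can probably be obtained without an explicit loop. The following is a minimal sketch using tapply(), assuming the dummy data from the question; the name `runs` is only illustrative. Passing stringsAsFactors = FALSE keeps activity as character, which rle() needs.

```r
# Dummy data from the question; keep activity as character for rle()
session  <- c(1,1,1,1,1,1,1,1,1,1,2,2,2,2,2,2,2,2,2,2)
activity <- c("V","F","D","F","F","W","V","R","R","S",
              "V","U","W","V","V","V","R","R","R","R")
df <- data.frame(session, activity, stringsAsFactors = FALSE)

# For each session, count the runs of identical consecutive activities
runs <- tapply(df$activity, df$session, function(a) length(rle(a)$lengths))
runs
```

With this data `runs` holds 8 for session 1 and 5 for session 2, the same values the loop stores in new.df.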

Answer 1: (score: 0)

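# TRUE wherever an element differs from the one before it; the first element is never a change
change.f = function(x) c(FALSE, x[-1] != x[-length(x)])
# count activity changes per session, ignoring rows where the session itself changes
aggregate(change.f(df$activity)&!change.f(df$session), by=list(df$session), FUN=sum)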

Output:

  Group.1 x
1       1 7
2       2 4
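
These counts are transitions between consecutive rows, so each is one less than the number of distinct runs in the session (7 transitions separate 8 runs in session 1). As a cross-check, here is a small sketch that compares neighbours inside each session separately, assuming the dummy data from the question; `count_changes` is a hypothetical helper name.

```r
# Dummy data from the question
session  <- c(1,1,1,1,1,1,1,1,1,1,2,2,2,2,2,2,2,2,2,2)
activity <- c("V","F","D","F","F","W","V","R","R","S",
              "V","U","W","V","V","V","R","R","R","R")
df <- data.frame(session, activity, stringsAsFactors = FALSE)

# Number of positions where an activity differs from the previous one
count_changes <- function(a) sum(a[-1] != a[-length(a)])

# Apply it within each session, so session boundaries are never compared
changes <- tapply(df$activity, df$session, count_changes)
changes
```

For this data `changes` is 7 for session 1 and 4 for session 2, matching the output above.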