Combining a matrix of daily rows into weekly rows

Asked: 2013-01-18 21:00:27

Tags: r

I have a matrix with dates as row names and tags as column names. The matrix is filled with 0s and 1s for absence/presence. For example:

           29735 29736 29737 29738 29739 29740
2010-07-15     1     0     0     0     0     0
2010-07-16     1     1     0     0     0     0
2010-07-17     1     1     0     0     0     0
2010-07-18     1     1     0     0     0     0
2010-07-19     1     1     0     0     0     0
2010-07-20     1     1     0     0     0     0

I have the following script for calculating site fidelity (% of days present):

##Presence/absence data setup
##import file
read.csv('pn.csv')->'pn'
##strip out desired columns
pn[,c(5,7:9)]->pn
##create table of dates and tags
table(pn$Date,pn$Tag)->T
##convert to a matrix
as.matrix(T)->U
##convert to binary for presence/absence
1*(U>2)->U

##insert missing rows
library(micEcon)
insertRow(U,395,0)->U
rownames(U)[395]<-'2011-08-16'
insertRow(U,253,0)->U
rownames(U)[253]<-'2011-03-26'
insertRow(U,250,0)->U
rownames(U)[250]<-'2011-03-22'
insertRow(U,250,0)->U
rownames(U)[250]<-'2011-03-21'

##for presence/absence
##define i(tag or column)
1->i
##define place to store results
cbind(colnames(U),rep(NA,length(colnames(U))))->sfresult
##loop instructions
for(i in 1:ncol(U)){
##identify first detection day
grep(1,U[,i])[1]->tagrow
##count total days since first detection
nrow(U)-tagrow+1->days
##count days present
length(grep(1,U[,i]))->present
##calculate site fidelity
present/days->sfresult[i,2]
}
##change class of results column
as.numeric(sfresult[,2])->sfresult[,2]
##histogram
bins<-c(0,.3,.6,1)
xlab<-c('Low','Med','High')
hist(as.numeric(sfresult[,2]), breaks=bins, xaxt='n', col=heat.colors(3),
     xlab='Percent Days Present', ylab='Frequency (# of individuals)',
     main='Site Fidelity', freq=TRUE, labels=xlab)
axis(1,at=bins)

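I would like to calculate site fidelity by week. I believe the easiest way is to collapse the matrix by combining every seven rows into a weekly matrix that simply sums the 0s and 1s of the daily matrix. The same site fidelity script would then calculate it by week. The problem is that I am a newbie and am having trouble finding out how to collapse the daily matrix into a weekly one. Thanks for any suggestions.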

2 Answers:

Answer 0: (score: 2)

Something like this should work:

x <- matrix(rbinom(1000,1,.2), nrow=50, ncol=20)
rownames(x) <- 1:50
colnames(x) <- paste0("id", 1:20)

require(data.table)
xdt <- as.data.table(x)

## Assumes rows are sorted by date, that there are no missing days,
## and that the first row is the start of a week.
## xdt[, week := sort(rep(1:7, length.out = nrow(xdt)))]  ## wrong
## length.out trims the last, partial week so the vector matches nrow(xdt)
xdt[, week := rep(1:ceiling(nrow(xdt)/7), each = 7, length.out = nrow(xdt))]  ## fixed


xdt[, lapply(.SD,sum), by="week",.SDcols=setdiff(names(xdt),"week")]

If you provide a reproducible example (see "How to make a great R reproducible example?"), I can better help you preserve the rownames.
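Row names can also be kept without leaving base R. The following is only a minimal sketch under the same assumptions as above (rows sorted by date, no missing days, first row starts a week); the `daily` matrix and its values are made up for illustration and are not the asker's data. It uses `rowsum` to add up rows within each block of seven and labels each weekly row with the first date of that block.

```r
# Hypothetical daily presence/absence matrix: 10 days, 2 tags
dates <- seq(as.Date("2010-07-15"), by = "day", length.out = 10)
daily <- matrix(c(rep(1, 10),        # tag 29735: present every day
                  0, rep(1, 9)),     # tag 29736: absent on the first day
                ncol = 2,
                dimnames = list(as.character(dates), c("29735", "29736")))

# Week index: rows 1-7 -> 1, rows 8-14 -> 2, and so on
week <- (seq_len(nrow(daily)) - 1) %/% 7 + 1

# Sum the 0/1 values within each week; the result is still a matrix
weekly <- rowsum(daily, group = week)

# Label each weekly row with the first date that falls in it
rownames(weekly) <- rownames(daily)[!duplicated(week)]
weekly
```

The last block may cover fewer than seven days (three here), so its sums are not directly comparable with those of full weeks.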

Edit: Also, using the right-assignment operator -> as you have done above is very atypical.

Answer 1: (score: 0)

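R's cut function will bin Dates into their week (see ?cut.Date for details). After that, it is a simple call to aggregate to get the result you need. Note that cut.Date takes a start.on.monday option.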

Data

sites <- read.table(text="29735 29736 29737 29738 29739 29740
  2010-07-15     1     0     0     0     0     0
  2010-07-16     1     1     0     0     0     0
  2010-07-17     1     1     0     0     0     0
  2010-07-18     1     1     0     0     0     0
  2010-07-19     1     1     0     0     0     0
  2010-07-20     1     1     0     0     0     0", 
  header=TRUE, check.names=FALSE, row.names=1)

Answer

weeks.factor <- cut(as.Date(row.names(sites)), 
                    breaks='weeks', start.on.monday=FALSE)
aggregate(sites, by=list(weeks.factor), FUN=function(col) sum(col)/length(col))

#      Group.1 29735     29736 29737 29738 29739 29740
# 1 2010-07-11     1 0.6666667     0     0     0     0
# 2 2010-07-18     1 1.0000000     0     0     0     0
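The call above returns the proportion of days present within each week. If the goal is instead to rerun the site fidelity calculation on weekly data, one possible reading is "weeks present divided by weeks since first detection". The sketch below follows that reading, which is an assumption rather than something stated in the question. It rebuilds the first three columns of the example data so that it runs on its own; the names `weekly.sum`, `weekly.pa` and `site.fidelity` are made up for illustration.

```r
# First three tags of the example data, rebuilt so the sketch is self-contained
dates <- as.Date("2010-07-15") + 0:5
sites <- data.frame(`29735` = rep(1, 6),
                    `29736` = c(0, rep(1, 5)),
                    `29737` = rep(0, 6),
                    check.names = FALSE)
rownames(sites) <- as.character(dates)

weeks.factor <- cut(dates, breaks = "weeks", start.on.monday = FALSE)

# Weekly detection counts: sum the daily 0/1 values within each week
weekly.sum <- aggregate(sites, by = list(week = weeks.factor), FUN = sum)

# Weekly presence/absence: 1 if the tag was seen on any day of that week
weekly.pa <- 1 * (as.matrix(weekly.sum[, -1, drop = FALSE]) > 0)
rownames(weekly.pa) <- as.character(weekly.sum$week)

# Site fidelity per tag: weeks present / weeks since first detection
site.fidelity <- apply(weekly.pa, 2, function(col) {
  first <- match(1, col)              # first week with a detection
  if (is.na(first)) return(NA_real_)  # tag never detected
  sum(col) / (length(col) - first + 1)
})
site.fidelity
```

Converting the weekly sums back to 0/1 matters if the original loop is reused, because `grep(1, ...)` on summed counts would only match values that contain the digit 1.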