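I am very new to R, and I apologize in advance that my post is not in the usual format (I tried using dput()
but got strange output, and I'm sorry that I don't know how to upload the dataset).
I have a dataset with 6 columns (site, startdate, enddate, photodate, species, indiv). For example: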
site year startdate enddate photodate species indiv
M1_7 2012 19/07/2012 10/08/2012 20/07/2012 Sylvicapra grimmia 1
M1_7 2012 19/07/2012 10/08/2012 23/07/2012 Crocuta crocuta 1
M1_7 2012 19/07/2012 10/08/2012 23/07/2012 Potamochoerus larvatus 1
M1_7 2012 19/07/2012 10/08/2012 25/07/2012 Hystrix cristata 1
M1_7 2012 19/07/2012 10/08/2012 27/07/2012 Potamochoerus larvatus 1
M1_7 2012 19/07/2012 10/08/2012 27/07/2012 Sylvicapra grimmia 1
M1_7 2012 19/07/2012 10/08/2012 28/07/2012 Hippotragus equinus 1
M1_7 2012 19/07/2012 10/08/2012 30/07/2012 Crocuta crocuta 1
M1_7 2012 19/07/2012 10/08/2012 01/08/2012 Equus q. boehmi 1
M1_7 2012 19/07/2012 10/08/2012 01/08/2012 Crocuta crocuta 1
M1_7 2012 19/07/2012 10/08/2012 05/08/2012 Potamochoerus larvatus 1
M1_7 2012 19/07/2012 10/08/2012 07/08/2012 Hippotragus equinus 1
M1_9 2012 21/07/2012 11/08/2012 24/07/2012 Pedetes capensis 1
M1_9 2012 21/07/2012 11/08/2012 24/07/2012 Crocuta crocuta 2
M1_9 2012 21/07/2012 11/08/2012 24/07/2012 Pedetes capensis 1
M1_9 2012 21/07/2012 11/08/2012 27/07/2012 Pedetes capensis 1
M1_9 2012 21/07/2012 11/08/2012 01/08/2012 Alcelaphus b. lichtensteinii 1
M1_9 2012 21/07/2012 11/08/2012 03/08/2012 Pedetes capensis 1
M1_9 2012 21/07/2012 11/08/2012 04/08/2012 Crocuta crocuta 1
M1_9 2012 21/07/2012 11/08/2012 06/08/2012 Pedetes capensis 1
M1_9 2012 21/07/2012 11/08/2012 07/08/2012 Pedetes capensis 1
M1_9 2012 21/07/2012 11/08/2012 08/08/2012 Pedetes capensis 1
M1_11 2012 21/07/2012 11/08/2012 26/07/2012 Mellivora capensis 1
M1_11 2012 21/07/2012 11/08/2012 03/08/2012 Sylvicapra grimmia 1
M1_11 2012 21/07/2012 11/08/2012 07/08/2012 Hystrix cristata 1
M1_11 2012 21/07/2012 11/08/2012 08/08/2012 Potamochoerus larvatus 1
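I have been trying to write a loop that creates a 49-column matrix in which column 1 holds the site, column 2 holds the sequence of dates between "startdate" and "enddate" within that site, and columns 3:49 correspond to the species names. In the cells of columns 3:49, I want the summed count data (indiv) for a particular species on a particular date.
So far I have only been able to create the empty matrix I want, but I cannot fill in the data. This is the code I have used: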
mlele2012<- read.delim("C:\\multiple regression\\mlele 2012 empty matrix creation.txt")
africa <- read.delim("C:\\species accumulation curves\\COMPLETE species list.txt")
specieslistx <- unique(as.character(africa[[1]]))  # species names as a plain character vector (assumes they are in the first column)
oldtemp<-NULL
temp <- rep(0, length(specieslistx ))
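# store the parsed dates (the sample uses dd/mm/yyyy for all three columns)
mlele2012$photodate <- as.Date(mlele2012$photodate, "%d/%m/%Y")
mlele2012$startdate <- as.Date(mlele2012$startdate, "%d/%m/%Y")
mlele2012$enddate <- as.Date(mlele2012$enddate, "%d/%m/%Y")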
#create empty dataframe with dimensions: no. of sites x no. of dates in each
for (i in unique(as.character(mlele2012$site))) {  # for each site; unique() also works when site is not a factor
  sitetemp <- subset(mlele2012, site == i)  # rows of the dataset for site i
  sitetemp$startdate <- as.Date(sitetemp$startdate, "%d/%m/%Y")
  sitetemp$enddate <- as.Date(sitetemp$enddate, "%d/%m/%Y")
  sitedatelist <- seq(sitetemp$startdate[1], sitetemp$enddate[1], "days")
  empty <- matrix(0, length(sitedatelist), length(specieslistx))
  rownames(empty) <- as.character(sitedatelist)
  colnames(empty) <- specieslistx
  extendempty <- cbind(site = i, empty)  # prepend the site as column 1 (coerces the matrix to character)
  oldtemp <- rbind(oldtemp, extendempty)
}
write.csv(oldtemp, "Mlele 2012 dry empty.csv")
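In addition, I have been trying to create another matrix in the same format/dimensions but without the extra dates (i.e. only the dates in the "photodate" column rather than the sequence between "startdate" and "enddate"). I was hoping I could eventually merge the two matrices somehow to get what I ultimately need. Unfortunately, this code does not work, although it does not seem to raise any errors. This is the second part of my code: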
oldtemp <- NULL  # start a fresh result; part 1 left its own matrix in oldtemp
for (i in unique(as.character(mlele2012$site))) {  # each site once, not once per row
  sitetemp <- subset(mlele2012, site == i)  # rows for site i
  for (j in unique(as.character(sitetemp$photodate))) {  # each photo date once
    datetemp <- subset(sitetemp, photodate == j)  # rows for date j within site i
    temp <- rep(0, length(specieslistx))  # vector of 0s, one per species in the list
    names(temp) <- specieslistx  # name it before matching; with no names the lookup returns NA
    for (a in unique(as.character(datetemp$species))) {
      sptemp <- subset(datetemp, species == a)  # rows for species a on date j
      countdata <- sum(sptemp$indiv)
      index <- match(a, names(temp))  # exact position of species a in the species list
      temp[index] <- countdata
    }
    oldtemp <- rbind(oldtemp, c(site = i, photodate = j, temp))  # one row per site and date
  }
}
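Any help is greatly appreciated. Please let me know if any details would make this clearer.
Answer 0 (score: 1)
I can get most of the way there with your sample using: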
> ftable(xtabs(indiv~site+year+species, data=dat) )
           species boehmi capensis cristata crocuta equinus grimmia larvatus lichtensteinii
site  year
M1_11 2012              0        1        1       0       0       1        1              0
M1_7  2012              1        0        1       3       2       2        3              0
M1_9  2012              0        7        0       3       0       0        0              1
I did read genus/species in as two columns, since you did not provide the requested dput version.
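The same xtabs() idea can probably be pushed to the per-date layout the question asks for. The following is only a sketch under assumptions: `dat` here is a small hypothetical sample in the question's layout (not the real file), species is kept as one full name column, and the names `species_all`, `per_site`, `day` and `sp` are made up for illustration. Giving the date and species factors explicit levels makes xtabs() keep days and species that have no photos.

```r
# Hypothetical sample in the question's layout (a few rows only).
dat <- data.frame(
  site      = c("M1_7", "M1_7", "M1_7", "M1_9", "M1_9"),
  startdate = c("19/07/2012", "19/07/2012", "19/07/2012", "21/07/2012", "21/07/2012"),
  enddate   = c("10/08/2012", "10/08/2012", "10/08/2012", "11/08/2012", "11/08/2012"),
  photodate = c("20/07/2012", "23/07/2012", "23/07/2012", "24/07/2012", "24/07/2012"),
  species   = c("Sylvicapra grimmia", "Crocuta crocuta", "Crocuta crocuta",
                "Pedetes capensis", "Crocuta crocuta"),
  indiv     = c(1, 1, 1, 1, 2),
  stringsAsFactors = FALSE
)
for (col in c("startdate", "enddate", "photodate")) {
  dat[[col]] <- as.Date(dat[[col]], "%d/%m/%Y")
}

# Assumption: the species list is taken from the data; a separate list file works the same way.
species_all <- sort(unique(dat$species))

per_site <- lapply(split(dat, dat$site), function(d) {
  # Every day the camera was out at this site.
  days <- seq(d$startdate[1], d$enddate[1], by = "day")
  # Explicit levels keep empty days and unseen species in the table.
  d$day <- factor(as.character(d$photodate), levels = as.character(days))
  d$sp  <- factor(d$species, levels = species_all)
  counts <- xtabs(indiv ~ day + sp, data = d)
  mat <- matrix(as.vector(counts), nrow = length(days),
                dimnames = list(NULL, species_all))
  data.frame(site = d$site[1], date = days, mat, check.names = FALSE)
})

result <- do.call(rbind, per_site)
rownames(result) <- NULL
head(result)
```

Each site contributes one row per day between its startdate and enddate, with zeros wherever nothing was photographed.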
Answer 1 (score: 0)
It's a bit messy, but without initializing an empty matrix you can do the following:
If df is your initial data:
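species <- sort(unique(as.character(df$species)))
result = do.call("rbind", lapply(unique(as.character(df$site)), function(x) {
  do.call("rbind", lapply(unique(as.character(df$startdate)), function(y) {
    do.call("rbind", lapply(unique(as.character(df$enddate)), function(z) {
      sel <- df$site == x & df$startdate == y & df$enddate == z
      if (!any(sel)) return(NULL)  # skip site/date combinations that do not occur
      foo <- rep(0, length(species))
      names(foo) <- species
      # sum per species so repeated records add up instead of overwriting each other
      counts <- tapply(df$indiv[sel], as.character(df$species[sel]), sum)
      foo[names(counts)] <- counts
      c(site = x, startdate = y, enddate = z, foo)
    }))
  }))
}))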
result
should contain the matrix you are looking for (I hope).
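If every day between startdate and enddate is needed as well, one possible variation is to sum the counts first and then drop them into a site-by-day grid with matrix indexing. This is a sketch under assumptions, not a tested solution for the real file: `df` below is a small hypothetical sample, and `spans`, `grid`, `row_id` and `col_id` are illustrative names.

```r
# Hypothetical sample in the question's layout (a few rows only).
df <- data.frame(
  site      = c("M1_7", "M1_7", "M1_7", "M1_9", "M1_9"),
  startdate = c("19/07/2012", "19/07/2012", "19/07/2012", "21/07/2012", "21/07/2012"),
  enddate   = c("10/08/2012", "10/08/2012", "10/08/2012", "11/08/2012", "11/08/2012"),
  photodate = c("20/07/2012", "23/07/2012", "23/07/2012", "24/07/2012", "24/07/2012"),
  species   = c("Sylvicapra grimmia", "Crocuta crocuta", "Crocuta crocuta",
                "Pedetes capensis", "Crocuta crocuta"),
  indiv     = c(1, 1, 1, 1, 2),
  stringsAsFactors = FALSE
)
for (col in c("startdate", "enddate", "photodate")) {
  df[[col]] <- as.Date(df[[col]], "%d/%m/%Y")
}
species_all <- sort(unique(df$species))

# Long table: one summed count per site, date and species.
counts <- aggregate(indiv ~ site + photodate + species, data = df, FUN = sum)

# One row per site and per day the camera was deployed.
spans <- unique(df[, c("site", "startdate", "enddate")])
grid <- do.call(rbind, lapply(seq_len(nrow(spans)), function(k) {
  data.frame(site = spans$site[k],
             photodate = seq(spans$startdate[k], spans$enddate[k], by = "day"))
}))

# Zero matrix with one column per species, filled by (row, column) pairs.
mat <- matrix(0, nrow(grid), length(species_all),
              dimnames = list(NULL, species_all))
row_id <- match(paste(counts$site, counts$photodate),
                paste(grid$site, grid$photodate))
col_id <- match(counts$species, species_all)
mat[cbind(row_id, col_id)] <- counts$indiv

result <- data.frame(grid, mat, check.names = FALSE)
head(result)
```

Summing with aggregate() before filling avoids the case where two records of the same species on the same day overwrite each other.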