I am trying to add data from one data frame to another, based on two matching conditions.
The first data frame is as follows:
df.1<-cbind.data.frame(c("Site A","Site A","Site A","Site A","Site B","Site B","Site B","Site C","Site C"),c("Species 1","Species 2","Species 3","Species 5","Species 2","Species 3","Species 4","Species 1","Species 5"),c(10,15,5,10,30,10,20,10,5))
names(df.1)<-c("Location","Species","Count")
which gives:
Location Species Count
Site A Species 1 10
Site A Species 2 15
Site A Species 3 5
Site A Species 5 10
Site B Species 2 30
Site B Species 3 10
Site B Species 4 20
Site C Species 1 10
Site C Species 5 5
My second data frame is:
df.2<-as.data.frame(matrix(0,nrow=3,ncol=5))
names(df.2)<-c("Species 1","Species 2","Species 3","Species 4","Species 5")
row.names(df.2)<-c("Site A","Site B","Site C")
which gives:
Species 1 Species 2 Species 3 Species 4 Species 5
Site A 0 0 0 0 0
Site B 0 0 0 0 0
Site C 0 0 0 0 0
I would like to add the counts from the first data frame to the second, matching on location and species. It should look like this:
Species 1 Species 2 Species 3 Species 4 Species 5
Site A 10 15 5 0 10
Site B 0 30 10 20 0
Site C 10 0 0 0 5
I can't seem to get this to work. The problem seems to be that the two data frames have different sizes.
For example, I tried:
df.2<-ifelse(row.names(df.2)==df.1$Location && names(df.2)==df.1$Species,df.1$Count,0)
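but got the following warnings:
Warning messages:
1: In is.na(e1) | is.na(e2) :
  longer object length is not a multiple of shorter object length
2: In `==.default`(names(df.2), df.1$Species) :
  longer object length is not a multiple of shorter object length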
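Does anyone have a solution, or at least some guidance on a suitable approach?
Answer 0 (score: 1)
You don't need the second data frame. Just spread the Species column into wide format, for example with tidyr: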
library(tidyr)
library(dplyr)  # provides mutate_all()
df.1 %>%
  spread(Species, Count) %>%
  # replace the NAs that spread() introduces with 0
  mutate_all(~ replace(., is.na(.), 0))
Location Species 1 Species 2 Species 3 Species 4 Species 5
1 Site A 10 15 5 0 10
2 Site B 0 30 10 20 0
3 Site C 10 0 0 0 5
The mutate_all call replaces all the NAs introduced by spread with 0.
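In newer tidyr releases, spread has been superseded by pivot_wider. The following is a minimal sketch of the same reshaping, assuming tidyr 1.1.0 or later (for a scalar values_fill and for names_sort); the data frame is rebuilt from the question so the block runs on its own, and the name wide is only illustrative.

```r
library(tidyr)

# Rebuild the question's data so the example is self-contained
df.1 <- data.frame(
  Location = c("Site A", "Site A", "Site A", "Site A", "Site B",
               "Site B", "Site B", "Site C", "Site C"),
  Species  = c("Species 1", "Species 2", "Species 3", "Species 5", "Species 2",
               "Species 3", "Species 4", "Species 1", "Species 5"),
  Count    = c(10, 15, 5, 10, 30, 10, 20, 10, 5),
  stringsAsFactors = FALSE
)

# One row per Location, one column per Species;
# combinations that do not occur are filled with 0 instead of NA
wide <- pivot_wider(
  df.1,
  names_from  = Species,
  values_from = Count,
  values_fill = 0,
  names_sort  = TRUE  # order the new columns by name (assumes tidyr >= 1.1.0)
)

print(wide)
```

Filling with values_fill removes the need for a separate NA-replacement step.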
Answer 1 (score: 1)
Or in base R:
tapply(df.1$Count,list(df.1$Location,df.1$Species),"[")
Species 1 Species 2 Species 3 Species 4 Species 5
Site A 10 15 5 NA 10
Site B NA 30 10 20 NA
Site C 10 NA NA NA 5
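If you don't want the NAs, store the result (here as df) and replace them:
df <- tapply(df.1$Count, list(df.1$Location, df.1$Species), "[")
df[is.na(df)] <- 0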
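As a possible variation, tapply can fill the empty cells itself through its default argument, which should be available in R 3.4.0 and later. This sketch also swaps "[" for sum, which is an assumption about the intent: it gives the same result here because each Location/Species pair occurs once, and it would add counts together if a pair were repeated. The name counts is illustrative.

```r
# Rebuild the question's data so the example is self-contained
df.1 <- data.frame(
  Location = c("Site A", "Site A", "Site A", "Site A", "Site B",
               "Site B", "Site B", "Site C", "Site C"),
  Species  = c("Species 1", "Species 2", "Species 3", "Species 5", "Species 2",
               "Species 3", "Species 4", "Species 1", "Species 5"),
  Count    = c(10, 15, 5, 10, 30, 10, 20, 10, 5),
  stringsAsFactors = FALSE
)

# Sum Count within each Location x Species cell;
# cells with no observations are initialised to 0 rather than NA
counts <- tapply(
  df.1$Count,
  list(df.1$Location, df.1$Species),
  sum,
  default = 0
)

print(counts)
```

The result is a matrix with sites as row names and species as column names, so individual cells can be read as counts["Site B", "Species 4"].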
Answer 2 (score: 1)
We can use xtabs from base R:
xtabs(Count ~ Location +Species, df.1)
# Species
#Location Species 1 Species 2 Species 3 Species 4 Species 5
# Site A 10 15 5 0 10
# Site B 0 30 10 20 0
# Site C 10 0 0 0 5
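xtabs returns a contingency table rather than a data frame. If a data frame shaped like the question's df.2 is needed, one possible sketch is to convert the table with as.data.frame.matrix; the data is rebuilt here so the block is self-contained, and the name tab is illustrative.

```r
# Rebuild the question's data so the example is self-contained
df.1 <- data.frame(
  Location = c("Site A", "Site A", "Site A", "Site A", "Site B",
               "Site B", "Site B", "Site C", "Site C"),
  Species  = c("Species 1", "Species 2", "Species 3", "Species 5", "Species 2",
               "Species 3", "Species 4", "Species 1", "Species 5"),
  Count    = c(10, 15, 5, 10, 30, 10, 20, 10, 5),
  stringsAsFactors = FALSE
)

# Cross-tabulate the counts; missing combinations come out as 0
tab <- xtabs(Count ~ Location + Species, data = df.1)

# Convert the table to a data frame with sites as row names
# and one column per species, matching the layout of df.2
df.2 <- as.data.frame.matrix(tab)

print(df.2)
```

Plain as.data.frame would instead return the table in long format (one row per Location/Species pair), which is why the matrix method is called explicitly here.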