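I created a variable that classifies each species group as domestic, wild, or exotic, in a data frame where each row represents a species found at a unique site (siteID). I want to insert rows into my data frame for each siteID so that a 0 is reported for every group that was not observed at that site. In other words, this is the data frame I have: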
df.start <- data.frame(species = c("dog","deer","toucan","dog","deer","toucan"),
siteID = c("a","b","b","c","c","c"),
group = c("domestic", "wild", "exotic", "domestic", "wild", "exotic"),
value = c(2:7))
df.start
# species siteID group value
# 1 dog a domestic 2
# 2 deer b wild 3
# 3 toucan b exotic 4
# 4 dog c domestic 5
# 5 deer c wild 6
# 6 toucan c exotic 7
And this is the data frame I want:
df.end <- data.frame(species = c("dog", NA, NA, NA, "deer",
"toucan", "dog", "deer", "toucan"),
siteID = c("a","a","a","b","b","b","c","c","c"),
group = rep(c("domestic", "wild", "exotic"), 3),
value = c(2,0,0,0,3,4,5,6,7))
df.end
# species siteID group value
# 1 dog a domestic 2
# 2 <NA> a wild 0
# 3 <NA> a exotic 0
# 4 <NA> b domestic 0
# 5 deer b wild 3
# 6 toucan b exotic 4
# 7 dog c domestic 5
# 8 deer c wild 6
# 9 toucan c exotic 7
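The reason is that I want to use plyr functions to summarize the mean by group, and I realized that some group-site combinations are missing their zeros, which inflates my estimates. Am I perhaps missing a more obvious solution?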
Answer 0 (score: 1)
Using base R functions:
result <- merge(
with(df.start, expand.grid(siteID=unique(siteID),group=unique(group))),
df.start,
by=c("siteID","group"),
all.x=TRUE
)
result$value[is.na(result$value)] <- 0
> result
siteID group species value
1 a domestic dog 2
2 a exotic <NA> 0
3 a wild <NA> 0
4 b domestic <NA> 0
5 b exotic toucan 4
6 b wild deer 3
7 c domestic dog 5
8 c exotic toucan 7
9 c wild deer 6
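As a quick check of why the zero rows matter, here is a minimal sketch that compares the per-group means before and after filling. It is not part of the original answer: it uses base `aggregate` instead of plyr, and the names `full`, `means.observed`, and `means.filled` are made up for illustration.

```r
# Rebuild the example data from the question.
df.start <- data.frame(
  species = c("dog", "deer", "toucan", "dog", "deer", "toucan"),
  siteID  = c("a", "b", "b", "c", "c", "c"),
  group   = c("domestic", "wild", "exotic", "domestic", "wild", "exotic"),
  value   = 2:7
)

# All siteID x group combinations, left-joined to the observed rows.
full <- merge(
  expand.grid(siteID = unique(df.start$siteID),
              group  = unique(df.start$group),
              stringsAsFactors = FALSE),
  df.start,
  by = c("siteID", "group"),
  all.x = TRUE
)
full$value[is.na(full$value)] <- 0  # unobserved combinations count as 0

# Mean per group over observed rows only (no zeros included).
means.observed <- aggregate(value ~ group, data = df.start, FUN = mean)

# Mean per group over every site, zeros included.
means.filled <- aggregate(value ~ group, data = full, FUN = mean)

means.observed
means.filled
```

With this data the "domestic" mean falls from 3.5 (two observed sites) to 7/3 (all three sites), which is the inflation described in the question.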
Answer 1 (score: 1)
df.sg <- data.frame(xtabs(value~siteID+group, data=df.start))
merge(df.start[-4], df.sg, by=c("siteID", "group"), all.y=TRUE)
#-------------
siteID group species Freq
1 a domestic dog 2
2 a exotic <NA> 0
3 a wild <NA> 0
4 b domestic <NA> 0
5 b exotic toucan 4
6 b wild deer 3
7 c domestic dog 5
8 c exotic toucan 7
9 c wild deer 6
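The xtabs function returns a table, which the as.data.frame.table method can then convert back into a data frame with one row per siteID/group combination. Very handy.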
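A small variation on the same idea, sketched under the assumption that each siteID/group pair appears at most once in the data (xtabs sums `value` within each cell, so duplicate pairs would be added together). The `responseName` argument of `as.data.frame.table` keeps the column named `value` instead of `Freq`; the names `tab` and `result` are only illustrative.

```r
# Rebuild the example data from the question.
df.start <- data.frame(
  species = c("dog", "deer", "toucan", "dog", "deer", "toucan"),
  siteID  = c("a", "b", "b", "c", "c", "c"),
  group   = c("domestic", "wild", "exotic", "domestic", "wild", "exotic"),
  value   = 2:7
)

# Cross-tabulate value by site and group; empty cells become 0.
tab <- xtabs(value ~ siteID + group, data = df.start)

# Convert the table to long format, naming the count column "value".
df.sg <- as.data.frame(tab, responseName = "value", stringsAsFactors = FALSE)

# Re-attach the species labels; combinations without a species get NA.
result <- merge(df.start[c("species", "siteID", "group")], df.sg,
                by = c("siteID", "group"), all.y = TRUE)
result
```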