I have two data frames: one holds the details of the survey dates, and the other holds records of individual sightings. They look like this:
Records <- data.frame("Location"=c("A","A","B","C","C","C","D"),
"Date"= c("09/01/2017","12/01/2017","20/01/2017","06/06/2017","03/06/2017","19/01/2017","02/01/2017"),
"Individuals"= c(3,2,6,4,0,1,6))
Surveys <- data.frame("Location"=c("A","B","C","D","A","B","C","D"),
"Start"= c(rep("01/01/2017",length=4),rep("01/06/2017",length=4)),
"End"= c(rep("01/02/2017",length=4),rep("01/07/2017",length=4)))
> Surveys
Location Start End
1 A 01/01/2017 01/02/2017
2 B 01/01/2017 01/02/2017
3 C 01/01/2017 01/02/2017
4 D 01/01/2017 01/02/2017
5 A 01/06/2017 01/07/2017
6 B 01/06/2017 01/07/2017
7 C 01/06/2017 01/07/2017
8 D 01/06/2017 01/07/2017
> Records
Location Date Individuals
1 A 09/01/2017 3
2 A 12/01/2017 2
3 B 20/01/2017 6
4 C 06/06/2017 4
5 C 03/06/2017 0
6 C 19/01/2017 1
7 D 02/01/2017 6
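I would like to add a column to the Surveys data frame that sums the number of individuals recorded at that location within the relevant time period. The result would look like this: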
Sum.Individuals <- c(5,6,1,6,0,0,4,0)
Final <- cbind(Surveys,Sum.Individuals)
> Final
Location Start End Sum.Individuals
1 A 01/01/2017 01/02/2017 5
2 B 01/01/2017 01/02/2017 6
3 C 01/01/2017 01/02/2017 1
4 D 01/01/2017 01/02/2017 6
5 A 01/06/2017 01/07/2017 0
6 B 01/06/2017 01/07/2017 0
7 C 01/06/2017 01/07/2017 4
8 D 01/06/2017 01/07/2017 0
I hope this makes sense; any help would be appreciated.
Cheers
Answer 0 (score: 0)
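I would suggest the following steps: join the two data frames on Location, keep only the rows whose Date falls between Start and End, then group by survey and sum the individuals.
So it might look something like: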
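library(tidyverse)
library(magrittr)
# Parse the dd/mm/yyyy strings into Dates so the range filter compares dates, not text
Surveys %<>% mutate(Start = as.Date(Start, format = "%d/%m/%Y"), End = as.Date(End, format = "%d/%m/%Y"))
Records %<>% mutate(Date = as.Date(Date, format = "%d/%m/%Y"))
# Pair every survey with every record from the same location
df <- inner_join(Surveys, Records, by = "Location")
# Keep only the records that fall inside the survey window
df %<>% filter(Date >= Start, Date <= End)
df %<>% group_by(Location, Start, End) %>% summarise(Sum.Individuals = sum(Individuals))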
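Note that an inner join followed by a filter only returns surveys that have at least one record inside their window, whereas the expected output in the question also lists surveys with a total of 0. Below is a minimal sketch of one way to keep those rows, using a left join and doing the date check inside the sum. It assumes dplyr 1.0 or later (for the `.groups` argument), and `to_date` is a hypothetical helper name introduced here for illustration. Recent dplyr versions may warn about a many-to-many join, which is expected here because each location has several surveys and several records.

```r
library(dplyr)

# Data copied from the question so the sketch is self-contained
Records <- data.frame(
  Location = c("A", "A", "B", "C", "C", "C", "D"),
  Date = c("09/01/2017", "12/01/2017", "20/01/2017", "06/06/2017",
           "03/06/2017", "19/01/2017", "02/01/2017"),
  Individuals = c(3, 2, 6, 4, 0, 1, 6)
)
Surveys <- data.frame(
  Location = c("A", "B", "C", "D", "A", "B", "C", "D"),
  Start = c(rep("01/01/2017", 4), rep("01/06/2017", 4)),
  End = c(rep("01/02/2017", 4), rep("01/07/2017", 4))
)

# Hypothetical helper: parse dd/mm/yyyy strings into Date objects
to_date <- function(x) as.Date(x, format = "%d/%m/%Y")

Final <- Surveys %>%
  mutate(Start = to_date(Start), End = to_date(End)) %>%
  # left_join keeps every survey, even when no record matches its location
  left_join(mutate(Records, Date = to_date(Date)), by = "Location") %>%
  group_by(Location, Start, End) %>%
  summarise(
    # Sum only the records inside the window; an empty selection sums to 0
    Sum.Individuals = sum(Individuals[!is.na(Date) & Date >= Start & Date <= End]),
    .groups = "drop"
  ) %>%
  arrange(Start, Location)

print(Final)
```

Putting the condition inside `sum()` rather than in a separate `filter()` step is what lets surveys without any matching record survive with a total of 0.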
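Hope that helps. If you find the code confusing, you may want to read up on the concepts of joining and filtering, and on the dplyr package, which is well suited to these tasks.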