Summarising data in one data frame based on values in another data frame

Time: 2017-07-31 16:46:30

Tags: r date join summary

I have two data frames: one holds the details of survey dates, and the other holds records of individual observations. They look like this:

Records <- data.frame("Location"=c("A","A","B","C","C","C","D"),
                      "Date"= c("09/01/2017","12/01/2017","20/01/2017","06/06/2017","03/06/2017","19/01/2017","02/01/2017"),
                      "Individuals"= c(3,2,6,4,0,1,6))
Surveys <- data.frame("Location"=c("A","B","C","D","A","B","C","D"),
                     "Start"= c(rep("01/01/2017",length=4),rep("01/06/2017",length=4)),
                      "End"= c(rep("01/02/2017",length=4),rep("01/07/2017",length=4)))

> Surveys
  Location      Start        End
1        A 01/01/2017 01/02/2017
2        B 01/01/2017 01/02/2017
3        C 01/01/2017 01/02/2017
4        D 01/01/2017 01/02/2017
5        A 01/06/2017 01/07/2017
6        B 01/06/2017 01/07/2017
7        C 01/06/2017 01/07/2017
8        D 01/06/2017 01/07/2017
> Records
  Location       Date Individuals
1        A 09/01/2017           3
2        A 12/01/2017           2
3        B 20/01/2017           6
4        C 06/06/2017           4
5        C 03/06/2017           0
6        C 19/01/2017           1
7        D 02/01/2017           6

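I would like to add a column to the Surveys data frame that sums the number of individuals recorded at that location within the relevant time period. The result would look like this: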

Total.Individuals <- c(5,6,1,6,0,0,4,0)
Final <- cbind(Surveys,Total.Individuals)

> Final
  Location      Start        End Total.Individuals
1        A 01/01/2017 01/02/2017                 5
2        B 01/01/2017 01/02/2017                 6
3        C 01/01/2017 01/02/2017                 1
4        D 01/01/2017 01/02/2017                 6
5        A 01/06/2017 01/07/2017                 0
6        B 01/06/2017 01/07/2017                 0
7        C 01/06/2017 01/07/2017                 4
8        D 01/06/2017 01/07/2017                 0

I hope this makes sense; any help would be appreciated.

Cheers

1 Answer:

Answer 0: (score: 0)

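I would suggest the following steps:

  1. Join the two tables on Location
  2. Filter to the rows where Date falls between Start and End
  3. Group by Location, Start and End, then sum Individuals

So it might look something like: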

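    library(tidyverse)
    library(magrittr)
    # Parse the dd/mm/yyyy strings so the comparison is on dates, not text
    to_date <- function(x) as.Date(as.character(x), format = "%d/%m/%Y")
    # Note: an inner join drops surveys that have no matching records
    df <- inner_join(Surveys, Records, by = "Location")
    df %<>% filter(to_date(Date) >= to_date(Start), to_date(Date) <= to_date(End))
    df %<>% group_by(Location, Start, End) %>% summarise(totalindividuals = sum(Individuals))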
    
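Hope that helps. If you find the code confusing, you may want to read up on the concepts of joining and filtering, and on the dplyr package, which is well suited to these tasks.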
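One thing to watch for: a join followed by a filter only returns surveys that have at least one matching record, whereas the desired output in the question also lists surveys with a total of 0. As a rough sketch of one way around that (not the join approach described above, but a row-by-row lookup in base R), you could compute the total for each survey directly. The helper name `to_date` is just an illustrative choice, and the data is rebuilt here so the example runs on its own.

```r
# Rebuild the question's data as plain character columns
Records <- data.frame(
  Location = c("A", "A", "B", "C", "C", "C", "D"),
  Date = c("09/01/2017", "12/01/2017", "20/01/2017", "06/06/2017",
           "03/06/2017", "19/01/2017", "02/01/2017"),
  Individuals = c(3, 2, 6, 4, 0, 1, 6),
  stringsAsFactors = FALSE
)
Surveys <- data.frame(
  Location = c("A", "B", "C", "D", "A", "B", "C", "D"),
  Start = c(rep("01/01/2017", 4), rep("01/06/2017", 4)),
  End = c(rep("01/02/2017", 4), rep("01/07/2017", 4)),
  stringsAsFactors = FALSE
)

# Hypothetical helper: parse dd/mm/yyyy strings into Date objects
to_date <- function(x) as.Date(x, format = "%d/%m/%Y")

record_dates <- to_date(Records$Date)

# For each survey row, sum Individuals over the records at the same
# Location whose Date lies within [Start, End]; no matches gives 0
Surveys$Total.Individuals <- mapply(
  function(loc, start, end) {
    hit <- Records$Location == loc &
      record_dates >= to_date(start) &
      record_dates <= to_date(end)
    sum(Records$Individuals[hit])
  },
  Surveys$Location, Surveys$Start, Surveys$End,
  USE.NAMES = FALSE
)

print(Surveys)
```

With the sample data this should keep all eight survey rows, including those with no records. If you prefer to stay with dplyr, replacing the inner join with a left join and treating unmatched rows as 0 would be another option.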