Adding date-stamped observations by group in R

Time: 2018-05-25 12:27:50

Tags: r dplyr

I have a data frame that looks like this:

> head(dsidata3)
# A tibble: 6 x 28
  Date      `Day of week` Holiday Name   `Time entered` Work  Travel Exercise Sleep
  <chr>     <chr>         <chr>   <chr>  <time>         <chr> <chr>  <chr>    <dbl>
1 28/3/2018 Wednesday     NA      Dave   21:10          6.0   0.4    -         7.00
2 28/3/2018 Wednesday     NA      Mercu… 22:00          8.0   1.5    -         6.00
3 28/3/2018 Wednesday     NA      Mars   23:56          11.0  1.0    -         4.00
4 28/3/2018 Wednesday     NA      Venus  22:35          8.5   4.0    -         7.50
5 29/3/2018 Thursday      NA      Dave   22:00          -     -      -         6.00
6 29/3/2018 Thursday      NA      Mercu…    NA          8.5   0.8    1.0      10.0

For each date there are four observations (one per $Name: 'Dave', 'Mars', etc.).

I also have a separate data frame that looks like this:

> head(windspeeds)
# A tibble: 6 x 2
  Date       `km/h`
  <chr>       <int>
1 28/03/2018      2
2 29/03/2018      1
3 30/03/2018      0
4 31/03/2018      2
5 1/04/2018       1
6 2/04/2018       7

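I want to add the wind speed data to my first data frame, but that data frame has four observations per date, while the wind speed data frame has only one observation per date.

I'm sure this has something to do with nesting and apply, but I can't figure it out. Any help would be greatly appreciated!

As requested, here are all the observations for these variables: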

> dput(dsidata3$Date)
c("28/3/2018", "28/3/2018", "28/3/2018", "28/3/2018", "29/3/2018", 
"29/3/2018", "29/3/2018", "29/3/2018", "30/3/2018", "30/3/2018", 
"30/3/2018", "30/3/2018", "31/3/2018", "31/3/2018", "31/3/2018", 
"31/3/2018", "1/4/2018", "1/4/2018", "1/4/2018", "1/4/2018", 
"2/4/2018", "2/4/2018", "2/4/2018", "2/4/2018", "3/4/2018", "3/4/2018", 
"3/4/2018", "3/4/2018", "4/4/2018", "4/4/2018", "4/4/2018", "4/4/2018", 
"5/4/2018", "5/4/2018", "5/4/2018", "5/4/2018", "6/4/2018", "6/4/2018", 
"6/4/2018", "6/4/2018", "7/4/2018", "7/4/2018", "7/4/2018", "7/4/2018", 
"8/4/2018", "8/4/2018", "8/4/2018", "8/4/2018", "9/4/2018", "9/4/2018", 
"9/4/2018", "9/4/2018", "10/4/2018", "10/4/2018", "10/4/2018", 
"10/4/2018", "11/4/2018", "11/4/2018", "11/4/2018", "11/4/2018", 
"12/4/2018", "12/4/2018", "12/4/2018", "12/4/2018", "13/4/2018", 
"13/4/2018", "13/4/2018", "13/4/2018", "14/4/2018", "14/4/2018", 
"14/4/2018", "14/4/2018", "15/4/2018", "15/4/2018", "15/4/2018", 
"15/4/2018", "16/4/2018", "16/4/2018", "16/4/2018", "16/4/2018", 
"17/4/2018", "17/4/2018", "17/4/2018", "17/4/2018", "18/4/2018", 
"18/4/2018", "18/4/2018", "18/4/2018", "19/4/2018", "19/4/2018", 
"19/4/2018", "19/4/2018", "20/4/2018", "20/4/2018", "20/4/2018", 
"20/4/2018", "21/4/2018", "21/4/2018", "21/4/2018", "21/4/2018", 
"22/4/2018", "22/4/2018", "22/4/2018", "22/4/2018", "23/4/2018", 
"23/4/2018", "23/4/2018", "23/4/2018", "24/4/2018", "24/4/2018", 
"24/4/2018", "24/4/2018", "25/4/2018", "25/4/2018", "25/4/2018", 
"25/4/2018", "26/4/2018", "26/4/2018", "26/4/2018", "26/4/2018", 
"27/4/2018", "27/4/2018", "27/4/2018", "27/4/2018", "28/4/2018", 
"28/4/2018", "28/4/2018", "28/4/2018", "29/4/2018", "29/4/2018", 
"29/4/2018", "29/4/2018", "30/4/2018", "30/4/2018", "30/4/2018", 
"30/4/2018", "1/5/2018", "1/5/2018", "1/5/2018", "1/5/2018", 
"2/5/2018", "2/5/2018", "2/5/2018", "2/5/2018", "3/5/2018", "3/5/2018", 
"3/5/2018", "3/5/2018", "4/5/2018", "4/5/2018", "4/5/2018", "4/5/2018", 
"5/5/2018", "5/5/2018", "5/5/2018", "5/5/2018", "6/5/2018", "6/5/2018", 
"6/5/2018", "6/5/2018", "7/5/2018", "7/5/2018", "7/5/2018", "7/5/2018", 
"8/5/2018", "8/5/2018", "8/5/2018", "8/5/2018")

Wind speeds:

> dput(windspeeds)
structure(list(Date = c("28/03/2018", "29/03/2018", "30/03/2018", 
"31/03/2018", "1/04/2018", "2/04/2018", "3/04/2018", "4/04/2018", 
"5/04/2018", "6/04/2018", "7/04/2018", "8/04/2018", "9/04/2018", 
"10/04/2018", "11/04/2018", "12/04/2018", "13/04/2018", "14/04/2018", 
"15/04/2018", "16/04/2018", "17/04/2018", "18/04/2018", "19/04/2018", 
"20/04/2018", "21/04/2018", "22/04/2018", "23/04/2018", "24/04/2018", 
"25/04/2018", "26/04/2018", "27/04/2018", "28/04/2018", "29/04/2018", 
"30/04/2018", "1/05/2018", "2/05/2018", "3/05/2018", "4/05/2018", 
"5/05/2018", "6/05/2018", "7/05/2018", "8/05/2018"), `km/h` = c(2L, 
1L, 0L, 2L, 1L, 7L, 7L, 6L, 1L, 7L, 5L, 5L, 1L, 5L, 0L, 0L, 1L, 
3L, 6L, 1L, 6L, 6L, 6L, 3L, 3L, 1L, 1L, 1L, 7L, 7L, 5L, 7L, 3L, 
4L, 2L, 7L, 1L, 5L, 0L, 0L, 0L, 7L)), .Names = c("Date", "km/h"
), row.names = c(NA, -42L), class = c("tbl_df", "tbl", "data.frame"))

2 answers:

Answer 0: (score: 1)

Take the following as input:

x1 <- 'A B
1 x
1 y
1 z
2 r
2 t
2 5'

x2 <- 'A D
1 x1
2 r1'

df1 <- read.table(text = x1, sep =" ", header = TRUE, stringsAsFactors = FALSE)
df2 <- read.table(text = x2, sep =" ", header = TRUE, stringsAsFactors = FALSE)

You can then try the tidyverse function left_join, like this:

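library(dplyr)  # provides %>% and left_join()

df1 %>%
  left_join(df2, by = "A")  # each row of df1 picks up the matching D from df2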

which gives:

  A B  D
1 1 x x1
2 1 y x1
3 1 z x1
4 2 r r1
5 2 t r1
6 2 5 r1
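
Applied to the data in the question, note that the two Date columns are formatted differently ("28/3/2018" in dsidata3, "28/03/2018" in windspeeds), so a join on the raw strings would probably not match any rows. Below is a minimal sketch, assuming both columns are day/month/year and using small stand-in data frames built from the first two dates shown in the question (the real data frames have more rows and columns):

```r
library(dplyr)

# Stand-ins for the question's data (assumption: only two dates, two names)
dsidata3 <- tibble(
  Date = c("28/3/2018", "28/3/2018", "29/3/2018", "29/3/2018"),
  Name = c("Dave", "Mars", "Dave", "Mars")
)
windspeeds <- tibble(
  Date   = c("28/03/2018", "29/03/2018"),
  `km/h` = c(2L, 1L)
)

# Parse both Date columns so "28/3/2018" and "28/03/2018" compare as equal
dsidata3   <- dsidata3   %>% mutate(Date = as.Date(Date, format = "%d/%m/%Y"))
windspeeds <- windspeeds %>% mutate(Date = as.Date(Date, format = "%d/%m/%Y"))

# One wind speed per date is copied onto every row with that date
joined <- left_join(dsidata3, windspeeds, by = "Date")
```

Because the join matches on the date itself, it does not matter how many rows each date has or in what order they appear.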

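Answer 1: (score: 0)

You can use the function rep with the argument each = 4 to repeat each wind speed entry four times, then add the result to the data frame.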

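# Extract the wind speeds as a plain vector (a tibble column, not a one-column tibble)
temp <- windspeeds[["km/h"]]
# Repeat each value four times; assumes exactly four rows per date, in the same date order
dsidata3[["ws"]] <- rep(temp, each = 4)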
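
The rep approach depends on every date having exactly four rows, in the same order as windspeeds. If that is not guaranteed, a lookup with base R's match() is one alternative. This is only a sketch under assumptions: the stand-in data frames below are hypothetical cut-down versions of the question's data, and the dates are assumed to be day/month/year.

```r
# Stand-ins for the question's data (assumption: uneven number of rows per date)
dsidata3 <- data.frame(
  Date = c("28/3/2018", "28/3/2018", "29/3/2018"),
  stringsAsFactors = FALSE
)
windspeeds <- data.frame(
  Date   = c("28/03/2018", "29/03/2018"),
  `km/h` = c(2L, 1L),
  check.names = FALSE,        # keep the column name "km/h" as-is
  stringsAsFactors = FALSE
)

# Parse the dates so the differently formatted strings line up
key_main <- as.Date(dsidata3$Date,   format = "%d/%m/%Y")
key_wind <- as.Date(windspeeds$Date, format = "%d/%m/%Y")

# For each row of dsidata3, find the row of windspeeds with the same date
dsidata3$ws <- windspeeds[["km/h"]][match(key_main, key_wind)]
```

Dates in dsidata3 that have no entry in windspeeds end up with NA in ws rather than causing a length mismatch.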