Subtract rows in R based on matching values

Time: 2020-06-26 08:21:14

Tags: r

I am trying to subtract two rows of a dataset from each other:

Name Period    Time     Distance  Load
Tim    A     01:06:20    6000     680
Max    A     01:06:20    5000     600
Leo    A     01:06:20    5500     640
Noa    A     01:06:20    6500     700
Tim    B     00:04:10    500      80
Max    B     00:04:10    500      50
Leo    B     00:04:10    400      40

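I want to subtract the Time, Distance and Load values of Period B from those of Period A, matched by Name. For example, subtract row 5 (Tim, Period B) from row 1 (Tim, Period A). The new values should be written to a new table, like this: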

Name Period    Time     Distance  Load
Tim    C     01:02:10    5500     600
Max    C     01:02:10    4500     550
Leo    C     01:02:10    5100     600
Noa    C     01:06:20    6500     700

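The actual dataset contains many more rows. I tried playing around with dplyr but could not get the result I want.

Thanks in advance

5 answers:

Answer 0: (score: 1)

You can filter the two periods and then join them together, which makes it easy to subtract the columns.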

library(dplyr)

inner_join(filter(df, Period=="A"), filter(df, Period=="B"), by="Name") %>%
  mutate(Period="C",
         Time=Time.x-Time.y,
         Distance=Distance.x-Distance.y,
         Load=Load.x-Load.y) %>%
  select(Name, Period, Time, Distance, Load)

  Name Period           Time Distance Load
1  Tim      C 1.036111 hours     5500  600
2  Max      C 1.036111 hours     4500  550
3  Leo      C 1.036111 hours     5100  600
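
Note that inner_join() keeps only names present in both periods, so Noa is missing from the output above, while the expected table in the question keeps Noa unchanged. A sketch of a left_join() variant follows. The helper to_seconds(), the Secs column and the *_B column names are illustrative assumptions, and Time is handled as plain seconds because the question does not say how it is stored.

```r
library(dplyr)

# Sample data from the question, with Time kept as character strings
df <- data.frame(
  Name     = c("Tim", "Max", "Leo", "Noa", "Tim", "Max", "Leo"),
  Period   = c("A", "A", "A", "A", "B", "B", "B"),
  Time     = c("01:06:20", "01:06:20", "01:06:20", "01:06:20",
               "00:04:10", "00:04:10", "00:04:10"),
  Distance = c(6000, 5000, 5500, 6500, 500, 500, 400),
  Load     = c(680, 600, 640, 700, 80, 50, 40)
)

# Hypothetical helper: convert "HH:MM:SS" strings to seconds
to_seconds <- function(x) {
  as.numeric(as.difftime(x, format = "%H:%M:%S", units = "secs"))
}

period_a <- df %>%
  filter(Period == "A") %>%
  mutate(Secs = to_seconds(Time))

period_b <- df %>%
  filter(Period == "B") %>%
  transmute(Name, Secs_B = to_seconds(Time),
            Distance_B = Distance, Load_B = Load)

# left_join() keeps every Period A row; a missing Period B value counts as 0
result <- left_join(period_a, period_b, by = "Name") %>%
  mutate(Period   = "C",
         Secs     = Secs - coalesce(Secs_B, 0),
         Distance = Distance - coalesce(Distance_B, 0),
         Load     = Load - coalesce(Load_B, 0)) %>%
  select(Name, Period, Secs, Distance, Load)

result
```

Treating a missing Period B row as zero is a design choice; returning NA instead may be preferable if a missing period should be flagged rather than silently passed through.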

Answer 1: (score: 1)

Essentially the same as @Edward's answer. You can use dplyr and tidyr:

library(dplyr)
library(tidyr)

df %>%
  pivot_wider(names_from="Period", values_from=c("Time", "Distance", "Load")) %>%
  mutate(Period = "C",
         Time = coalesce(Time_A - Time_B, Time_A),
         Distance = coalesce(Distance_A - Distance_B, Distance_A),
         Load = coalesce(Load_A - Load_B, Load_A)
         ) %>%
  select(-matches("_\\w"))

which returns

# A tibble: 4 x 5
  Name  Period Time     Distance  Load
  <chr> <chr>  <time>      <dbl> <dbl>
1 Tim   C      01:02:10     5500   600
2 Max   C      01:02:10     4500   550
3 Leo   C      01:02:10     5100   600
4 Noa   C      01:06:20     6500   700
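
To see where column names such as Distance_A and Load_B come from, it helps to look at the intermediate wide table. Below is a reduced sketch with only two value columns; the object names small and wide are made up for this illustration.

```r
library(tidyr)

small <- data.frame(
  Name     = c("Tim", "Noa", "Tim"),
  Period   = c("A", "A", "B"),
  Distance = c(6000, 6500, 500),
  Load     = c(680, 700, 80)
)

# One row per Name; each value column is split into one column per Period
wide <- pivot_wider(small, names_from = "Period",
                    values_from = c("Distance", "Load"))

names(wide)   # "Name" "Distance_A" "Distance_B" "Load_A" "Load_B"
wide          # Noa has NA in Distance_B and Load_B
```

Because Noa has no Period B row, the *_B columns are NA for Noa, which is why the answer wraps each subtraction in coalesce() with the Period A value as a fallback.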

Data

library(readr)

# read_table2() is deprecated since readr 2.0.0; read_table() is its replacement
df <- read_table2("Name Period    Time     Distance  Load
Tim    A     01:06:20    6000     680
Max    A     01:06:20    5000     600
Leo    A     01:06:20    5500     640
Noa    A     01:06:20    6500     700
Tim    B     00:04:10    500      80
Max    B     00:04:10    500      50
Leo    B     00:04:10    400      40")

Answer 2: (score: 1)

Here is one way that groups by Name to compute the differences.

library(dplyr)
library(chron)

df <- structure(list(Name = structure(c(4L, 2L, 1L, 3L, 4L, 2L, 1L), .Label = c("Leo", "Max", "Noa", "Tim"), class = "factor"), 
                     Period = structure(c(1L,1L, 1L, 1L, 2L, 2L, 2L), .Label = c("A", "B"), class = "factor"), 
                     Time = structure(c(2L, 2L, 2L, 2L, 1L, 1L, 1L), .Label = c("0:04:10", "1:06:20"), class = "factor"), 
                     Distance = c(6000L, 5000L, 5500L, 6500L, 500L, 500L, 400L), 
                     Load = c(680L, 600L, 640L, 700L, 80L, 50L, 40L)), class = "data.frame", row.names = c(NA, -7L))

df %>% 
  mutate(Time = times(Time)) %>% 
  group_by(Name) %>% 
  mutate(Time = lag(Time) - Time,
         Distance = lag(Distance) - Distance,
         Load = lag(Load) - Load,
         Period = LETTERS[which(LETTERS == Period) + 1]) %>% 
  filter(!is.na(Time))
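
This approach relies on the Period A row coming before the Period B row within each Name. lag() then returns the Period A value on the Period B row and NA on the first row, so filter(!is.na(Time)) drops the first row of every group, including names that have only one period, such as Noa. A small sketch of the lag() step on Tim's distances alone (the vector name distance is just for illustration):

```r
library(dplyr)

distance <- c(6000, 500)              # Tim: Period A, then Period B

dplyr::lag(distance)                  # NA 6000
dplyr::lag(distance) - distance       # NA 5500
```

If the rows are not already ordered by Period, sorting with arrange(Name, Period) before grouping would be needed for the subtraction to go in the intended direction.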

Answer 3: (score: 1)

There are already plenty of answers, so at this stage this is just a bit of fun. I think this one is nice because it uses unnest_wider().

library(dplyr)
library(tidyr)
library(purrr)

diff <- function(data) {
        if(apply(data[2, -1], 1, function(x) all(is.na(x)))) {
                data[1, -1]
        } else {
                data[1, -1] - data[2, -1]
        }
}

df %>% group_by(Name) %>% nest() %>%
        mutate(diff = map(data, diff)) %>% unnest_wider(diff) %>%
        mutate(Period = "C") %>% select(Period, Time, Distance, Load)

Apart from the diff() function (which could be made tidier and more tidyverse-exclusive), this approach is also shorter.

# A tibble: 4 x 5
  Name  Period Time     Distance  Load
  <chr> <chr>  <time>      <dbl> <dbl>
1 Tim   C      01:02:10     5500   600
2 Max   C      01:02:10     4500   550
3 Leo   C      01:02:10     5100   600
4 Noa   C      01:06:20     6500   700

Answer 4: (score: 0)

You can also use data.table.

library(data.table)

dt <- data.table(Name = c('Tim', 'Max', 'Leo', 'Noa', 'Tim', 'Max', 'Leo'),
             Period = c('A', 'A', 'A', 'A', 'B', 'B', 'B'), 
             Time = c('01:06:20', '01:06:20' , '01:06:20' , '01:06:20' , '00:04:10' , '00:04:10' , '00:04:10' ),
             Distance = c(6000, 5000, 5500, 6500, 500, 500, 400 ),
             Load = c(680, 600, 640, 700, 80, 50, 40))

The first thing to do is convert the Time variable:

dt[, Time := as.POSIXct(Time, format = "%H:%M:%S")]
sapply(dt, class)
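
A note on this conversion: as.POSIXct() with a time-only format attaches the current date to every value, and that date cancels out when two values are subtracted. The result is a difftime whose unit R chooses automatically. A small sketch with the two times from the question; the tz = "UTC" argument is an addition here to avoid daylight-saving surprises, and the variable names are illustrative.

```r
time_a <- as.POSIXct("01:06:20", format = "%H:%M:%S", tz = "UTC")
time_b <- as.POSIXct("00:04:10", format = "%H:%M:%S", tz = "UTC")

elapsed <- time_a - time_b
units(elapsed)                        # "hours", chosen automatically
as.numeric(elapsed, units = "secs")   # 3730 seconds, i.e. 01:02:10
```

Asking for an explicit unit with as.numeric(..., units = "secs") avoids depending on whichever unit was picked automatically.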

Then use dcast.data.table:

dtCast <- dcast.data.table(dt, Name ~ Period, value.var = c('Time', 'Distance', 'Load'))

Then create a new object:

dtFinal <- dtCast[,list(Period = 'C',
                        Time = Time_A - Time_B,
                        Distance = Distance_A - Distance_B,
                        Load = Load_A - Load_B),
                  by = 'Name']

Note that if you want the time in the same format as above, you need to do the following:

library(hms)
dtFinal[, Time := as_hms(Time)]
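
If adding the hms package is not an option, the same display format can be produced in base R. This is a sketch, assuming the difference is available as a number of seconds; fmt_hms() is a hypothetical helper name, not part of any package, and it returns character strings rather than a time class.

```r
# Hypothetical helper: format a number of seconds as "HH:MM:SS"
fmt_hms <- function(secs) {
  secs <- as.integer(round(secs))
  sprintf("%02d:%02d:%02d",
          secs %/% 3600L,             # whole hours
          (secs %% 3600L) %/% 60L,    # remaining whole minutes
          secs %% 60L)                # remaining seconds
}

fmt_hms(c(3730, 3980))   # "01:02:10" "01:06:20"
```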