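Lubridate: overlap of intervals by group

Time: 2018-12-03 23:43:14

Tags: intervals lubridate overlapping

Hello, and thank you very much in advance!

I am trying to identify which intervals overlap with other intervals in the same group.

For example, suppose we have the following data: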

    library(dplyr)
    library(lubridate)

    id <- rep(1:3, each = 3)
    hospitalization <- seq(ymd_hms("2017-11-28 00:00:01"), by = "day", length.out = length(id))
    dat <- data.frame(id, hospitalization)
    dat[3, 2] <- dat[3, 2] + dhours(12)

    dat %>%
      mutate(
        discharge = hospitalization + dhours(35),
        interval = hospitalization %--% discharge
      ) -> dat

    > dat
      id     hospitalization           discharge                                         interval
    1  1 2017-11-28 00:00:01 2017-11-29 11:00:01 2017-11-28 00:00:01 UTC--2017-11-29 11:00:01 UTC
    2  1 2017-11-29 00:00:01 2017-11-30 11:00:01 2017-11-29 00:00:01 UTC--2017-11-30 11:00:01 UTC
    3  1 2017-11-30 12:00:01 2017-12-01 23:00:01 2017-11-30 12:00:01 UTC--2017-12-01 23:00:01 UTC
    4  2 2017-12-01 00:00:01 2017-12-02 11:00:01 2017-12-01 00:00:01 UTC--2017-12-02 11:00:01 UTC
    5  2 2017-12-02 00:00:01 2017-12-03 11:00:01 2017-12-02 00:00:01 UTC--2017-12-03 11:00:01 UTC
    6  2 2017-12-03 00:00:01 2017-12-04 11:00:01 2017-12-03 00:00:01 UTC--2017-12-04 11:00:01 UTC
    7  3 2017-12-04 00:00:01 2017-12-05 11:00:01 2017-12-04 00:00:01 UTC--2017-12-05 11:00:01 UTC
    8  3 2017-12-05 00:00:01 2017-12-06 11:00:01 2017-12-05 00:00:01 UTC--2017-12-06 11:00:01 UTC
    9  3 2017-12-06 00:00:01 2017-12-07 11:00:01 2017-12-06 00:00:01 UTC--2017-12-07 11:00:01 UTC

    dat[1,4]
    dat[2,4]
    dat[3,4]
    int_overlaps(dat[1,4],dat[2,4])
    int_overlaps(dat[2,4],dat[3,4])
    int_overlaps(dat[1,4],dat[3,4])

I would like to compute a Boolean column (overlap_any) that indicates whether a period overlaps with at least one other period in the same group.

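When grouping by id, for id==1 the first and second intervals overlap, but the third overlaps neither of them. So for id==1, overlap_any should be (True, True, False).

I was thinking of something like:

    dat %>%
      group_by(id) %>%
      mutate(overlap_any = some_function(interval))

But I don't know how to do it, because some_function receives all the intervals of a group rather than the current row that I want to test against the remaining rows. Also, int_overlaps only accepts two arguments.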

Thank you for your help!

1 answer:

Answer 0: (score: 0)

I got it working like this:

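    # For each interval in y, count how many *other* intervals in y it overlaps.
    # Every interval overlaps itself, hence the "- 1".
    overlaps_others <- function(y) sapply(y, function(x) sum(int_overlaps(x, y))) - 1

    dat %>%
      split(.$id) %>%            # one data frame per id (column, not the global `id`)
      lapply(function(z) {
        z %>%
          mutate(overlaps = overlaps_others(interval)) %>%   # overlaps > 0 gives the Boolean
          select(-interval)
      }) %>%
      bind_rows()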
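
For anyone who would rather stay with the group_by/mutate shape from the question, here is a minimal sketch of one possible alternative. It is not the approach above: instead of int_overlaps it compares the start and end columns directly, treating two periods as overlapping when each one starts no later than the other ends (endpoints inclusive, which is an assumption on my part). The helper name overlaps_any_other is made up for illustration and is not part of lubridate or dplyr.

```r
library(dplyr)
library(lubridate)

# Rebuild the example data from the question.
id <- rep(1:3, each = 3)
hospitalization <- seq(ymd_hms("2017-11-28 00:00:01"), by = "day", length.out = length(id))
dat <- data.frame(id, hospitalization)
dat[3, 2] <- dat[3, 2] + dhours(12)
dat <- dat %>% mutate(discharge = hospitalization + dhours(35))

# Hypothetical helper (not a lubridate function): for each period, TRUE when it
# overlaps at least one *other* period in the same vector.
# Two periods overlap when each starts no later than the other ends.
overlaps_any_other <- function(start, end) {
  vapply(seq_along(start), function(i) {
    # Compare period i against every period except itself.
    any(start[i] <= end[-i] & start[-i] <= end[i])
  }, logical(1))
}

result <- dat %>%
  group_by(id) %>%
  mutate(overlap_any = overlaps_any_other(hospitalization, discharge)) %>%
  ungroup()

result
```

On the example data this gives (TRUE, TRUE, FALSE) for id 1, as asked for in the question, and TRUE for every row of ids 2 and 3. A group with a single row yields FALSE, since there is nothing else to overlap with.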