R: group by date to get frequencies, using another column as a filter

Time: 2019-01-21 20:23:55

Tags: r plyr

I have the following R data frame. I want to get the frequency by date, but use the Min column so that the frequency is kept at 0 whenever Min is 0. How can I do this?

Below is my data frame:

df
  Location   Date            Min six endsix seven seventeen starteighteen eighteen
1 location_1 2018-11-21       0 360    415   420      1020          1025     1080
2 location_1 2018-11-22       0 360    415   420      1020          1025     1080
3 location_1 2018-11-23     131 360    415   420      1020          1025     1080
4 location_1 2018-11-24       0 360    415   420      1020          1025     1080
5 location_1 2018-11-25    1001 360    415   420      1020          1025     1080
6 location_1 2018-11-25     272 360    415   420      1020          1025     1080
7 location_1 2018-11-25    1319 360    415   420      1020          1025     1080

If I run the following:

library(plyr)
count(location_1, "Date")

I get the output below. I would like the same result, except that the frequency should be 0 for any date where the Min column has a value of 0:

   Date          freq
1  2018-11-21    1
2  2018-11-22    1
3  2018-11-23    1
4  2018-11-24    1
5  2018-11-25    5

1 answer:

Answer 0 (score: 3)

Using data.table:

# set seed for reproducibility
set.seed(1)

# data frame
df <- data.frame(Date = sample(seq(as.Date("2019-01-01"), as.Date("2019-01-09"), by = "days"), 30, replace = TRUE),
           Min = sample(c(0:5), 30, replace = TRUE), stringsAsFactors = FALSE)

# load packages
library(magrittr)
library(data.table)

# make df into data.table
setDT(df)

# establish which Date values have Min = 0
minVals <- df[Min == 0, unique(Date)]

# Count date and set those rows with Date Min = 0 to 0
res <- df[, .N, by = 'Date'][
  Date %in% minVals, N := 0
  ]

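Result:

> res
         Date N
1: 2019-01-03 0
2: 2019-01-04 0
3: 2019-01-06 0
4: 2019-01-09 5
5: 2019-01-02 5
6: 2019-01-01 2
7: 2019-01-07 0
8: 2019-01-05 1
9: 2019-01-08 1

It would be great if you posted a snippet of your data in a form we can actually test when trying to provide an answer. Try dput(head(df, 10)): it prints output to the console, and that output is a piece of code that rebuilds your actual data snippet.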
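To illustrate the dput() suggestion, here is a minimal sketch. The three-row data frame is a hypothetical sample that mirrors only the Location, Date and Min columns from the question, and the name rebuilt is an assumption introduced for this example.

```r
# Hypothetical sample mirroring the first rows of the question's data
df <- data.frame(
  Location = "location_1",
  Date = as.Date(c("2018-11-21", "2018-11-22", "2018-11-23")),
  Min = c(0, 0, 131),
  stringsAsFactors = FALSE
)

# dput() prints R code that rebuilds the object; paste that code into a question
dput(head(df, 10))

# Round trip: capture the printed code and evaluate it to get the data back
rebuilt <- eval(parse(text = capture.output(dput(df))))
```

Anyone answering can run the printed code to get the same column types, including the Date class, which a plain printout of the data frame does not preserve.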

A dplyr solution:

library(dplyr)

count(df, Date) %>% 
  mutate(n = ifelse(Date %in% pull(filter(df, Min == 0), Date), 0, n))
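As a rough check, the same dplyr approach can be applied to the seven rows shown in the question. This is only a sketch: the data frame below retypes just the Date and Min columns, and zero_dates and res are names assumed for the example.

```r
library(dplyr)

# Date and Min columns retyped from the question's sample rows
df <- data.frame(
  Date = as.Date(c("2018-11-21", "2018-11-22", "2018-11-23", "2018-11-24",
                   "2018-11-25", "2018-11-25", "2018-11-25")),
  Min = c(0, 0, 131, 0, 1001, 272, 1319)
)

# Dates that have at least one row with Min equal to 0
zero_dates <- df %>% filter(Min == 0) %>% pull(Date)

# Count rows per date, then force the count to 0 for those dates
res <- df %>%
  count(Date) %>%
  mutate(n = ifelse(Date %in% zero_dates, 0L, n))

print(res)
# n is 0, 0, 1, 0, 3 for 2018-11-21 through 2018-11-25
```

Note that any date with at least one Min == 0 row gets a count of 0, even if other rows for that date have non-zero Min values. That matches the behavior of both answers above.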