I have a daily dataset that looks like this:
date CMA0013 CMA0047 CMA0052 CMA0067
1975-10-01 0 0.012 0.078 0
1975-10-02 0 0.012 0.078 0
1975-10-03 0 0.012 0.078 0
1975-10-04 0 0.012 0.078 0
1975-10-05 0 0.012 0.078 0
1975-10-06 0 0.012 0.078 0
...
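In R, I would like to count (aggregate), by month and year, how many records in each column satisfy the condition < 0.001. Say, to get something like this: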
month year CMA0013 CMA0047 CMA0052 CMA0067
10 1975 6 0 0 6
11 1975 ...
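I have tried different options of the aggregate and ddply functions but, since my knowledge of them is not very deep yet, I could not get a satisfactory solution. Thanks to everyone for any help.
What did not work with ddply: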
df$year <- year(df$date)
df$month <- month(df$date)
df2 <- ddply(df,~year+month,summarise,
count = length(df[,df$CMA0010 < 0.001]))
It does not sum correctly, and it only handles one column (CMA0010).
Answer 0 (score: 1)
Here is one way...
library(lubridate) #to extract the year and month
df$year <- year(df$date)
df$month <- month(df$date)
df2 <- aggregate(df[, grep("CMA", names(df))], #just summarise columns starting "CMA"
by = list(year=df$year, month=df$month),
function(x) sum(x<0.001))
df2
year month CMA0013 CMA0047 CMA0052 CMA0067
1 1975 10 6 0 0 6
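For anyone who wants to try this without installing lubridate, here is a minimal, self-contained sketch of the same aggregate approach. It rebuilds the six sample rows from the question and derives year and month with base R's format(). The names cma_cols and counts are only illustrative, and na.rm = TRUE is an assumption about how missing values should be handled (they are simply not counted).

```r
# Sample data copied from the question (six days in October 1975).
df <- data.frame(
  date    = as.Date("1975-10-01") + 0:5,
  CMA0013 = 0,
  CMA0047 = 0.012,
  CMA0052 = 0.078,
  CMA0067 = 0
)

# Derive the grouping keys with base R only (no lubridate needed).
df$year  <- as.integer(format(df$date, "%Y"))
df$month <- as.integer(format(df$date, "%m"))

# Pick every column whose name starts with "CMA".
cma_cols <- grep("^CMA", names(df), value = TRUE)

# Count, per year and month, the values below the threshold.
# na.rm = TRUE is an assumption: NA values are not counted.
counts <- aggregate(
  df[cma_cols],
  by  = list(year = df$year, month = df$month),
  FUN = function(x) sum(x < 0.001, na.rm = TRUE)
)
print(counts)
```

On the six sample rows this should reproduce the single 1975/10 row shown above (6, 0, 0, 6).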
Answer 1 (score: 0)
Try using the lubridate package with dplyr:
sum_df <- daily %>%
mutate(month = lubridate::month(date),
year= lubridate::year(date)) %>%
group_by(year, month) %>%
summarise(CMA0013 = sum(CMA0013 < 0.001),
#The rest of your sums...
)
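If writing one sum per column gets tedious, a possible variant is to let dplyr apply the same count to every CMA column with across(). This is only a sketch: it assumes dplyr 1.0.0 or later (for across() and the .groups argument), and the daily data frame below is rebuilt from the sample rows in the question.

```r
library(dplyr)
library(lubridate)

# Sample data copied from the question (six days in October 1975).
daily <- data.frame(
  date    = as.Date("1975-10-01") + 0:5,
  CMA0013 = 0,
  CMA0047 = 0.012,
  CMA0052 = 0.078,
  CMA0067 = 0
)

sum_df <- daily %>%
  mutate(month = lubridate::month(date),
         year  = lubridate::year(date)) %>%
  group_by(year, month) %>%
  # Apply the same threshold count to every column starting with "CMA".
  summarise(across(starts_with("CMA"), ~ sum(.x < 0.001)),
            .groups = "drop")

print(sum_df)
```

The starts_with("CMA") selector plays the same role as the grep() call in the aggregate answer, so new CMA columns would be picked up without editing the summarise() call.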
Answer 2 (score: 0)
A dplyr and lubridate solution, but one that automatically computes the sums for all the CMA columns.
library(dplyr)
library(lubridate)
library(tidyr)
d %>%
gather(key, value, -date) %>%
mutate(year = year(date), month = month(date)) %>%
select(-date) %>%
group_by(year, month, key) %>%
summarize(N = sum(value < 0.001)) %>%
spread(key, N)
# A tibble: 1 x 6
# Groups: year, month [1]
year month CMA0013 CMA0047 CMA0052 CMA0067
* <dbl> <dbl> <int> <int> <int> <int>
1 1975 10 6 0 0 6
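In recent tidyr releases gather() and spread() are marked as superseded in favour of pivot_longer() and pivot_wider(). A rough equivalent of the pipeline above, assuming tidyr 1.0.0 or later and dplyr 0.8.0 or later, might look like this. The data frame d is rebuilt from the question's sample rows and res is just an illustrative name.

```r
library(dplyr)
library(lubridate)
library(tidyr)

# Sample data copied from the question (six days in October 1975).
d <- data.frame(
  date    = as.Date("1975-10-01") + 0:5,
  CMA0013 = 0,
  CMA0047 = 0.012,
  CMA0052 = 0.078,
  CMA0067 = 0
)

res <- d %>%
  # Long format: one row per date and CMA column.
  pivot_longer(-date, names_to = "key", values_to = "value") %>%
  mutate(year = year(date), month = month(date)) %>%
  group_by(year, month, key) %>%
  # Count the values below the threshold in each group.
  summarise(N = sum(value < 0.001), .groups = "drop") %>%
  # Back to wide format: one column per CMA series.
  pivot_wider(names_from = key, values_from = N)

print(res)
```

The logic is unchanged; only the reshaping verbs differ.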