I have a dataset like this:
sum_col city scen model time_period chill_season
110.02 NY RCP_8 bcc 2076_2099 season_2085_2086
91.26 NY RCP_8 bcc 2076_2099 season_2086_2087
91.05 NY RCP_8 bcc 2076_2099 season_2087_2088
74.96 NY RCP_8 bcc 2076_2099 season_2088_2089
77.97 NY RCP_8 bcc 2076_2099 season_2089_2090
109.05 NY RCP_8 bcc 2076_2099 season_2090_2091
I want to cut the sum_col column and count how many times the values fall within each interval defined by bks = c(-300, seq(20, 75, 5), 300).
However, when I try the following:
result <- dt %>%
mutate(thresh_range = cut(sum_col, breaks = bks)) %>%
group_by(time_period, thresh_range, model, scen, city) %>%
summarize(no_years = n_distinct(chill_season, na.rm = FALSE)) %>%
data.table()
My result looks like this:
time_period thresh_range model scen city no_years
2076_2099 (70,75] bcc RCP_8 NY 1
2076_2099 (75,300] bcc RCP_8 NY 5
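So the intervals below 70, e.g. (20, 25), (25, 30), are not created (because there are no data rows in those intervals).
Is there any way to tell cut to return zero for those intervals?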
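Note again that a row like the following:
a_value_less_than_70_here NY RCP_8 bcc 2076_2099 chill_2076_2077
whose corresponding sum_col is less than 70, does not exist in the data. However, I wonder whether, for such nonexistent data, cut can create an NA or 0 to tell us that the temperature in NY, with these parameters, did not fall in the (20, 25) interval.
Most importantly, I want to know for how many years each city with the given parameters (model, scen, etc.) fell in each interval (20, 25), (25, 30), etc. If any other suggestion works, that would be great too.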
Answer 0 (score: 2)
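You can use the complete function from the tidyr package to create NA rows for the missing combinations of data. Because cut returns a factor, complete expands over all of its levels, including the intervals that contain no rows: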
library(dplyr)
library(tidyr)
result <- dt %>%
mutate(thresh_range = cut(sum_col, breaks = bks)) %>%
complete(time_period, thresh_range, model, scen, city) %>%
group_by(time_period, thresh_range, model, scen, city) %>%
summarize(no_years = n_distinct(chill_season, na.rm = TRUE))
result
# # A tibble: 13 x 6
# # Groups: time_period, thresh_range, model, scen [?]
# time_period thresh_range model scen city no_years
# <chr> <fct> <chr> <chr> <chr> <int>
# 1 2076_2099 (-300,20] bcc RCP_8 NY 0
# 2 2076_2099 (20,25] bcc RCP_8 NY 0
# 3 2076_2099 (25,30] bcc RCP_8 NY 0
# 4 2076_2099 (30,35] bcc RCP_8 NY 0
# 5 2076_2099 (35,40] bcc RCP_8 NY 0
# 6 2076_2099 (40,45] bcc RCP_8 NY 0
# 7 2076_2099 (45,50] bcc RCP_8 NY 0
# 8 2076_2099 (50,55] bcc RCP_8 NY 0
# 9 2076_2099 (55,60] bcc RCP_8 NY 0
# 10 2076_2099 (60,65] bcc RCP_8 NY 0
# 11 2076_2099 (65,70] bcc RCP_8 NY 0
# 12 2076_2099 (70,75] bcc RCP_8 NY 1
# 13 2076_2099 (75,300] bcc RCP_8 NY 5
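Two details in the code above may be worth checking in isolation. The following is only a minimal sketch, using the six sum_col values shown in the question; the variable names counts, n_with_na and n_without_na are introduced here for illustration. It shows that the factor returned by cut already carries all 13 interval levels (so base R's table reports zeros for the empty ones), and why na.rm was switched to TRUE: the rows added by complete have an NA chill_season, which n_distinct would otherwise count as one distinct value.

```r
library(dplyr)

# Sample values copied from the sum_col column in the question
sum_col <- c(110.02, 91.26, 91.05, 74.96, 77.97, 109.05)
bks <- c(-300, seq(20, 75, 5), 300)

# cut() returns a factor that keeps every interval as a level,
# so table() reports a zero count for intervals without data
counts <- table(cut(sum_col, breaks = bks))
print(counts)

# Rows created by complete() have chill_season = NA.
# n_distinct() treats NA as a value unless na.rm = TRUE.
n_with_na    <- n_distinct(NA_character_)                # NA is counted
n_without_na <- n_distinct(NA_character_, na.rm = TRUE)  # NA is dropped
```

With na.rm = FALSE, as in the question's original call, the completed rows would presumably show 1 instead of 0 for the empty intervals, which is why the answer changes that argument.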