Using cut in dplyr

Time: 2015-12-26 20:25:56

Tags: r dplyr

I am trying to recode a for loop as dplyr code. The error I get is:

Error: 'breaks' are not unique

How can I make cut in dplyr behave the same way as cut in the for loop?

dput:

df <- structure(list(
  fyear = c(1970, 1970, 1970, 1970, 1970, 1970, 1970, 1970, 1970, 1970,
            1970, 1970, 1970, 1970, 1970, 1970, 1970, 1970, 1970, 1970),
  BEME = c(0.39713747645951, 0.548988782444936, 0.537154930871343,
           1.89357008340059, 1.66945262543448, 0.969181836638018,
           1.09989952916609, 0.858308443214104, 0.292175536881419,
           0.684685677549708, 0.338422675433708, 3.02671555788371,
           0.422643864469658, 0.805317430736738, 0.529954031556715,
           0.617716486520065, 0.911576593365635, 0.4131850675139,
           1.16211278792693, 2.13177678851802),
  exchg = c(11L, 11L, 11L, 11L, 11L, 11L, 11L, 11L, 12L, 12L,
            12L, 11L, 11L, 12L, 11L, 12L, 19L, 11L, 11L, 11L)),
  .Names = c("fyear", "BEME", "exchg"),
  class = c("tbl_df", "data.frame"),
  row.names = c(NA, -20L))

for loop:

fiscalyear <- unique(df$fyear)  # assumed; the question does not show how fiscalyear is defined
for (i in seq_along(fiscalyear)) {
  year_rows <- which(df$fyear == fiscalyear[i])                   # all rows of this year
  ref_rows  <- which(df$fyear == fiscalyear[i] & df$exchg == 11)  # rows that define the breaks
  df$LMH[year_rows] <- cut(df$BEME[year_rows],
                           breaks = quantile(df$BEME[ref_rows], c(0, 0.3, 0.7, 1)),
                           labels = FALSE)
}

> head(df)
Source: local data frame [6 x 4]

  fyear      BEME exchg   LMH
  (dbl)     (dbl) (int) (int)
1  1970 0.3971375    11    NA
2  1970 0.5489888    11     2
3  1970 0.5371549    11     2
4  1970 1.8935701    11     3
5  1970 1.6694526    11     3
6  1970 0.9691818    11     2

dplyr code:

[code not preserved]

1 answer:

Answer 0: (score: 3)

For the dplyr code, I think you want to replace

quantile(df$BEME & df$exchg, c(0,0.3,0.7,1))

with

quantile(BEME, c(0,0.3,0.7,1))

Final code:

newdat <- df %>%
  group_by(fyear) %>%
  filter(exchg == 11) %>%   # keeps only the exchg == 11 rows; all other rows are dropped
  mutate(LMH = cut(BEME, breaks = quantile(BEME, c(0, 0.3, 0.7, 1)), labels = FALSE))
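
One thing to be aware of: the for loop in the question computes the breaks from the exchg == 11 rows only but assigns a bin to every row of the year, whereas the filter() step above removes the other rows entirely. If the goal is to reproduce the loop exactly, a possible alternative is to subset inside quantile() instead of filtering. The sketch below uses a small made-up data frame (toy, with invented values, not the question's data) and shows that idea. It also shows that `BEME & exchg` is a logical vector, which is probably why the original breaks were not unique.

```r
library(dplyr)

# Hypothetical toy data (invented values, not taken from the question)
toy <- data.frame(
  fyear = c(1970, 1970, 1970, 1970, 1970, 1970),
  BEME  = c(0.2, 0.4, 0.6, 0.8, 1.0, 5.0),
  exchg = c(11L, 11L, 11L, 11L, 11L, 12L)
)

# `&` is a logical AND: every non-zero number is coerced to TRUE,
# so this is a vector of TRUEs rather than a subset of BEME.
collapsed <- toy$BEME & toy$exchg

# Breaks come from the exchg == 11 rows of each year,
# but cut() is applied to all rows of that year, as in the loop.
result <- toy %>%
  group_by(fyear) %>%
  mutate(LMH = cut(BEME,
                   breaks = quantile(BEME[exchg == 11], c(0, 0.3, 0.7, 1)),
                   labels = FALSE)) %>%
  ungroup()
```

With this version all rows are kept. Values outside the range of the exchg == 11 breaks get NA, and so does the smallest exchg == 11 value, because cut() leaves out the lowest break unless include.lowest = TRUE is set. That is consistent with the NA in the first row of the loop output shown in the question.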