Convert values in multiple columns to quantiles in R

Time: 2017-12-20 22:39:15

Tags: r

I have a table.

Table:

Year    T_R         Nov_SMwk1    Nov_SMwk2    Nov_SMwk3
1998    T10S-R10W   30.53        35.82        28.35
1998    T10S-R11E   24.52        23.53        30.85
1998    T10S-R12E   20.52        22.56        21.36
1999    T10S-R10W   31.53        31.82        18.35
1999    T10S-R11E   23.42        21.43        10.45
1999    T10S-R12E   21.22        20.06        31.26
2000    T10S-R10W   41.53        41.82        28.35
2000    T10S-R11E   13.42        21.43        20.45
2000    T10S-R12E   31.12        25.06        36.26
2001    T10S-R10W   25.43        10.82        25.35
2001    T10S-R11E   33.40        22.43        26.45
2001    T10S-R12E   21.12        28.06        30.26

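For each year, I need to convert every value in each column to its quantile on a 0 to 100 scale. How can I achieve this in R?

Thanks in advance.

1 answer:

Answer 0: (score: 1)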

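Take a look at ?ecdf (the empirical cumulative distribution function), which can be used to estimate the quantile of each value.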
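As a quick illustration of why `ecdf` fits here: `ecdf(x)` returns a function, and evaluating that function at `x` itself gives, for each value, the fraction of observations less than or equal to it. Below is a minimal base-R sketch using the 1998 values of Nov_SMwk1 from the question; the variable names `x`, `Fn`, `p`, and `pct` are chosen only for this example.

```r
# 1998 values of Nov_SMwk1 from the question's table
x <- c(30.53, 24.52, 20.52)

# ecdf(x) builds the empirical cumulative distribution function of x
Fn <- ecdf(x)

# Evaluating it at x gives the share of values <= each element (between 0 and 1)
p <- Fn(x)

# Multiply by 100 to move to a 0 to 100 scale
pct <- p * 100
print(round(pct, 1))
```

Note that with this definition the smallest of n values maps to 100/n rather than 0, because each value counts itself.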

Your data:

data <- read.table(text="
Year    T_R         Nov_SMwk1    Nov_SMwk2    Nov_SMwk3
1998    T10S-R10W   30.53        35.82        28.35
1998    T10S-R11E   24.52        23.53        30.85
1998    T10S-R12E   20.52        22.56        21.36
1999    T10S-R10W   31.53        31.82        18.35
1999    T10S-R11E   23.42        21.43        10.45
1999    T10S-R12E   21.22        20.06        31.26
2000    T10S-R10W   41.53        41.82        28.35
2000    T10S-R11E   13.42        21.43        20.45
2000    T10S-R12E   31.12        25.06        36.26
2001    T10S-R10W   25.43        10.82        25.35
2001    T10S-R11E   33.40        22.43        26.45
2001    T10S-R12E   21.12        28.06        30.26", header = TRUE, stringsAsFactors = FALSE)

You can then compute the quantiles within each year, adding one new column per week:

library(dplyr)
data %>% 
  group_by(Year) %>% 
  mutate(Nov_SMwk1_quantile  = ecdf(Nov_SMwk1)(Nov_SMwk1)) %>% 
  mutate(Nov_SMwk2_quantile  = ecdf(Nov_SMwk2)(Nov_SMwk2)) %>% 
  mutate(Nov_SMwk3_quantile  = ecdf(Nov_SMwk3)(Nov_SMwk3))

Or, more compactly, with mutate_at(), which overwrites the original columns in place:

per_fun <- function(x){ecdf(x)(x)}
data %>% 
  group_by(Year) %>% 
  mutate_at(vars(Nov_SMwk1:Nov_SMwk3), .funs=per_fun)

This returns:

# A tibble: 12 x 5
# Groups:   Year [4]
#    Year       T_R Nov_SMwk1 Nov_SMwk2 Nov_SMwk3
#   <int>     <chr>     <dbl>     <dbl>     <dbl>
# 1  1998 T10S-R10W 1.0000000 1.0000000 0.6666667
# 2  1998 T10S-R11E 0.6666667 0.6666667 1.0000000
# 3  1998 T10S-R12E 0.3333333 0.3333333 0.3333333
# 4  1999 T10S-R10W 1.0000000 1.0000000 0.6666667
# 5  1999 T10S-R11E 0.6666667 0.6666667 0.3333333
# 6  1999 T10S-R12E 0.3333333 0.3333333 1.0000000
# 7  2000 T10S-R10W 1.0000000 1.0000000 0.6666667
# 8  2000 T10S-R11E 0.3333333 0.3333333 0.3333333
# 9  2000 T10S-R12E 0.6666667 0.6666667 1.0000000
#10  2001 T10S-R10W 0.6666667 0.3333333 0.3333333
#11  2001 T10S-R11E 1.0000000 0.6666667 0.6666667
#12  2001 T10S-R12E 0.3333333 1.0000000 1.0000000
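
The values above are on a 0 to 1 scale, while the question asks for 0 to 100. In dplyr 1.0.0 and later, `mutate_at()` is superseded by `across()`, so the same idea could be written roughly as follows. This is only a sketch, assuming dplyr >= 1.0.0 is installed; the helper `pct_rank` and the data frame name `small` are made up for this example, and only two years of the question's data are reproduced to keep it short.

```r
library(dplyr)

# Two years of the question's data, enough to show the per-year grouping
small <- data.frame(
  Year = c(1998, 1998, 1998, 1999, 1999, 1999),
  T_R = rep(c("T10S-R10W", "T10S-R11E", "T10S-R12E"), 2),
  Nov_SMwk1 = c(30.53, 24.52, 20.52, 31.53, 23.42, 21.22),
  Nov_SMwk2 = c(35.82, 23.53, 22.56, 31.82, 21.43, 20.06),
  Nov_SMwk3 = c(28.35, 30.85, 21.36, 18.35, 10.45, 31.26),
  stringsAsFactors = FALSE
)

# Hypothetical helper: empirical percentile of each value within x, scaled to 0-100
pct_rank <- function(x) ecdf(x)(x) * 100

result <- small %>%
  group_by(Year) %>%                               # percentiles are computed per year
  mutate(across(Nov_SMwk1:Nov_SMwk3, pct_rank)) %>% # overwrite the three week columns
  ungroup()

print(result)
```

As with the `mutate_at()` version, the week columns are replaced rather than duplicated. To keep the raw values alongside the percentiles, the `.names` argument of `across()` (for example `.names = "{.col}_quantile"`) could be added.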