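I am using the quantile function in R to compute the 90th, 75th, 50th, and 25th percentiles, but my colleague uses SAS PROC UNIVARIATE for the same calculation, and we get quite different results (for example, the 90th percentile from R is 47.36, but SAS reports 50.64). I am trying to figure out why. Can anyone give me some guidance?
quantile(c(43.55,41.30,39.40,40.93,38.74,39.97,45.38,41.48,45.01,42.03,44.71,43.42,45.83,43.44,37.84,50.64,53.16,45.95),prob = c(0.90, 0.10,0.75,0.50,0.25))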
data x;
input x;
datalines;
43.55
41.30
39.40
40.93
38.74
39.97
45.38
41.48
45.01
42.03
44.71
43.42
45.83
43.44
37.84
50.64
53.16
45.95
;
run;
proc univariate data=x noprint ;
var x;
output out=new p90=p90 p10=p10 q3=p75 median=p50 q1=p25 ;
run;
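Answer 0: (score: 1)
The default method in R is type 7 (linear interpolation between order statistics), whereas the SAS default is probably the empirical distribution function with averaging.
If you add the option type = 1 in R, you will get the same result as SAS.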
quantile(c(43.55,41.30,39.40,40.93,38.74,39.97,45.38,41.48,45.01,
42.03,44.71,43.42,45.83,43.44,37.84,50.64,53.16,45.95),
prob=c(0.90, 0.10, 0.75, 0.50, 0.25),
type = 1)
  90%   10%   75%   50%   25%
50.64 38.74 45.38 43.42 40.93
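To see where the gap comes from, the definitions can be worked through by hand. The following is a minimal Python sketch, not the R or SAS implementation: the function names `ecdf_quantile` and `linear_quantile` are made up for illustration, and the small tolerance used to detect an integer value of n*p is an assumption. It follows the usual descriptions of R's type 1 (inverse empirical CDF), type 2 (the same, with averaging at discontinuities), and type 7 (linear interpolation).

```python
import math

data = [43.55, 41.30, 39.40, 40.93, 38.74, 39.97, 45.38, 41.48, 45.01,
        42.03, 44.71, 43.42, 45.83, 43.44, 37.84, 50.64, 53.16, 45.95]


def ecdf_quantile(values, p, average=False):
    """Inverse empirical CDF quantile (hypothetical helper).

    average=False mirrors the description of R's type 1,
    average=True mirrors type 2 (averaging at discontinuities).
    """
    xs = sorted(values)
    n = len(xs)
    np_ = n * p
    j = math.floor(np_ + 1e-9)          # integer part; tolerance is an assumption
    on_step = abs(np_ - j) < 1e-9       # True when n*p is (numerically) an integer
    lower = xs[max(j, 1) - 1]           # j-th order statistic (1-based), clamped
    upper = xs[min(j + 1, n) - 1]       # (j+1)-th order statistic, clamped
    if not on_step:
        return upper
    return (lower + upper) / 2 if average else lower


def linear_quantile(values, p):
    """Linear interpolation between order statistics, as in R's type 7."""
    xs = sorted(values)
    h = (len(xs) - 1) * p               # fractional 0-based position
    lo = math.floor(h)
    hi = min(lo + 1, len(xs) - 1)
    return xs[lo] + (h - lo) * (xs[hi] - xs[lo])


for p in (0.90, 0.10, 0.75, 0.50, 0.25):
    print(p,
          round(linear_quantile(data, p), 2),
          round(ecdf_quantile(data, p), 2),
          round(ecdf_quantile(data, p, average=True), 2))
```

For the 90th percentile, interpolation lands 30% of the way from the 16th to the 17th sorted value (45.95 to 50.64), giving 47.36, while both empirical-CDF variants return the 17th value, 50.64. The two empirical-CDF variants differ on this data only at the median, where n*p = 9 is an integer: without averaging the result is 43.42, with averaging it is 43.43. If SAS is indeed averaging by default, type = 2 in R should be the closer match, so it is worth comparing the medians.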