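I have a data sample (say 1.9 4.8 3.9 4.7 2.3 4.6 3.9).
I want to distribute the data on a bell curve and assign each value a grade from 1 to 5. How can I do this using statistics?
(The top 20% would be graded 5, then 4, and so on.)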
Answer 0 (score: 3)
Here is the first part of your question:
x <- c(1.9,4.8,3.9,4.7,2.3,4.6,3.9)
sigma <- sd(x) # 1.175747
mu <- mean(x) # 3.728571
curve((1/(sigma*sqrt(2*pi)))*exp(-((x-mu)^2)/(2*sigma^2)),xlim=c(-1,9),ylab="density")
y <- (1/(sigma*sqrt(2*pi)))*exp(-((x-mu)^2)/(2*sigma^2))
points(x, y, col="red")
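As a side note, the density formula written out above is the normal density, which base R exposes as dnorm(). A minimal sketch, assuming the same sample as in the question, that checks the two agree and redraws the plot with the built-in (the names manual and builtin are just illustrative):

```r
x <- c(1.9, 4.8, 3.9, 4.7, 2.3, 4.6, 3.9)
mu <- mean(x)
sigma <- sd(x)

# Hand-written normal density, as in the answer
manual <- (1 / (sigma * sqrt(2 * pi))) * exp(-((x - mu)^2) / (2 * sigma^2))
# Built-in normal density from the stats package
builtin <- dnorm(x, mean = mu, sd = sigma)

# The two should match up to floating-point tolerance
all.equal(manual, builtin)

# Same plot as above, drawn with dnorm()
curve(dnorm(x, mean = mu, sd = sigma), xlim = c(-1, 9), ylab = "density")
points(x, builtin, col = "red")
```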
Second part
There may be an easier way to do this, but the following works:
p.quint <- qnorm(p = c(0, .2, .4, .6, .8, 1), mean = mu, sd = sigma)
names(p.quint) <- c(1:5, NA)
p.quint
#        1        2        3        4        5     <NA>
#     -Inf 2.739038 3.430699 4.026444 4.718105      Inf
# For each value of x, count how many items in p.quint are lower than it,
# use that count as the index into p.quint's names, and store it in x.quint
x.quint <- unlist(lapply(x, function(a) as.numeric(names(p.quint))[sum(a > p.quint)]))
cbind(x, x.quint)
#        x x.quint
# [1,] 1.9       1
# [2,] 4.8       5
# [3,] 3.9       3
# [4,] 4.7       4
# [5,] 2.3       1
# [6,] 4.6       4
# [7,] 3.9       3
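One possibly simpler route, sketched here rather than taken from the answer, is to pass the same normal-quantile breakpoints to cut(). With its default right-closed intervals, cut() should assign the same grades as the counting approach above. The names breaks and grade are illustrative:

```r
x <- c(1.9, 4.8, 3.9, 4.7, 2.3, 4.6, 3.9)

# Quintile boundaries of a normal distribution fitted to the sample
breaks <- qnorm(c(0, .2, .4, .6, .8, 1), mean = mean(x), sd = sd(x))

# cut() bins each value into (breaks[i], breaks[i + 1]];
# as.integer() on the resulting factor returns the bin index 1..5
grade <- as.integer(cut(x, breaks = breaks, labels = 1:5))

cbind(x, grade)
```

Note that these grades follow the fitted normal curve, so with a small or skewed sample the five grades will not necessarily each cover 20% of the observed values.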
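Previous answer to the second part
[This was written before the OP mentioned that the desired output should represent quintiles]
OK, I see what you mean. So let's do it this way: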
x <- c(1.9,4.8,3.9,4.7,2.3,4.6,3.9)
# sort x to simplify matters
x <- sort(x)
# center x around its mean
x.tr <- x - mean(x)
# Check range ; we want it to be 4 (5-1)
range(x.tr)[2] - range(x.tr)[1] # 2.9
# Stretch the centered data so that its range becomes 4
x.tr <- x.tr * 4/2.9
range(x.tr)[2] - range(x.tr)[1]
# [1] 4
# We also want the min to be 1
x.tr <- x.tr - (x.tr[1]-1)
mu <- mean(x.tr) # 3.522167
sigma <- sd(x.tr) # 1.62172
x <- x.tr
curve((1/(sigma*sqrt(2*pi)))*exp(-((x-mu)^2)/(2*sigma^2)),xlim=c(-1,9),ylab="density")
y.tr <- (1/(sigma*sqrt(2*pi)))*exp(-((x.tr-mu)^2)/(2*sigma^2))
points(x.tr, y.tr, col="blue")
You can now get scores from 1 to 5 on a normal distribution with the following parameters:
mu
# [1] 3.522167
sigma
# [1] 1.62172
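The rescaling steps above can also be collapsed into a single min-max transformation. The following is only a sketch: rescale_to_range is a hypothetical helper written for illustration, not a base R function, and it avoids hard-coding the observed range of 2.9.

```r
x <- c(1.9, 4.8, 3.9, 4.7, 2.3, 4.6, 3.9)

# Hypothetical helper: linearly map v onto the interval [lo, hi]
rescale_to_range <- function(v, lo = 1, hi = 5) {
  (v - min(v)) / diff(range(v)) * (hi - lo) + lo
}

x.tr <- rescale_to_range(sort(x))

range(x.tr)  # lower and upper end of the rescaled scores
mean(x.tr)   # should agree with mu above
sd(x.tr)     # should agree with sigma above
```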