Plotting a distribution in R from percentiles and their corresponding values

Time: 2018-06-27 13:12:50

Tags: r shiny distribution

I want to plot the SAT dataset found here:

https://blog.prepscholar.com/sat-percentiles-high-precision-2016

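My problem is that I am not sure whether there is a more elegant way to do this than what I am doing now. Currently, I convert the percentiles to Z-scores and then use the associated probabilities to create the distribution. My issues are: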

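  1. SAT scores do not appear to be normally distributed. They approximate a normal distribution but do not seem to follow one exactly, which rules out the tried-and-true rnorm(X, mean = 1000, sd = 200).
  2. I am not sure whether there is a better method than converting the percentiles to Z-scores, given that SAT scores do not seem to be normally distributed.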
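
One way to sidestep the normality assumption in points 1 and 2 might be to treat the percentile table itself as an empirical CDF. The following is only a minimal sketch under stated assumptions: the `score` and `pct` vectors are placeholder values generated from a Beta curve (they are not the PrepScholar data), and the names `cdf`, `dens`, and `pmf` are invented for this illustration. It uses base R's `splinefun()` with `method = "monoH.FC"` to interpolate the CDF monotonically and takes its first derivative as the density.

```r
# Placeholder percentile table (hypothetical, NOT the real SAT data):
# a Beta curve is used only to get a non-normal cumulative shape.
score <- seq(400, 1600, by = 10)
pct   <- pbeta((score - 400) / 1200, shape1 = 2.2, shape2 = 2.6)  # share at or below each score

# Monotone cubic interpolation of the empirical CDF.
cdf <- splinefun(score, pct, method = "monoH.FC")

# The derivative of the CDF is the density; no Z-scores are involved.
dens <- cdf(score, deriv = 1)

# Probability mass attached to each score step, from differences of the CDF.
pmf <- diff(c(0, pct))

plot(score, dens, type = "l", xlab = "SAT score", ylab = "Density")
```

With the real table, `pct` would hold the published cumulative percentiles divided by 100, and `pmf` could serve as the `prob` argument to `sample()` if simulated scores are still wanted.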

Any help would be greatly appreciated.

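My long-term plan is to create a Shiny app in which a person can move a slider along the distribution to see which percentile their score puts them in. But the first step is to actually work out how to produce an accurate distribution.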
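
As a rough idea of what such an app could look like, here is a minimal sketch, assuming the shiny package is installed. The percentile table is again a hypothetical placeholder rather than the real SAT data, and the names `percentile_of`, `pct_text`, and `dist_plot` are made up for this example. It uses `approxfun()` to map a score to a percentile and a `sliderInput()` to pick the score.

```r
library(shiny)

# Placeholder percentile table (hypothetical values, not the real SAT data).
score <- seq(400, 1600, by = 10)
pct   <- pbeta((score - 400) / 1200, shape1 = 2.2, shape2 = 2.6)

# Linear interpolation of the table: score -> percentile on a 0-100 scale.
percentile_of <- approxfun(score, 100 * pct, rule = 2)

ui <- fluidPage(
  sliderInput("score", "SAT score", min = 400, max = 1600, value = 1000, step = 10),
  textOutput("pct_text"),
  plotOutput("dist_plot")
)

server <- function(input, output) {
  # Report the percentile for the score chosen on the slider.
  output$pct_text <- renderText({
    sprintf("A score of %.0f is at about the %.1f percentile.",
            input$score, percentile_of(input$score))
  })
  # Draw the cumulative curve and mark the chosen score.
  output$dist_plot <- renderPlot({
    plot(score, 100 * pct, type = "l", xlab = "SAT score", ylab = "Percentile")
    abline(v = input$score, lty = 2)
  })
}

app <- shinyApp(ui, server)
# Launch interactively with: runApp(app)
```

Keeping the lookup in a plain function such as `percentile_of` means it can be checked outside the app before any UI is wired up.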

Here is the code in full:

    scores <- as.numeric(
      c(
        1600, 1593, 1587, 1580, 1573, 1567,
        1560, 1553, 1547, 1540, 1533, 1527,
        1520, 1513, 1507, 1500, 1493, 1487,
        1480, 1473, 1467, 1460, 1453, 1447,
        1440, 1433, 1427, 1420, 1413, 1407,
        1400, 1393, 1387, 1380, 1373, 1367,
        1360, 1353, 1347, 1340, 1333, 1327,
        1320, 1313, 1307, 1300, 1293, 1287,
        1280, 1273, 1267, 1260, 1253, 1247,
        1240, 1233, 1227, 1220, 1213, 1207,
        1200, 1193, 1187, 1180, 1173, 1167,
        1160, 1153, 1147, 1140, 1133, 1127,
        1120, 1113, 1107, 1100, 1093, 1087,
        1080, 1073, 1067, 1060, 1053, 1047,
        1040, 1033, 1027, 1020, 1013, 1007,
        1000, 0993, 0987, 0980, 0973, 0967,
        0960, 0953, 0947, 0940, 0933, 0927,
        0920, 0913, 0907, 0900, 0893, 0887,
        0880, 0873, 0867, 0860, 0853, 0847,
        0840, 0833, 0827, 0820, 0813, 0807,
        0800, 0793, 0787, 0780, 0773, 0767,
        0760, 0753, 0747, 0740, 0733, 0727,
        0720, 0713, 0707, 0700, 0693, 0687,
        0680, 0673, 0667, 0660, 0653, 0647,
        0640, 0633, 0627, 0620, 0613, 0607,
        0600, 0593, 0587, 0580, 0573, 0567,
        0560, 0553, 0547, 0540, 0533, 0527,
        0520, 0513, 0507, 0500, 0493, 0487,
        0480, 0473, 0467, 0460, 0453, 0447,
        0440, 0433, 0427, 0420, 0413, 0407,
        0400
      )
    )

    probs <- as.numeric(
      c(
        0.000665335, 0.001508711, 0.002067932, 0.002878073, 0.003976493, 0.005137125,
        0.006483906, 0.008169371, 0.010066473, 0.012051019, 0.014095858,
        0.016307247, 0.01851226, 0.020808986, 0.023364025, 0.02601564,
        0.028721866, 0.031560739, 0.034551519, 0.037733663, 0.041036965,
        0.044427406, 0.047928757, 0.051538726, 0.055292787, 0.059145645,
        0.063187376, 0.067367682, 0.071691308, 0.076255174, 0.080943422,
        0.085690142, 0.090464868, 0.095379534, 0.100511283, 0.105758358,
        0.111083521, 0.116571178, 0.122285078, 0.128123463, 0.13397113,
        0.139830118, 0.145699225, 0.151666124, 0.157793404, 0.164023115,
        0.170358704, 0.17676173, 0.183278673, 0.189955902, 0.196651135,
        0.203363202, 0.210071134, 0.216801722, 0.223562248, 0.230347853,
        0.237183732, 0.243934459, 0.250657665, 0.257436698, 0.264227682,
        0.270968969, 0.277599004, 0.284165141, 0.290648753, 0.297040829,
        0.303401865, 0.30964291, 0.315714288, 0.321696268, 0.3275473,
        0.33324613, 0.338771004, 0.344104024, 0.34924206, 0.354185931,
        0.358956451, 0.363504804, 0.36782396, 0.371962147, 0.375851065,
        0.37943538, 0.382754048, 0.38580143, 0.388568593, 0.391025026,
        0.393142417, 0.394947023, 0.39643261, 0.397582763, 0.398388967,
        0.398835641, 0.398933788, 0.398676885, 0.398065535, 0.397089484,
        0.395745285, 0.394041326, 0.391951596, 0.389478503, 0.386655445,
        0.383491062, 0.379943329, 0.376014793, 0.371758542, 0.367167906,
        0.362242796, 0.356970877, 0.351398549, 0.345499578, 0.33929925,
        0.332851353, 0.326122779, 0.31911459, 0.311858166, 0.304408822,
        0.296767961, 0.288895933, 0.280893084, 0.272865458, 0.264642129,
        0.256259902, 0.247931807, 0.239607, 0.231228792, 0.222875786,
        0.214544714, 0.206273084, 0.198003801, 0.189731661, 0.181671447,
        0.173826196, 0.16610093, 0.158499433, 0.151067478, 0.143751108,
        0.136558209, 0.129683746, 0.123040703, 0.11660098, 0.110455988,
        0.104559737, 0.098857524, 0.093254931, 0.087923914, 0.08286171,
        0.077907056, 0.073124914, 0.068583012, 0.064242767, 0.059980362,
        0.055906234, 0.052062062, 0.048298984, 0.044678669, 0.041258641,
        0.03796372, 0.034841037, 0.03185105, 0.028992313, 0.026330747,
        0.023870486, 0.0215867, 0.019418127, 0.017379035, 0.015436224,
        0.01358768, 0.011905384, 0.01032104, 0.008865584, 0.007564083,
        0.006380472, 0.005299104, 0.004317991, 0.003512008, 0.00284036,
        0.002268168, 0.001724469, 0.001327348, 0.001030367, 0.000112
      )
    )

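    # Pair each score with its sampling weight.
    data <- data.frame(scores, probs)

    # Draw scores with the given weights; sample() rescales prob internally,
    # so the weights do not need to sum to 1.
    data2 <- sample(data$scores, 10000000, replace = TRUE, prob = data$probs)

    # Kernel density estimate, smoothed with twice the default bandwidth.
    den <- density(data2, adjust = 2)

    plot(den, xlim = c(400, 1600))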
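
The resampling step above draws ten million values only to smooth them again. A lighter alternative, sketched here under assumptions, is to normalise the weights and plot them directly. The score grid appears to be reproducible with `round(seq(1600, 400, length.out = 181))`; the `weights` vector below is a hypothetical stand-in (a normal curve) used only so the example runs on its own, and in practice it would be replaced by `probs`.

```r
# Regenerate the score grid instead of typing it out (181 values, 1600 down to 400).
scores <- round(seq(1600, 400, length.out = 181))

# Stand-in weights (hypothetical); replace with the probs vector from the question.
weights <- dnorm(scores, mean = 1000, sd = 200)

# Normalise so the weights form a probability mass function over the grid.
pmf <- weights / sum(weights)

# Plot the mass function directly; no resampling or kernel smoothing needed.
plot(scores, pmf, type = "h", xlim = c(400, 1600),
     xlab = "SAT score", ylab = "Probability")

# Implied cumulative percentile for each score (scores run from high to low).
percentile <- rev(cumsum(rev(pmf))) * 100
```

The `percentile` vector gives a quick sanity check: if it does not track the published percentiles, the weights are not describing the intended distribution.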

0 answers:

No answers yet