Question

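I have a data frame x with 2 columns, a and b. Column a is of class character and column b is of class numeric. I fitted a Gaussian distribution to b using the fitdist function (fitdistrplus package):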

library(fitdistrplus)
data.fit <- fitdist(x$b, "norm", method = "mle")

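I would like to extract the elements of column a that fall in the 5% right tail of the fitted Gaussian distribution. I am not sure how to proceed, because my knowledge of fitting distributions is limited. Do I need to keep the elements of column a for which b is greater than the 95% value? Or does the fit imply that new values have been created for each value in b, and I should use those instead?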

Thanks

Answer 1

By calling unclass(data.fit), you can see all the parts that make up the data.fit object, which include:

$estimate
     mean        sd 
0.1125554 1.2724377

which means you can access the estimated mean and standard deviation via:

data.fit$estimate['sd']
data.fit$estimate['mean']

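To calculate the 95th percentile of the fitted distribution (the cutoff above which the upper 5% tail lies), you can use the qnorm() function (q stands for quantile, BTW), like so: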

threshold <- 
    qnorm(p = 0.95,
          mean=data.fit$estimate['mean'],
          sd=data.fit$estimate['sd'])
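
A quick sanity check on that cutoff, as a minimal sketch: it assumes the two estimates printed above and hard-codes them in a hypothetical vector est rather than reading them from data.fit. It confirms that the mass above the cutoff under the fitted normal is 5%, and that the cutoff is the same as mean + z * sd.

```r
# Estimates copied from the $estimate output shown above (assumption:
# in real use you would read them from data.fit$estimate instead)
est <- c(mean = 0.1125554, sd = 1.2724377)

# 95th percentile of the fitted normal: the cutoff for the upper 5% tail
threshold <- qnorm(p = 0.95, mean = est[["mean"]], sd = est[["sd"]])

# Probability mass above the cutoff under the fitted normal
upper_tail <- pnorm(threshold, mean = est[["mean"]], sd = est[["sd"]],
                    lower.tail = FALSE)

# The same cutoff written as mean + z * sd, with z the standard normal quantile
threshold_alt <- est[["mean"]] + qnorm(0.95) * est[["sd"]]
```

Using [[ ]] instead of [ ] drops the element name, so threshold comes back as a plain number rather than a value labelled "mean".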

You can then subset your data.frame x like this:

x[x$b > threshold,  # an indicator of the rows to return
  'a']              # the column to return
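
On the second half of the question: the fit does not create new values for b. It only estimates two parameters, and the original b values are compared against a cutoff derived from them. Here is a rough end-to-end sketch on made-up data, in base R only. The toy data frame and the names mu_hat, sd_hat, in_tail and in_tail_emp are assumptions for illustration. It uses the closed-form maximum-likelihood estimates for a normal distribution in place of fitdist; I would expect these to agree closely with what fitdist reports, but I have not checked that here.

```r
set.seed(1)  # toy data only, not the asker's data
x <- data.frame(a = paste0("id", 1:200),
                b = rnorm(200),
                stringsAsFactors = FALSE)

# Closed-form maximum-likelihood estimates for a normal distribution
mu_hat <- mean(x$b)
sd_hat <- sqrt(mean((x$b - mu_hat)^2))  # divides by n, not n - 1

# Cutoff for the upper 5% tail of the fitted normal
threshold <- qnorm(0.95, mean = mu_hat, sd = sd_hat)

# Elements of a whose original b value lies above the cutoff
in_tail <- x$a[x$b > threshold]

# Alternative that skips the fit: the empirical 95th percentile of b
emp_threshold <- quantile(x$b, probs = 0.95, names = FALSE)
in_tail_emp <- x$a[x$b > emp_threshold]
```

The two selections can differ: the fitted cutoff depends on how well the normal describes b, while the empirical cutoff always keeps roughly the top 5% of rows.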