Question

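I am trying to simulate a large number of distributions in R by generating quantiles for different distribution parameters. I want to create a dataset containing many combinations of these parameters. For example (using the normal distribution):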

df<-data.frame(matrix(ncol=104,nrow=2))
colnames(df)<-c(as.character(seq(0,1,0.01)),"type","mean","sd")

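This gives me a data frame with 101 columns for the quantiles from 0 to 1 in steps of 0.01, plus three more columns, "type", "mean" and "sd" (the only parameters when using the normal distribution).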

Now let's generate the quantiles for two members of the normal distribution family:

qnorm.0.1<-qnorm(seq(0,1,0.01),0,1) #normal distribution / mean=0 /sd=1
qnorm.0.2<-qnorm(seq(0,1,0.01),0,2) #normal distribution / mean=0 /sd=2

Now I can fill my data frame with the two vectors:

df[1,]<-c(qnorm.0.1,"normal","0","1")
df[2,]<-c(qnorm.0.2,"normal","0","2")

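This gives me the format I need. However, when I try to create a large dataset with many parameter combinations (e.g. every combination of means from 1 to 10000 and sds from 1 to 10000), I will have to come up with a way to automate this process. Any help is appreciated.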

Thanks!

Answer 1

Maybe this helps:

library(data.table)
## Generate parameters
param <- 0:9
## Generate all pairwise combinations of the parameters
cb <- combn(param, 2, simplify = FALSE)
n <- length(cb)
## Compute the quantiles: x[1] is the mean, x[2] is the sd
DT <- lapply(cb, function(x){data.table(rbind(qnorm(seq(0, 1, 0.01), x[1], x[2])))})
DT <- rbindlist(DT)
DT[, `:=`(type=rep("normal",n),
          mean = unlist(cb)[seq(1, n*2, 2) ],
          sd = unlist(cb)[seq(2, n*2, 2) ])]
## Change names
setnames(DT, c(paste0("qnorm", seq(0, 1, 0.01)), "type", "mean", "sd"))
dim(DT)
[1]  45 104

head(DT[,95:104])
   qnorm0.94 qnorm0.95 qnorm0.96 qnorm0.97 qnorm0.98 qnorm0.99 qnorm1   type mean sd
1:  1.554774  1.644854  1.750686  1.880794  2.053749  2.326348    Inf normal    0  1
2:  3.109547  3.289707  3.501372  3.761587  4.107498  4.652696    Inf normal    0  2
3:  4.664321  4.934561  5.252058  5.642381  6.161247  6.979044    Inf normal    0  3
4:  6.219094  6.579415  7.002744  7.523174  8.214996  9.305391    Inf normal    0  4
5:  7.773868  8.224268  8.753430  9.403968 10.268745 11.631739    Inf normal    0  5
6:  9.328642  9.869122 10.504116 11.284762 12.322493 13.958087    Inf normal    0  6

Of course, you can increase the number of parameters or change the distribution function, but the result will be similar.
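
Note that combn() on a sorted vector only yields pairs in which the first value is smaller than the second, whereas the question asks for every mean/sd combination. A minimal base R sketch of the full grid might look like the following. The names `probs`, `grid`, `q` and `result` and the small parameter ranges are assumptions chosen for illustration, not part of the original answer.

```r
# Sketch: every mean/sd combination via expand.grid(), base R only.
# All object names and the small ranges below are illustrative assumptions.
probs <- seq(0, 1, 0.01)

# One row per (mean, sd) pair; sd starts at 1 so it is always positive.
grid <- expand.grid(mean = 0:3, sd = 1:4)

# mapply() calls qnorm() once per grid row and returns one column per call,
# so transpose to get one row per parameter combination.
q <- t(mapply(function(m, s) qnorm(probs, mean = m, sd = s),
              grid$mean, grid$sd))
colnames(q) <- paste0("q", probs)

# Quantiles stay numeric; type, mean and sd are separate typed columns.
result <- data.frame(q, type = "normal", grid, check.names = FALSE)
dim(result)  # 16 rows (4 means x 4 sds), 104 columns
```

At the scale mentioned in the question (10000 means by 10000 sds) the full grid would have 10^8 rows, which at 101 doubles per row is roughly 80 GB, so it would probably have to be generated and written out in chunks rather than held in memory at once.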

Automatically use a large number of parameter combinations as input to a function?
