Question

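I have two datasets: m and s. The first dataset contains the variables Frequency, p1, p2 and p3.

The second dataset contains the regression type, the mean and the sample size. Its columns are named z, mean and samplesize, respectively.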

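I need to add new columns to the first dataset m, as follows:

The first column, m$reg1, should be m$p1 multiplied by the s$samplesize value in the row where s$z == 'Regression1'.
The second column, m$reg2, should be m$p2 multiplied by the s$samplesize value in the row where s$z == 'Regression2'.
The third column, m$reg3, should be m$p3 multiplied by the s$samplesize value in the row where s$z == 'Regression3'.

I would like to know how to write a loop that computes these new columns in the m dataset.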

The datasets can be created with the following code:

Frequency<-seq(1,27,1)
p1<-seq(2,28,1)
p2<-seq(10,36,1)
p3<-seq(0,26,1)
m<-data.frame(Frequency,p1,p2,p3)

z<-c('Regression1','Regression2','Regression3','Regression4')
mean<-c(2,28,1,17)
samplesize<-c(10,20,30,40)
s<-data.frame(z,mean,samplesize)

Answer 1

If I understand your question correctly, you don't need a loop. Just do:

 m$reg1 <- m$p1*s$samplesize[s$z=="Regression1"]
 m$reg2 <- m$p2*s$samplesize[s$z=="Regression2"]
 m$reg3 <- m$p3*s$samplesize[s$z=="Regression3"]
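
To see why no loop is needed, it may help to split one of these lines into its two steps. The sketch below rebuilds the question's data and uses the illustrative variable names n1 and reg1_values, which are not part of the original answer:

```r
# Recreate the example data from the question
m <- data.frame(Frequency = seq(1, 27, 1), p1 = seq(2, 28, 1),
                p2 = seq(10, 36, 1), p3 = seq(0, 26, 1))
s <- data.frame(z = c("Regression1", "Regression2", "Regression3", "Regression4"),
                mean = c(2, 28, 1, 17),
                samplesize = c(10, 20, 30, 40))

# Step 1: the logical subset picks the single samplesize for Regression1
n1 <- s$samplesize[s$z == "Regression1"]
n1                     # 10

# Step 2: R recycles that single number across every row of m$p1
reg1_values <- m$p1 * n1
head(reg1_values, 3)   # 20 30 40
```

This relies on each label appearing exactly once in s$z. If a label were duplicated, the subset would return more than one value and the multiplication would recycle them in a way that is probably not intended.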

Answer 2

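Use the same principle that we applied in this answer. First, build the names that select the right column or row from each table, then do the calculation and write the result into a new column whose name is built the same way.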

# custom function that calculates column values 
add.col <- function(i){
    # name in the s$z that defines the correct row
    reg <- paste0("Regression", i)
    # name of the m column
    p <- paste0("p", i)
    # multiply the named column from m with respective samplesize in s
    return(m[, p] * s$samplesize[s$z == reg])
}

# loop through all indices
for(i in 1:3){
    # create a new column with the compound name and fill it with appropriate values
    m[, paste0("reg", i)] <- add.col(i = i)
}
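
The function above reads m and s from the global environment. If you would rather pass them in explicitly, a variant could look like the sketch below. The name add_reg_cols and its idx argument are made up for this illustration, and the sketch assumes the columns are named p1, p2, ... and the labels Regression1, Regression2, ... as in the question:

```r
# Recreate the example data from the question
m <- data.frame(Frequency = seq(1, 27, 1), p1 = seq(2, 28, 1),
                p2 = seq(10, 36, 1), p3 = seq(0, 26, 1))
s <- data.frame(z = c("Regression1", "Regression2", "Regression3", "Regression4"),
                mean = c(2, 28, 1, 17),
                samplesize = c(10, 20, 30, 40))

# Hypothetical helper: same calculation, but m and s are arguments
add_reg_cols <- function(m, s, idx = 1:3) {
    for (i in idx) {
        reg <- paste0("Regression", i)   # label that selects the row of s
        p   <- paste0("p", i)            # name of the source column in m
        m[[paste0("reg", i)]] <- m[[p]] * s$samplesize[s$z == reg]
    }
    m  # return the modified copy; the caller's m is left unchanged
}

m2 <- add_reg_cols(m, s)
names(m2)
```

Because R copies the data frame inside the function, the result has to be assigned (here to m2) for the new columns to be kept.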

Answer 3

If you want to use a for loop, this also works:

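desired_col <- c(2, 3, 4) # column indices in m; this can be any selection

for (i in desired_col) {
    k <- match(i, desired_col)  # position in desired_col, used as the row of s
    m[[paste0("reg", k)]] <- m[, i] * s[k, 3]  # column 3 of s is samplesize
}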
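
Indexing s by position assumes that its rows are in the same order as the selected columns of m. A possible alternative, sketched here with the illustrative names p_cols, regs and sizes, looks the sample sizes up by label with match() and builds all the columns in one assignment:

```r
# Recreate the example data from the question
m <- data.frame(Frequency = seq(1, 27, 1), p1 = seq(2, 28, 1),
                p2 = seq(10, 36, 1), p3 = seq(0, 26, 1))
s <- data.frame(z = c("Regression1", "Regression2", "Regression3", "Regression4"),
                mean = c(2, 28, 1, 17),
                samplesize = c(10, 20, 30, 40))

p_cols <- c("p1", "p2", "p3")                                # source columns in m
regs   <- c("Regression1", "Regression2", "Regression3")     # matching labels in s$z

# Look the sample sizes up by label, so the row order of s does not matter
sizes <- s$samplesize[match(regs, s$z)]

# Multiply each source column by its sample size and add the results to m
m[paste0("reg", seq_along(p_cols))] <-
    lapply(seq_along(p_cols), function(k) m[[p_cols[k]]] * sizes[k])

head(m, 2)
```

If a label in regs is missing from s$z, match() returns NA and the corresponding new column will be all NA, which makes the mismatch easy to spot.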
