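I currently use 40 lines of code to create and calculate new columns when certain conditions are met. I am trying to come up with a way to turn all of this code into a loop or a function to simplify my script.
Here is some example data: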
set.seed(1)
dat <- data.frame(sc1 = sample(LETTERS[1:6], 15, replace = TRUE),
                  sc1_n = sample(1:100, 15),
                  sc2 = sample(LETTERS[1:6], 15, replace = TRUE),
                  sc2_n = sample(1:100, 15),
                  sc3 = sample(LETTERS[1:6], 15, replace = TRUE),
                  sc3_n = sample(1:100, 15),
                  ec1 = sample(LETTERS[1:6], 15, replace = TRUE),
                  ec1_n = sample(1:100, 15),
                  ec2 = sample(LETTERS[1:6], 15, replace = TRUE),
                  ec2_n = sample(1:100, 15),
                  ec3 = sample(LETTERS[1:6], 15, replace = TRUE),
                  ec3_n = sample(1:100, 15),
                  area = sample(1:100, 15))
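I iterate over each unique value of sc1 (A-F, n = 6), sc2 (A-F, n = 6), and sc3 (A-F, n = 6) to calculate a value, then add those values together to create another column named A, B, C, D, E, or F, with an 's' appended to signify that it is the value for s rather than e. I repeat the same process for ec1, ec2, and ec3 once sc1, sc2, and sc3 are done.
Here are the 40 lines of code I currently use to generate the required columns and values: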
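To make the arithmetic concrete, here is a minimal sketch for a single row. The values are made up for illustration and are not drawn from dat; the point is only that a comparison such as (sc == "A") acts as a 0/1 mask, so slots holding a different letter contribute nothing to the sum.

```r
# Toy values for one row (hypothetical, not taken from dat)
sc   <- c(sc1 = "A", sc2 = "C", sc3 = "A")     # category code in each slot
sc_n <- c(sc1_n = 20, sc2_n = 50, sc3_n = 30)  # the matching _n values
area <- 10

# TRUE/FALSE is treated as 1/0 in arithmetic, so only the slots
# whose code matches the letter are kept in each sum.
As <- sum(sc_n * 0.01 * area * (sc == "A"))  # slots 1 and 3: 2 + 3
Cs <- sum(sc_n * 0.01 * area * (sc == "C"))  # slot 2 only
Bs <- sum(sc_n * 0.01 * area * (sc == "B"))  # no slot matches, so 0
```

This is the same calculation as the A1s + A2s + A3s pattern in the question, collapsed into one expression per letter.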
dat <- transform(dat,A1s = (sc1_n * 0.01) * (area) * (sc1 == "A")) #create new column A1s, and calculates a number if sc1=='A'
dat <- transform(dat,A2s = (sc2_n * 0.01) * (area) * (sc2 == "A")) #create new column A2s, and calculates a number if sc2=='A'
dat <- transform(dat,A3s = (sc3_n * 0.01) * (area) * (sc3 == "A")) #same as above, except A3s and where sc3='A'
dat <- transform(dat,As = A1s + A2s + A3s) #I really don't need A1s, A2s, or A3s, except to calculate this column, As
dat <- transform(dat,B1s = (sc1_n * 0.01) * (area) * (sc1 == "B"))
dat <- transform(dat,B2s = (sc2_n * 0.01) * (area) * (sc2 == "B"))
dat <- transform(dat,B3s = (sc3_n * 0.01) * (area) * (sc3 == "B"))
dat <- transform(dat,Bs = B1s + B2s + B3s)
dat <- transform(dat,C1s = (sc1_n * 0.01) * (area) * (sc1 == "C"))
dat <- transform(dat,C2s = (sc2_n * 0.01) * (area) * (sc2 == "C"))
dat <- transform(dat,C3s = (sc3_n * 0.01) * (area) * (sc3 == "C"))
dat <- transform(dat,Cs = C1s + C2s + C3s)
dat <- transform(dat,D1s = (sc1_n * 0.01) * (area) * (sc1 == "D"))
dat <- transform(dat,D2s = (sc2_n * 0.01) * (area) * (sc2 == "D"))
dat <- transform(dat,D3s = (sc3_n * 0.01) * (area) * (sc3 == "D"))
dat <- transform(dat,Ds = D1s + D2s + D3s)
dat <- transform(dat,E1s = (sc1_n * 0.01) * (area) * (sc1 == "E"))
dat <- transform(dat,E2s = (sc2_n * 0.01) * (area) * (sc2 == "E"))
dat <- transform(dat,E3s = (sc3_n * 0.01) * (area) * (sc3 == "E"))
dat <- transform(dat,Es = E1s + E2s + E3s)
dat <- transform(dat,F1s = (sc1_n * 0.01) * (area) * (sc1 == "F"))
dat <- transform(dat,F2s = (sc2_n * 0.01) * (area) * (sc2 == "F"))
dat <- transform(dat,F3s = (sc3_n * 0.01) * (area) * (sc3 == "F"))
dat <- transform(dat,Fs = F1s + F2s + F3s)
dat <- transform(dat,A1e = (ec1_n * 0.01) * (area) * (ec1 == "A"))
dat <- transform(dat,A2e = (ec2_n * 0.01) * (area) * (ec2 == "A"))
dat <- transform(dat,A3e = (ec3_n * 0.01) * (area) * (ec3 == "A"))
dat <- transform(dat,Ae = A1e + A2e + A3e)
dat <- transform(dat,B1e = (ec1_n * 0.01) * (area) * (ec1 == "B"))
dat <- transform(dat,B2e = (ec2_n * 0.01) * (area) * (ec2 == "B"))
dat <- transform(dat,B3e = (ec3_n * 0.01) * (area) * (ec3 == "B"))
dat <- transform(dat,Be = B1e + B2e + B3e)
dat <- transform(dat,C1e = (ec1_n * 0.01) * (area) * (ec1 == "C"))
dat <- transform(dat,C2e = (ec2_n * 0.01) * (area) * (ec2 == "C"))
dat <- transform(dat,C3e = (ec3_n * 0.01) * (area) * (ec3 == "C"))
dat <- transform(dat,Ce = C1e + C2e + C3e)
dat <- transform(dat,D1e = (ec1_n * 0.01) * (area) * (ec1 == "D"))
dat <- transform(dat,D2e = (ec2_n * 0.01) * (area) * (ec2 == "D"))
dat <- transform(dat,D3e = (ec3_n * 0.01) * (area) * (ec3 == "D"))
dat <- transform(dat,De = D1e + D2e + D3e)
dat <- transform(dat,E1e = (ec1_n * 0.01) * (area) * (ec1 == "E"))
dat <- transform(dat,E2e = (ec2_n * 0.01) * (area) * (ec2 == "E"))
dat <- transform(dat,E3e = (ec3_n * 0.01) * (area) * (ec3 == "E"))
dat <- transform(dat,Ee = E1e + E2e + E3e)
dat <- transform(dat,F1e = (ec1_n * 0.01) * (area) * (ec1 == "F"))
dat <- transform(dat,F2e = (ec2_n * 0.01) * (area) * (ec2 == "F"))
dat <- transform(dat,F3e = (ec3_n * 0.01) * (area) * (ec3 == "F"))
dat <- transform(dat,Fe = F1e + F2e + F3e)
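I am sure there must be a clever and efficient way to do this by creating a list and a loop, or at least a function, but I have been searching and cannot find one.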
-al
Answer 0 (score: 1)
How about a transformation like this:
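for (p in c("s", "e")) {
  g <- dat[, paste0(p, "c", 1:3)]        # category columns, e.g. sc1, sc2, sc3
  n <- dat[, paste0(p, "c", 1:3, "_n")]  # matching _n columns, e.g. sc1_n
  for (x in LETTERS[1:6]) {              # all six groups, A through F
    # (g == x) is a 0/1 mask, so only matching slots contribute to the sum
    dat[, paste0(x, p)] <- rowSums(n * 0.01 * (g == x) * dat$area)
  }
}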
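Here we loop over the "s" and "e" prefixes and extract the subset of columns associated with each prefix. Next, we loop over all the groups and compute the row sums for each group. We try to use as much of the information stored in the column names as possible. This does not create the temporary columns you don't need (A1s, A2s, etc.).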
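Since the question also asks about a function, the same idea can be wrapped in one. The following is only a sketch: the name add_group_sums and its arguments are hypothetical, and it assumes the columns follow the naming pattern from the question (sc1, sc1_n, ..., ec3_n, area). It converts the column subsets to matrices first so that the element-wise arithmetic and rowSums operate on plain matrices, and then checks one result against the long form on a small made-up data frame.

```r
# Hypothetical helper: adds one summed column per (letter, prefix) pair.
add_group_sums <- function(df, prefixes = c("s", "e"),
                           groups = LETTERS[1:6], slots = 1:3) {
  for (p in prefixes) {
    codes <- as.matrix(df[, paste0(p, "c", slots)])        # e.g. sc1, sc2, sc3
    vals  <- as.matrix(df[, paste0(p, "c", slots, "_n")])  # e.g. sc1_n, ...
    for (x in groups) {
      # (codes == x) is a 0/1 mask; df$area is recycled down each column
      df[[paste0(x, p)]] <- unname(rowSums(vals * 0.01 * (codes == x) * df$area))
    }
  }
  df
}

# Small made-up data frame, used only to check the helper
toy <- data.frame(
  sc1 = c("A", "B"), sc1_n = c(10, 20),
  sc2 = c("A", "C"), sc2_n = c(30, 40),
  sc3 = c("B", "A"), sc3_n = c(50, 60),
  ec1 = c("F", "F"), ec1_n = c(5, 15),
  ec2 = c("A", "F"), ec2_n = c(25, 35),
  ec3 = c("F", "B"), ec3_n = c(45, 55),
  area = c(100, 200)
)
out <- add_group_sums(toy)

# Long form from the question, for the As column only
long_As <- with(toy, sc1_n * 0.01 * area * (sc1 == "A") +
                     sc2_n * 0.01 * area * (sc2 == "A") +
                     sc3_n * 0.01 * area * (sc3 == "A"))
same_As <- isTRUE(all.equal(out$As, long_As))
```

On the question's data the call would be dat <- add_group_sums(dat). Exposing groups and slots as arguments is a design choice for this sketch, so the letters or the number of slots can change without editing the loop body.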