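I couldn't find an answer in other posts, and where they cover similar topics I didn't understand the answers, since I'm relatively new to R and to programming in general. I'm working with the following survey output X (excerpt):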
A1B1 A1B2 A1B3 A1B4 A2B1 A2B2 A2B3 ...
-0.37014356 1.08841141 -0.126574243 -0.59169360 1.682673457 -0.427706432 -0.76091938 ...
3.03017573 1.39812421 0.243516558 -4.67181650 -0.378640756 2.039940436 -0.40785893 ...
3.50183121 1.51249433 -0.775449944 -4.23887560 -0.456911873 0.431838943 0.91108052 ...
...
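For every row of X, I want to calculate the range, i.e. the difference between the maximum and the minimum (diff(range(X[i,n:m])) with n:m equal to 1:4), of the first four columns (1:4), the second four (5:8) and the third four (9:12), and put the results into a second matrix with i rows and 3 columns.
E.g. for the first row and the first four columns it would be 1.08841141 + 0.59169360 = 1.68010501.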
To do this, I created a new matrix and tried to fill it with values:
newmatrix <- matrix(0,nrow(X),3)
newmatrix[1:nrow(X),1] <- for (i in (1:nrow(X))) {diff(range(X[i,1:4]))}
newmatrix[1:nrow(X),2] <- for (i in (1:nrow(X))) {diff(range(X[i,5:8]))}
newmatrix[1:nrow(X),3] <- for (i in (1:nrow(X))) {diff(range(X[i,9:12]))}
I get the following error:
Error in newmatrix[1:nrow(RBetas), 1] <- for (i in (1:nrow(RBetas))) { :
number of items to replace is not a multiple of replacement length
Thanks for your help!
Answer 0 (score: 0)
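Assuming the column blocks are defined by the first two characters of the column names, i.e. A1, A2, we can use substr to extract those two characters and use them as the index for split, which divides the columns into blocks. Then we can get the result either with apply together with range and diff, or with pmax and pmin.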
indx <- substr(colnames(df), 1,2)
If the grouping is based on position rather than on the column names, then this should also work:
indx <- (1:ncol(df)-1)%/%4 +1
res1 <- sapply(split(seq_len(ncol(df)), indx),
function(i) do.call(pmax,df[,i, drop=FALSE])-
do.call(pmin, df[,i, drop=FALSE]))
Or
res2 <- sapply(split(seq_len(ncol(df)), indx),
function(i) apply(df[,i, drop=FALSE], 1,
function(x) diff(range(x))) )
identical(res1, res2)
#[1] TRUE
res1
# A1 A2
#[1,] 1.680105 2.443593
#[2,] 7.701992 2.447799
#[3,] 7.740707 1.367992
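To make the grouping step easier to follow, here is a minimal sketch of what the two index vectors and the resulting split look like for the seven example column names. The names cols, by_name, by_pos and blocks are only illustrative and are not part of the answer above.

```r
# Column names from the example data (7 columns: four A1, three A2)
cols <- c("A1B1", "A1B2", "A1B3", "A1B4", "A2B1", "A2B2", "A2B3")

# Grouping by the first two characters of each name
by_name <- substr(cols, 1, 2)

# Grouping by position in blocks of 4; a trailing partial block is still kept
by_pos <- (seq_along(cols) - 1) %/% 4 + 1

# split() turns the index into a list of column positions per block
blocks <- split(seq_along(cols), by_name)

print(by_name)
print(by_pos)
str(blocks)
```

Both indices describe the same two blocks here; they would differ only if the column order did not follow the names.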
Or, using your code:
newmatrix <- matrix(0, nrow(df), 2) #here the example dataset is only 7 columns
for(i in (1:nrow(df))) newmatrix[i,1] <- diff(range(df[i,1:4]))
for(i in (1:nrow(df))) newmatrix[i,2] <- diff(range(df[i,5:7]))
newmatrix
# [,1] [,2]
#[1,] 1.680105 2.443593
#[2,] 7.701992 2.447799
#[3,] 7.740707 1.367992
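As for why the original attempt failed: a for loop in R evaluates to NULL, so the right-hand side of the assignment in the question was NULL rather than a vector of ranges. The sketch below illustrates this on toy values; the matrix x and the names res, m and failed are made up for the illustration, and the exact wording of the error may differ between R versions.

```r
# A for loop runs its body, but the value of the loop itself is NULL
res <- for (i in 1:3) i^2
is.null(res)

# Assigning that NULL into a matrix slice raises an error
m <- matrix(0, 3, 1)
failed <- inherits(try(m[1:3, 1] <- res, silent = TRUE), "try-error")

# Toy data: one row per observation
x <- rbind(c(1, 5, 2),
           c(0, -3, 4),
           c(2, 2, 2))

# One way around it: build the whole column with vapply(), which returns
# one number per row instead of NULL
m[, 1] <- vapply(seq_len(nrow(x)),
                 function(i) diff(range(x[i, ])),
                 numeric(1))
m
```

The loops above avoid the problem in a different way, by assigning to newmatrix[i, 1] inside the loop body.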
If you have multiple column blocks, you can try a double for loop:
lst <- split(seq_len(ncol(df)), indx) #keep the columns to group in a `list`
newmatrix <- matrix(0, nrow(df), length(lst)) #one column per block
for(i in 1:nrow(df)){
for(j in seq_along(lst)){
newmatrix[i,j] <- diff(range(df[i, lst[[j]]]))
}
}
newmatrix
# [,1] [,2]
#[1,] 1.680105 2.443593
#[2,] 7.701992 2.447799
#[3,] 7.740707 1.367992
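If the number of blocks is not known in advance, the same idea could be wrapped in a small helper. This is only a sketch under the assumption that blocks are defined by position; block_range and its size argument are hypothetical names, not something from the question.

```r
# Hypothetical helper: per-row range (max - min) for consecutive column blocks
block_range <- function(df, size = 4) {
  m <- as.matrix(df)
  # Positions 1..size form block 1, the next `size` form block 2, and so on
  lst <- split(seq_len(ncol(m)), (seq_len(ncol(m)) - 1) %/% size + 1)
  out <- matrix(0, nrow(m), length(lst))
  for (j in seq_along(lst)) {
    block <- m[, lst[[j]], drop = FALSE]
    out[, j] <- apply(block, 1, function(x) diff(range(x)))
  }
  out
}

# Example data copied from the question (3 rows, 7 columns)
df <- data.frame(
  A1B1 = c(-0.37014356, 3.03017573, 3.50183121),
  A1B2 = c(1.08841141, 1.39812421, 1.51249433),
  A1B3 = c(-0.126574243, 0.243516558, -0.775449944),
  A1B4 = c(-0.5916936, -4.6718165, -4.2388756),
  A2B1 = c(1.682673457, -0.378640756, -0.456911873),
  A2B2 = c(-0.427706432, 2.039940436, 0.431838943),
  A2B3 = c(-0.76091938, -0.40785893, 0.91108052)
)

result <- block_range(df)
round(result, 6)
```

The output matrix is sized with length(lst), so the helper does not need to be told how many blocks there are.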
Data:
df <- structure(list(A1B1 = c(-0.37014356, 3.03017573, 3.50183121),
A1B2 = c(1.08841141, 1.39812421, 1.51249433), A1B3 = c(-0.126574243,
0.243516558, -0.775449944), A1B4 = c(-0.5916936, -4.6718165,
-4.2388756), A2B1 = c(1.682673457, -0.378640756, -0.456911873
), A2B2 = c(-0.427706432, 2.039940436, 0.431838943), A2B3 = c(-0.76091938,
-0.40785893, 0.91108052)), .Names = c("A1B1", "A1B2", "A1B3",
"A1B4", "A2B1", "A2B2", "A2B3"), class = "data.frame", row.names = c(NA,
-3L))