I can't solve the following problem concerning the data frame `annotations` (simplified here by limiting the number of columns).
require(irr)
# data
annotations <- read.table(text = "Obj1 Obj2 Obj3
Rater1 a b c
Rater2 a b b
Rater3 a b c", header = TRUE, stringsAsFactors = FALSE)
I want to apply the function `agree` from the irr package to all combinations (not permutations) of rows, resulting in the following.
Agreement rater 1-2: 67%
Agreement rater 1-3: 100%
Agreement rater 2-3: 67%
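I need to run a function on all combinations of rows, and the function needs to access multiple/all columns.
I have worked out part of the answer: I have generated a list of combinations by running `combn(rownames(annotations), 2)`, but I don't see how to use this list without writing an inefficient for loop.
I have tried apply, as in `apply(annotations, 1, agree)`, but I can only get that to work on a single row, not on the combinations mentioned above.
Does anyone have an idea how to proceed?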
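UPDATE: Based on your suggestions, the following solution works. (I used `kappa2` from the irr package instead of `agree`, but the solution to the main problem stays the same.)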
require(irr) #require the irr library for agreement calculations
annotations <- read.table(text = "Obj1 Obj2 Obj3
Rater1 a b c
Rater2 a b b
Rater3 a b c
Rater4 c a a", header = TRUE, stringsAsFactors = FALSE)
annotations <- t(annotations) #transpose annotations (rows become columns and vice versa)
# for each pair of columns (two raters), apply kappa2 and keep its value;
# combn() collects the values returned by FUN into a vector, so no
# assignment to kappa_list is needed inside the function
kappa_list <- combn(colnames(annotations), 2, FUN = function(x)
  kappa2(matrix(c(annotations[, x[1]], annotations[, x[2]]), ncol = 2))$value)
kappa_list # display the list of values
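Answer 0 (score: 3)
You are almost there; you just need to `apply` over the result of `combn`. I don't know which function you are referring to, but this should work the same way once you plug in your function.
First, save the result as a list, since it is easier to add names that way (which I build by pasting the two entries together):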
toCheck <- combn(rownames(annotations), 2, simplify = FALSE)
names(toCheck) <-
sapply(toCheck, paste, collapse = " - ")
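Then, use `sapply` to work through your combinations. Here I use `mean` for the comparison, but use whatever you need. If you return more than one value per pair, use `lapply` and then work with the result to print it as desired.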
sapply(toCheck, function(x){
mean(annotations[x[1], ] == annotations[x[2], ])
})
Returns:
Rater1 - Rater2 Rater1 - Rater3 Rater2 - Rater3
      0.6666667       1.0000000       0.6666667
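As a hedged illustration of the `lapply` route mentioned above, the sketch below returns two values per pair (the number of matching objects and the proportion) and stacks them into one data frame. The helper name `compare_pair` and the column names `matches` and `proportion` are made up for this example; swap in whatever function you actually need.

```r
# Question's example data: one row per rater, one column per object
annotations <- read.table(text = "Obj1 Obj2 Obj3
Rater1 a b c
Rater2 a b b
Rater3 a b c", header = TRUE, stringsAsFactors = FALSE)

# All unordered pairs of raters, named "RaterX - RaterY"
toCheck <- combn(rownames(annotations), 2, simplify = FALSE)
names(toCheck) <- sapply(toCheck, paste, collapse = " - ")

# Hypothetical helper: returns more than one value per pair
compare_pair <- function(x) {
  same <- unlist(annotations[x[1], ]) == unlist(annotations[x[2], ])
  data.frame(matches = sum(same), proportion = mean(same))
}

# lapply keeps one single-row data frame per pair; rbind stacks them
pair_results <- do.call(rbind, lapply(toCheck, compare_pair))
pair_results
```

Because the list is named, the pair labels should carry over as row names of `pair_results`.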
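Answer 1 (score: 0)
Apply a function, here f(x) := 2x+5, to all entries of a column that corresponds to the combinations. You can write your own function in place of f(x) := 2x+5:
Step 1: Design the specific data frame of combinations. (The following is for my own use case.)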
causalitycombinations <- function (nvars, ncausers, ndependents)
{
    # every choice of ncausers variables, one choice per column
    independents <- combn(nvars, ncausers)
    # number of dependent sets available for each choice of causers
    swingnumber <- dim(combn(nvars - ncausers, ndependents))[[2]]
    numberofallcombinations <- dim(combn(nvars, ncausers))[[2]] * swingnumber
    # for each choice of causers, list the dependent sets drawn from the remaining variables
    dependents <- matrix(, nrow = dim(combn(nvars, ncausers))[[2]] * swingnumber, ncol = ndependents)
    for (i in as.integer(1:dim(combn(nvars, ncausers))[[2]])) {
        dependents[(swingnumber * (i - 1) + 1):(swingnumber * i), ] <- t(combn(setdiff(seq(1:nvars), independents[, i]), ndependents))
    }
    # repeat each choice of causers once per dependent set
    swingedindependents <- matrix(, nrow = dim(combn(nvars, ncausers))[[2]] * swingnumber, ncol = ncausers)
    for (i in as.integer(1:dim(combn(nvars, ncausers))[[2]])) {
        for (j in as.integer(1:swingnumber)) {
            swingedindependents[(i - 1) * swingnumber + j, ] <- independents[, i]
        }
    }
    independentsdependents <- cbind(swingedindependents, dependents)
    # variables that are neither causers nor dependents fill the remaining columns
    others <- matrix(, nrow = dim(combn(nvars, ncausers))[[2]] * swingnumber, ncol = nvars - ncausers - ndependents)
    for (i in as.integer(1:((dim(combn(nvars, ncausers))[[2]]) * swingnumber))) {
        others[i, ] <- setdiff(seq(1:nvars), independentsdependents[i, ])
    }
    causalitiestemplate <- cbind(independentsdependents, others)
    causalitiestemplate
}
causalitycombinations(3,1,1)
# [,1] [,2] [,3]
#[1,] 1 2 3
#[2,] 1 3 2
#[3,] 2 1 3
#[4,] 2 3 1
#[5,] 3 1 2
#[6,] 3 2 1
Step 2: Attach the data to the combinations. (Multiple columns could be attached; for simplicity I add only one.)
set.seed(1)
mydataframer <- cbind(causalitycombinations(3,1,1), rnorm(6))
mydataframer
# [,1] [,2] [,3] [,4]
#[1,] 1 2 3 -0.6264538
#[2,] 1 3 2 0.1836433
#[3,] 2 1 3 -0.8356286
#[4,] 2 3 1 1.5952808
#[5,] 3 1 2 0.3295078
#[6,] 3 2 1 -0.8204684
Step 3: Apply the function via `lapply`, iterating over the rows of the combined matrix.
lapply(1: dim(mydataframer)[[1]], function(x) {2*mydataframer[x,4] + 5})
# 3.747092
# 5.367287
# 3.328743
# 8.190562
# 5.659016
# 3.359063
That's it.
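A small variation on Step 3, offered only as a sketch: `apply` with `MARGIN = 1` hands each row to the function as a vector, so the function can read several columns at once. The matrix `combos`, the function `f` and the result labels below are stand-ins invented for this example.

```r
set.seed(1)
# Stand-in for the combined matrix: three index columns plus one data column
combos <- cbind(
  c(1, 1, 2, 2, 3, 3),
  c(2, 3, 1, 3, 1, 2),
  c(3, 2, 3, 1, 2, 1),
  rnorm(6)
)

# f(x) = 2x + 5 applied to column 4, returned together with columns 1 and 2
f <- function(row) {
  c(independent = row[1], dependent = row[2], value = 2 * row[4] + 5)
}

# Each row is passed to f as a numeric vector; t() puts one result per row
results <- t(apply(combos, 1, f))
results

# For a purely element-wise function, the vectorised form gives the same values
vectorised <- 2 * combos[, 4] + 5
```

When the function only touches one column, the vectorised line at the end is the simpler choice; the row-wise form earns its keep once several columns are needed together.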
By the way, the `?irr::agree` help file states that the n*m ratings matrix/data frame is "n subjects m raters". So the asker's data would be better laid out as follows:
annotations <- read.table(text = "Rater1 Rater2 Rater3
Subject1 a b c
Subject2 a b b
Subject3 a b c", header = TRUE, stringsAsFactors = FALSE)
annotations
# Rater1 Rater2 Rater3
# Subject1 a b c
# Subject2 a b b
# Subject3 a b c
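With raters in columns, pairwise agreement becomes a matter of combining column names. The sketch below assumes the question's ratings transposed into that layout (the table above keeps the original cell positions, so its numbers would differ) and uses plain percentage agreement in base R rather than `irr::agree`. The names `ratings`, `rater_pairs` and `pct_agree` are chosen for this example only.

```r
# Question's ratings, laid out as subjects (rows) by raters (columns)
ratings <- data.frame(
  Rater1 = c("a", "b", "c"),
  Rater2 = c("a", "b", "b"),
  Rater3 = c("a", "b", "c"),
  row.names = c("Obj1", "Obj2", "Obj3"),
  stringsAsFactors = FALSE
)

# All unordered pairs of raters
rater_pairs <- combn(colnames(ratings), 2, simplify = FALSE)
names(rater_pairs) <- sapply(rater_pairs, paste, collapse = "-")

# Share of subjects on which the two raters give the same label, in percent
pct_agree <- sapply(rater_pairs, function(p) {
  100 * mean(ratings[[p[1]]] == ratings[[p[2]]])
})
round(pct_agree)
# Rater1-Rater2 Rater1-Rater3 Rater2-Rater3
#            67           100            67
```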
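In addition, it needs to be clarified whether the asker wants to iterate over all possible combinations of these annotations. If that is the case, i.e.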
annotations
# Rater1 Rater2 Rater3
# Subject1 a a a
# Subject2 a a a
# Subject3 a a a
annotations
# Rater1 Rater2 Rater3
# Subject1 a a b
# Subject2 a a a
# Subject3 a a a
annotations
# Rater1 Rater2 Rater3
# Subject1 a a c
# Subject2 a a a
# Subject3 a a a
annotations
# Rater1 Rater2 Rater3
# Subject1 a b a
# Subject2 a a a
# Subject3 a a a
# .... after consuming all Subject1 possibilities, this time consuming Subject2 possibilities,
annotations
# Rater1 Rater2 Rater3
# Subject1 a a a
# Subject2 a a b
# Subject3 a a a
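followed by the Subject3 possibilities, thereby collecting the agreement for every possibility, then the problem changes completely.
The `irr::agree` function is designed for multiple rows. Observe from its help file: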
data(video)
video
# rater1 rater2 rater3 rater4
# 1 4 4 3 4
# 2 4 4 4 5
# ..............................
# 20 4 5 5 4
agree(video) # Simple percentage agreement
# Percentage agreement (Tolerance=0)
# Subjects = 20; Raters = 4; %-agree = 35
agree(video, 1) # Extended percentage agreement
# Percentage agreement (Tolerance=1)
# Subjects = 20; Raters = 4; %-agree = 90
If the asker wants row-wise agreement (only 1 subject!), %-agree is always 0:
agree(video[1,])
# Percentage agreement (Tolerance=0)
# Subjects = 1; Raters = 4; %-agree = 0
...
agree(video[20,])
# Percentage agreement (Tolerance=0)
# Subjects = 1; Raters = 4; %-agree = 0