Question

I have the results of a test taken by a number of individuals at as many as four time periods. Here's a sample:

dat <- structure(list(Participant_ID = c("A", "A", "A", "A", "B", "B", 
"B", "B", "C", "C", "C", "C"), phase = structure(c(1L, 2L, 3L, 
4L, 1L, 2L, 3L, 4L, 1L, 2L, 3L, 4L), .Label = c("base", "sixmos", 
"twelvemos", "eighteenmos"), class = "factor"), result = c("Negative", 
"Negative", "Negative", "Negative", "Negative", "Positive", "Negative", 
NA, "Positive", "Indeterminate", "Negative", "Negative")), .Names = c("Participant_ID", 
"phase", "result"), row.names = c(1L, 2L, 3L, 4L, 97L, 98L, 99L, 
100L, 9L, 10L, 11L, 12L), class = c("cast_df", "data.frame"))

Which looks like:

    Participant_ID       phase        result
1                A        base      Negative
2                A      sixmos      Negative
3                A   twelvemos      Negative
4                A eighteenmos      Negative
97               B        base      Negative
98               B      sixmos      Positive
99               B   twelvemos      Negative
100              B eighteenmos          <NA>
9                C        base      Positive
10               C      sixmos Indeterminate
11               C   twelvemos      Negative
12               C eighteenmos      Negative

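I'd like to add an identifier to each test to note whether that test was a conversion from the prior state (negative to positive), a reversion (positive to negative), or stable. The catch is that I don't want to simply compare the base test to the six-month test, the six-month test to the 12-month test, and so on. In a case like C, the six-month test should be marked as stable or inconclusive (the exact terminology is unsettled) and, more importantly, the 12-month test should then be compared to the base test and marked as a reversion. Conversely, if someone had a sequence of "Negative", "Indeterminate", "Negative", that should be stable.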

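It's the latter part that I'm stuck on. If it were just a series of consecutive comparisons per participant, I'd be fine, but I'm having trouble thinking about how to deal elegantly with these variable comparison pairs. As always, your help is greatly appreciated.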

Answer 1

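I don't think you've outlined what should happen in all the possible cases (e.g., what is the status when the sequence is "Indeterminate", "Indeterminate"?), but here's an idea: treat the "Indeterminate" cases as missing and carry the last determinate result forward over them using na.locf from the zoo package. (Or, better yet, reimplement it to address your case.)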
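As a small illustration of the carry-forward idea, here is a sketch on participant C's sequence alone. The variable names (`res`, `code`, `filled`) are made up for this example, and `na.rm = FALSE` is passed so that the output keeps the same length as the input.

```r
library(zoo)

# Participant C's results, in phase order
res <- c("Positive", "Indeterminate", "Negative", "Negative")

# Code determinate results as numbers; names not in the lookup
# ("Indeterminate") come back as NA
code <- unname(c(Negative = 0, Positive = 1)[res])

# Carry the last determinate result forward over the NA
filled <- na.locf(code, na.rm = FALSE)

# Differences between consecutive filled values:
# -1 = reversion, 0 = stable, 1 = conversion
change <- diff(filled)
print(filled)
print(change)
```

Because the six-month NA is filled with the base value, the 12-month test ends up being compared against the base test. The code below applies the same idea per participant.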

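library(plyr)
library(zoo)  # provides na.locf
## sort so each participant's tests are in phase order
dat <- dat[with(dat, order(Participant_ID, phase)),]
dat <- ddply(dat, "Participant_ID", function(x) {
    ## have to figure out what to do with missing data (coded as 1000 here)
    ## na.rm = FALSE keeps a leading "Indeterminate" so the lengths still match
    result.fix <- na.locf(car::recode(x$result, "'Negative'=0; 'Positive'=1;'Indeterminate'=NA;NA=1000"), na.rm = FALSE)
    x$status <- NA
    ## change relative to the previous determinate result
    x$status[-1] <- result.fix[-1]-result.fix[-length(result.fix)]
    x$status <- car::recode(x$status, "-1='reversion'; 1='conversion'; 0='stable'; else=NA")
    ## %in% avoids NA subscripts when result itself is missing
    x$status[x$result %in% "Indeterminate"] <- "stable or inconclusive"
    x
})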

Not sure if this is elegant!
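For the "reimplement it" route, one possible base-R sketch follows. The helper names `carry_forward` and `label_status` and the data frame `dat2` are hypothetical, and the sketch makes one assumption that differs from the code above: a truly missing result is skipped in the same way as an "Indeterminate" one and gets an NA status, rather than being coded as 1000.

```r
# Hypothetical helper: carry the last non-NA value forward (base R only)
carry_forward <- function(v) {
  idx <- cumsum(!is.na(v))        # position of the most recent non-NA value
  c(NA, v[!is.na(v)])[idx + 1]    # idx 0 (leading NAs) maps to NA
}

# Hypothetical helper: label each test against the last determinate result.
# Assumes `result` is already in phase order for a single participant.
label_status <- function(result) {
  # "Indeterminate" and missing results both become NA (an assumption)
  code <- unname(c(Negative = 0, Positive = 1)[as.character(result)])
  filled <- carry_forward(code)
  prev <- c(NA, filled[-length(filled)])  # last determinate result before each test
  # code - prev is -1, 0 or 1; an NA index yields an NA status
  status <- c("reversion", "stable", "conversion")[code - prev + 2]
  status[result %in% "Indeterminate"] <- "stable or inconclusive"
  status
}

# The sample sequences from the question, already sorted by phase
dat2 <- data.frame(
  Participant_ID = rep(c("A", "B", "C"), each = 4),
  result = c("Negative", "Negative", "Negative", "Negative",
             "Negative", "Positive", "Negative", NA,
             "Positive", "Indeterminate", "Negative", "Negative"),
  stringsAsFactors = FALSE
)
dat2$status <- ave(dat2$result, dat2$Participant_ID, FUN = label_status)
print(dat2)
```

The first test of each participant has nothing to compare against, so its status is left as NA here. Whether that should instead read "stable", or something else, is a decision the question leaves open.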
