I have a database, and I am focusing on a matrix called DB that looks like this:
PN time.state.2 STATUS
[1,] 6954010001 0 3.0
[2,] 6954010001 3 3.5
[3,] 6954010001 6 3.5
[4,] 6954010001 9 3.5
[5,] 6954010001 12 3.5
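It contains many subjects, and each subject has several rows (one for each visit at which the patient's STATUS was recorded).
I would like to write a for loop that creates an object called "progress" whenever the same patient's STATUS value increases at a later visit.
I don't understand how to assign the patient's PN code to the index "i" so that the loop moves on to the next patient once one patient is finished.
For example, for a patient with these STATUS values at each time point given in time.state.2, I want the patient to be considered as progressed when the STATUS value is 1 point higher than at that patient's first time point (the first hospital visit). The progression must then be confirmed at the following visit. In the table below, the patient reaches 4.0 at time 6 (1 point above the 3.0 of the first visit) and the value is confirmed at the next visit, so the progression is confirmed.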
PN time.state.2 STATUS PROGRESSION
[1,] 6954010001 0 3.0 0
[2,] 6954010001 3 3.5 0
[3,] 6954010001 6 4.0 1
[4,] 6954010001 9 4.0 0
[5,] 6954010001 12 4.5 0
[6,] 6954010001 15 4.5 0
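I would also like PROGRESSION to be 1 only the first time for each patient, and possibly to drop that patient's subsequent rows once he has progressed. For example: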
PN time.state.2 STATUS PROGRESSION
[1,] 6954010001 0 3.0 0
[2,] 6954010001 3 3.5 0
[3,] 6954010001 6 4.0 1
[4,] 6954010002 0 6.0 0
[5,] 6954010002 3 6.0 0
Here the first patient's rows stop at PROGRESSION = 1.
Answer 0 (score: 1)
I believe you want something like this:
#create data
DF <- read.table(text=" PN time.state.2 STATUS
[1,] 6954010001 0 3.0
[2,] 6954010001 3 3.5
[3,] 6954010001 6 3.5
[4,] 6954010001 9 3.5
[5,] 6954010001 12 3.5
[6,] 6954010002 0 3.0
[7,] 6954010002 3 3.0
[8,] 6954010002 6 3.5
[9,] 6954010002 9 3.5
[10,] 6954010002 12 3.5",header=TRUE)
#you claim to have a matrix
m <- as.matrix(DF)
#turn the matrix into a data.frame
DF <- as.data.frame(m)
rownames(DF) <- NULL
#use package plyr to split according to patient,
#apply function, and combine back
library(plyr)
#calculate the cumulative sum of differences in STATUS
#put a 0 in front, since there can be no progress at the first time point
DF <- ddply(DF,.(PN),transform,progress=c(0,cumsum(diff(STATUS))))
print(DF)
# PN time.state.2 STATUS progress
# 1 6954010001 0 3.0 0.0
# 2 6954010001 3 3.5 0.5
# 3 6954010001 6 3.5 0.5
# 4 6954010001 9 3.5 0.5
# 5 6954010001 12 3.5 0.5
# 6 6954010002 0 3.0 0.0
# 7 6954010002 3 3.0 0.0
# 8 6954010002 6 3.5 0.5
# 9 6954010002 9 3.5 0.5
# 10 6954010002 12 3.5 0.5
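The cumulative sum of differences above equals each visit's STATUS minus the patient's first STATUS, so the same column can be computed without plyr. A minimal sketch using base R `ave`, assuming the data are already in a data frame whose rows are sorted by visit within each patient:

```r
# Same example data as above, built directly as a data frame.
DF <- data.frame(
  PN = rep(c(6954010001, 6954010002), each = 5),
  time.state.2 = rep(c(0, 3, 6, 9, 12), times = 2),
  STATUS = c(3.0, 3.5, 3.5, 3.5, 3.5, 3.0, 3.0, 3.5, 3.5, 3.5)
)

# ave() applies FUN to STATUS within each PN group and returns the
# results in the original row order.
DF$progress <- ave(DF$STATUS, DF$PN, FUN = function(s) s - s[1])
print(DF)
```

Because `ave` keeps the original row order, no separate recombination step is needed.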
#new example data in which STATUS rises by 1 point or more
DF <- read.table(text=" PN time.state.2 STATUS
[1,] 6954010001 0 3.0
[2,] 6954010001 3 3.5
[3,] 6954010001 6 4.0
[4,] 6954010001 9 3.5
[5,] 6954010001 12 6.0
[6,] 6954010002 0 3.0
[7,] 6954010002 3 4.0
[8,] 6954010002 6 4.0
[9,] 6954010002 9 6.0
[10,] 6954010002 12 6.0",header=TRUE)
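rownames(DF) <- NULL
#flag visits where STATUS is at least 1 point above the first visit
#and the next visit is also at least 1 point above the first visit
#(-Inf pads the last visit, which has no follow-up to confirm it)
DF <- ddply(DF,.(PN),transform,progress=(STATUS-STATUS[1])>=1 &
(c(STATUS[-1],-Inf)-STATUS[1])>=1)
#keep only the first TRUE for each patient
DF <- ddply(DF,.(PN),function(x) {x$progress[x$progress][-1] <- FALSE; x})
print(DF)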
# PN time.state.2 STATUS progress
# 1 6954010001 0 3.0 FALSE
# 2 6954010001 3 3.5 FALSE
# 3 6954010001 6 4.0 FALSE
# 4 6954010001 9 3.5 FALSE
# 5 6954010001 12 6.0 FALSE
# 6 6954010002 0 3.0 FALSE
# 7 6954010002 3 4.0 TRUE
# 8 6954010002 6 4.0 FALSE
# 9 6954010002 9 6.0 FALSE
# 10 6954010002 12 6.0 FALSE
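The question also asked for a plain for loop indexed by patient, and for the rows after a confirmed progression to be dropped. A minimal base-R sketch, assuming the same column names as above and that rows are already sorted by visit time within each patient. The helper name `flag_progression` and its `threshold` argument are illustrative and not part of the original answer:

```r
# Hypothetical helper: flag the first confirmed progression for one patient.
# A visit counts as confirmed when both it and the next visit are at least
# `threshold` points above the patient's first STATUS.
flag_progression <- function(status, threshold = 1) {
  above <- (status - status[1]) >= threshold
  # The last visit has no follow-up, so it can never be confirmed.
  confirmed <- above & c(above[-1], FALSE)
  flag <- integer(length(status))
  first <- which(confirmed)[1]
  if (!is.na(first)) flag[first] <- 1L
  flag
}

DF <- data.frame(
  PN = rep(c(6954010001, 6954010002), each = 5),
  time.state.2 = rep(c(0, 3, 6, 9, 12), times = 2),
  STATUS = c(3.0, 3.5, 4.0, 3.5, 6.0, 3.0, 4.0, 4.0, 6.0, 6.0)
)

pieces <- list()
for (i in unique(DF$PN)) {            # i takes each patient's PN in turn
  x <- DF[DF$PN == i, ]               # all visits of patient i
  x$PROGRESSION <- flag_progression(x$STATUS)
  hit <- which(x$PROGRESSION == 1L)
  # Drop the visits after the confirmed progression, if there is one.
  if (length(hit) == 1) x <- x[seq_len(hit), ]
  pieces[[length(pieces) + 1]] <- x
}
result <- do.call(rbind, pieces)
rownames(result) <- NULL
print(result)
```

Looping over `unique(DF$PN)` is what ties the index `i` to a patient code: each iteration handles one patient's rows and then moves on to the next patient.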