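I am trying to order the columns for each patient by the dates in those columns, in R. I made an example data set; however, it does not return dates but long numbers instead (not sure why). Forgive my silly way of creating the data frame :)...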
dd<-
data.frame(rbind(
c(as.POSIXct(as.Date("01/01/2008", format="%d/%m/%Y")),
as.POSIXct(as.Date("01/01/2009", format="%d/%m/%Y")),
as.POSIXct(as.Date("01/01/2011", format="%d/%m/%Y")),
as.POSIXct(as.Date("01/01/2010", format="%d/%m/%Y")))
,
c(as.POSIXct(as.Date("01/01/2002", format="%d/%m/%Y")),
as.POSIXct(as.Date("01/01/2001", format="%d/%m/%Y")),
as.POSIXct(as.Date("01/01/2006", format="%d/%m/%Y")),
as.POSIXct(as.Date("01/01/2004", format="%d/%m/%Y")))
))
dd$patient[1] <- 1
dd$patient[2] <- 2
names(dd) <- c("date1", "date2", "date3", "date4", "patient")
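On the long numbers: a likely cause is that rbind() builds a plain numeric matrix and drops the POSIXct class, so only the underlying seconds since 1970-01-01 survive. A minimal sketch of that effect and one way to restore the class; the names p, m and restored are illustrative and not from the original post:

```r
# A POSIXct value built the same way as in the question
p <- as.POSIXct(as.Date("01/01/2008", format = "%d/%m/%Y"))

# rbind() returns a bare numeric matrix: the class attribute is lost
m <- rbind(c(p, p))
is.numeric(m)            # TRUE
inherits(m, "POSIXct")   # FALSE

# The numbers are seconds since the epoch, so the class can be restored
restored <- as.POSIXct(m[1, ], origin = "1970-01-01", tz = "UTC")
```

Building the data frame column by column with data.frame() avoids the matrix step entirely.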
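What I am after is a list of column names for each patient, ordered by the dates in those columns. So:
Patient 1: date1, date2, date4, date3
Patient 2: date2, date1, date4, date3
Edit:
One more thing. What if a date is missing? For example: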
dd <- data.frame(
patient = 1:2,
date1 = as.Date(c("01/01/2008","01/01/2002"),format="%d/%m/%Y"),
date2 = as.Date(c("01/01/2009","01/01/2001"),format="%d/%m/%Y"),
date3 = as.Date(c("01/01/2011","01/01/2006"),format="%d/%m/%Y"),
date4 = as.Date(c("01/01/2010","01/01/2004"),format="%d/%m/%Y")
)
dd[2,2]<- NA
Matthew's answer gives:
> t(apply(dd, 1, function(x) c(x[1], names(x[-1])[order(x[-1])])))
patient
[1,] "1" "date1" "date2" "date4" "date3"
[2,] "2" "date2" "date4" "date3" "date1"
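So the column name of the missing data point is included in the ordered list, as if it were the latest date. But I'd rather it were left out, like this: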
patient
[1,] "1" "date1" "date2" "date4" "date3"
[2,] "2" "date2" "date4" "date3"
Answer 0: (score: 2)
This is a job for apply, which iterates over the rows of the data frame:
t(apply(dd, 1, function(x) c(x[length(x)], names(x)[order(x[-length(x)])])))
## patient
## [1,] "1" "date1" "date2" "date4" "date3"
## [2,] "2" "date2" "date1" "date4" "date3"
It would probably make more sense if patient were the first column rather than the last.
Using @thelatemail's definition of dd instead of yours:
t(apply(dd, 1, function(x) c(x[1], names(x[-1])[order(x[-1])])))
## patient
## [1,] "1" "date1" "date2" "date4" "date3"
## [2,] "2" "date2" "date1" "date4" "date3"
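One caveat: apply() first converts the data frame to a character matrix, so the dates are ordered as strings. That works here because the YYYY-MM-DD format sorts chronologically. A sketch that orders the underlying numeric values instead, assuming the layout with patient as the first column; date_cols and res are illustrative names, not part of the original answer:

```r
dd <- data.frame(
  patient = 1:2,
  date1 = as.Date(c("01/01/2008", "01/01/2002"), format = "%d/%m/%Y"),
  date2 = as.Date(c("01/01/2009", "01/01/2001"), format = "%d/%m/%Y"),
  date3 = as.Date(c("01/01/2011", "01/01/2006"), format = "%d/%m/%Y"),
  date4 = as.Date(c("01/01/2010", "01/01/2004"), format = "%d/%m/%Y")
)

# Every column except the patient identifier is treated as a date column
date_cols <- setdiff(names(dd), "patient")

# For each row, unlist() yields the dates as days since 1970-01-01,
# and order() on those numbers gives the chronological column order
res <- t(sapply(seq_len(nrow(dd)), function(i) {
  vals <- unlist(dd[i, date_cols])
  date_cols[order(vals)]
}))
rownames(res) <- dd$patient
res
```

Each row of res holds one patient's column names from earliest to latest date.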
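For the edited question, the result cannot be represented as a data frame or matrix unless you use NA for the missing values, which would be a reasonable choice. Instead, here is how to get a list as the return value, since a list can hold entries of variable length: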
apply(dd, 1, function(x) c(x[1], names(x[-1][!is.na(x[-1])])[order(x[-1][!is.na(x[-1])])]))
## [[1]]
## patient
## "1" "date1" "date2" "date4" "date3"
##
## [[2]]
## patient
## "2" "date2" "date4" "date3"
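The same result can probably be reached with less indexing by letting order() drop the missing values itself through its na.last = NA argument. A sketch under that assumption, with the edited data from the question; ordered_names is an illustrative name:

```r
dd <- data.frame(
  patient = 1:2,
  date1 = as.Date(c("01/01/2008", "01/01/2002"), format = "%d/%m/%Y"),
  date2 = as.Date(c("01/01/2009", "01/01/2001"), format = "%d/%m/%Y"),
  date3 = as.Date(c("01/01/2011", "01/01/2006"), format = "%d/%m/%Y"),
  date4 = as.Date(c("01/01/2010", "01/01/2004"), format = "%d/%m/%Y")
)
dd[2, "date1"] <- NA  # patient 2 has no date1

# One list entry per patient, so entries may differ in length
ordered_names <- lapply(seq_len(nrow(dd)), function(i) {
  vals <- unlist(dd[i, -1])              # named numeric vector of dates
  names(vals)[order(vals, na.last = NA)] # na.last = NA removes missing dates
})
names(ordered_names) <- dd$patient
ordered_names
```

Here the patient identifier is carried as the list name rather than as the first element of each vector.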
Answer 1: (score: 1)
Another attempt, using by:
dd <- data.frame(
patient = 1:2,
date1 = as.Date(c("01/01/2008","01/01/2002"),format="%d/%m/%Y"),
date2 = as.Date(c("01/01/2009","01/01/2001"),format="%d/%m/%Y"),
date3 = as.Date(c("01/01/2011","01/01/2006"),format="%d/%m/%Y"),
date4 = as.Date(c("01/01/2010","01/01/2004"),format="%d/%m/%Y")
)
by(dd,dd$patient,function(x) names(x[,order(x)]))
Resulting in:
dd$patient: 1
[1] "patient" "date1" "date2" "date4" "date3"
------------------------------------------------------------
dd$patient: 2
[1] "patient" "date2" "date1" "date4" "date3"
To get rid of the leading "patient" column name and show the patient id instead, this will work:
# Order only the date columns and index the date-column names with that order
by(dd, dd$patient, function(x) c(x[, 1], names(x)[-1][order(unlist(x[, -1]))]))
Resulting in:
dd$patient: 1
[1] "1" "date1" "date2" "date4" "date3"
------------------------------------------------------------------------------
dd$patient: 2
[1] "2" "date2" "date1" "date4" "date3"
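The object returned by by() is mostly convenient for printing. If the result is needed for further processing, one option is split() with lapply(), which returns a plain named list. This is a sketch of an alternative, not the answer's own method; per_patient and as_matrix are illustrative names:

```r
dd <- data.frame(
  patient = 1:2,
  date1 = as.Date(c("01/01/2008", "01/01/2002"), format = "%d/%m/%Y"),
  date2 = as.Date(c("01/01/2009", "01/01/2001"), format = "%d/%m/%Y"),
  date3 = as.Date(c("01/01/2011", "01/01/2006"), format = "%d/%m/%Y"),
  date4 = as.Date(c("01/01/2010", "01/01/2004"), format = "%d/%m/%Y")
)

# split() gives one single-row data frame per patient, named by patient id
per_patient <- lapply(split(dd, dd$patient), function(x) {
  vals <- unlist(x[, -1])   # dates as a named numeric vector
  names(vals)[order(vals)]  # column names from earliest to latest
})

# With no missing dates, all entries have equal length and can be bound
as_matrix <- do.call(rbind, per_patient)
as_matrix
```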