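Can anyone help me rearrange long data into wide data? It is complicated by linked results: the wide format should be identified by study number (SN), with each repeated result listed after the SN. I have shown an abbreviated table below; there are more results per patient, each repeating LabTest, LabDate, Result, Lower, Upper. I have tried melt and recast, and binding columns, but can't seem to get it to work. There are over 1000 results to reformat, so I can't enter them manually; I need to reshape the long data from an Excel document in R. Thanks.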
SN LabTest LabDate Result Lower Upper
TD62 Creat 05/12/2004 22 30 90
TD62 AST 06/12/2004 652 6 45
TD58 Creat 26/05/2007 72 30 90
TD58 Albumin 26/05/2005 22 25 35
TD14 AST 28/02/2007 234 6 45
TD14 Albumin 26/02/2007 15 25 35
The formatted data should look like this:
SN LabTCode LabDate Result Lower Upper LabCode LabDate Result Lower Upper
TD62 Creat 05/12/04 22 30 90 AST 06/12/04 652 6 45
TD58 Creat 26/05/05 72 30 90 Alb 26/05/05 22 25 35
TD14 AST 28/02/07 92 30 90 Alb 26/02/07 15 25 35
So far I have tried:
data_wide2 <- dcast(tdl, SN + LabDate ~ LabCode, value.var="Result")
and
melt(tdl, id = c("SN", "LabDate"), measured= c("Result", "Upper", "Lower"))
Answer 0 (score: 0)
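Your problem is that R won't like the final table because it has duplicate column names. Maybe you need the data in that format, but it is a bad way to store data, because it will be hard to turn the columns back into rows without a lot of manual work.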
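As a quick illustration of the duplicate-name problem, here is a minimal sketch in base R. The object names `fixed` and `dup` are made up for this example; the values are the first two results for TD62.

```r
# By default data.frame() repairs duplicated column names (check.names = TRUE)
fixed <- data.frame(Result = 22, Result = 652)
names(fixed)

# Duplicates can be forced, but name-based access then only reaches the first match
dup <- data.frame(Result = 22, Result = 652, check.names = FALSE)
names(dup)
dup$Result
```

With the default settings the second column gets renamed, and with `check.names = FALSE` the second `Result` can only be reached by position, which is why repeated headers are awkward to work with.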
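That said, if you want to do this, you need a new column to help you transpose the data.
I've used dplyr and tidyr below, which are worth looking at instead of reshape. They're by the same author but are more modern and designed as part of the 'tidyverse'.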
library(dplyr)
library(tidyr)
#Recreate your data (not doing this bit in your question is what got you downvoted)
df <- data.frame(
SN = c("TD62","TD62","TD58","TD58","TD14","TD14"),
LabTest = c("Creat","AST","Creat","Albumin","AST","Albumin"),
LabDate = c("05/12/2004","06/12/2004","26/05/2007","26/05/2005","28/02/2007","26/02/2007"),
Result = c(22,652,72,22,234,15),
Lower = c(30,6,30,25,6,25),
Upper = c(90,45,90,35,45,35),
stringsAsFactors = FALSE
)
output <- df %>%
group_by(SN) %>%
mutate(id_number = row_number()) %>% #create an id number to help with tracking the data as it's transposed
gather("key", "value", -SN, -id_number) %>% #flatten the data so that we can rename all the column headers
mutate(key = paste0("t",id_number, key)) %>% #add id_number to the column names. 't' for 'test' to start name with a letter.
select(-id_number) %>% #don't need id_number anymore
spread(key, value)
SN t1LabDate t1LabTest t1Lower t1Result t1Upper t2LabDate t2LabTest t2Lower t2Result t2Upper
<chr> <chr> <chr> <chr> <chr> <chr> <chr> <chr> <chr> <chr> <chr>
1 TD14 28/02/2007 AST 6 234 45 26/02/2007 Albumin 25 15 35
2 TD58 26/05/2007 Creat 30 72 90 26/05/2005 Albumin 25 22 35
3 TD62 05/12/2004 Creat 30 22 90 06/12/2004 AST 6 652 45
There you go. If you need the columns in a specific order, there may still be some sorting issues to deal with.
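If the column order matters, one possible way to handle it is sketched below. It swaps gather/spread for `pivot_wider()`, assuming tidyr 1.2.0 or later is installed (the `names_vary` argument was added in that release). The `t{id_number}{.value}` naming pattern just mirrors the names used above, and `wide` is a made-up object name. A side effect is that Result, Lower and Upper stay numeric instead of being converted to character.

```r
library(dplyr)
library(tidyr)

# Same sample data as above
df <- data.frame(
  SN = c("TD62", "TD62", "TD58", "TD58", "TD14", "TD14"),
  LabTest = c("Creat", "AST", "Creat", "Albumin", "AST", "Albumin"),
  LabDate = c("05/12/2004", "06/12/2004", "26/05/2007", "26/05/2005", "28/02/2007", "26/02/2007"),
  Result = c(22, 652, 72, 22, 234, 15),
  Lower = c(30, 6, 30, 25, 6, 25),
  Upper = c(90, 45, 90, 35, 45, 35),
  stringsAsFactors = FALSE
)

wide <- df %>%
  group_by(SN) %>%
  mutate(id_number = row_number()) %>%  # number the tests within each patient
  ungroup() %>%
  pivot_wider(
    names_from = id_number,
    values_from = c(LabTest, LabDate, Result, Lower, Upper),
    names_glue = "t{id_number}{.value}",  # e.g. t1LabTest, t2Result
    names_vary = "slowest"                # keep each test's five columns together
  )

names(wide)
```

The columns for each test then follow the order given in `values_from`, so changing that vector changes the layout.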