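Here is my problem. I am trying to track the status of the patients in my study on a daily basis. I have already written code that does this, and its output looks like this: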
P1 Waitlisted
P80 Lab Appointment
P19 Lab Appointment
P26 Waitlisted
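I am trying to work out how to compare the report I run today with the one I ran yesterday, so that I can quickly see any new patients that have appeared on the list and any existing patients that have been removed. So if, the next day, my data frame is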
P20 Waitlisted
P1 Waitlisted
P80 Lab Appointment
P19 Lab Appointment
P5 Lab Appointment
P26 Waitlisted
I would get the output:
P20 Waitlisted
P5 Lab Appointment
Or, if the next day's data frame is instead
P1 Waitlisted
P80 Lab Appointment
P80 Waitlisted
P19 Lab Appointment
P26 Waitlisted
the output would be:
P80 Waitlisted
I would also like to know when a patient who was on my list the previous day has been removed. For example, if I get output like
P1 Waitlisted
P80 Lab Appointment
P26 Waitlisted
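is there a way to find out that P19 Lab Appointment is no longer on my list today?
I tried the following code, but it only returns a logical vector, and I cannot tell which rows the TRUE and FALSE values refer to.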
>apply(apply(df1,2,`==`,df2),1,any)
[1] FALSE TRUE FALSE FALSE FALSE NA NA NA TRUE TRUE FALSE
FALSE FALSE FALSE TRUE NA TRUE FALSE FALSE
[20] FALSE NA NA FALSE FALSE TRUE FALSE FALSE FALSE NA TRUE
FALSE FALSE TRUE NA NA TRUE TRUE TRUE
[39] TRUE TRUE NA NA NA TRUE
Answer 0 (score: 2)
You can use an anti-join to get the differences between days. With data.table, you might do:
library(data.table)
setDT(df1); setDT(df2)
removed_patient_status <- df1[!df2, on = c("status", "patient")]
new_patient_status <- df2[!df1, on = c("status", "patient")]
removed_patient_status
#Empty data.table (0 rows) of 2 cols: patient,status
new_patient_status
# patient status
#1: P20 Waitlisted
#2: P5 Lab Appointment
Or with dplyr:
library(dplyr)
removed_patient_status <- anti_join(df1, df2, by = c("status", "patient"))
new_patient_status <- anti_join(df2, df1, by = c("status", "patient"))
Data:
df1 <- data.frame(patient = c("P1", "P80", "P19", "P26"), status = c("Waitlisted", "Lab Appointment", "Lab Appointment", "Waitlisted"), stringsAsFactors = FALSE)
df2 <- data.frame(patient = c("P20", "P1", "P80", "P19", "P5","P26"), status = c("Waitlisted", "Waitlisted", "Lab Appointment", "Lab Appointment", "Lab Appointment","Waitlisted"), stringsAsFactors = FALSE)
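The anti-joins above match on patient and status together, so a patient whose status changed shows up as one removed row and one new row. If you would rather see additions, removals, and status changes in a single table, a full outer join on patient alone is one possible approach. The following is only a sketch in base R: the day-2 data, the names `old`, `new`, `both`, and `changes`, and the `change` column are made up for illustration, and it assumes each patient appears at most once per report.

```r
# Day 1 report (same values as df1 above)
old <- data.frame(patient = c("P1", "P80", "P19", "P26"),
                  status  = c("Waitlisted", "Lab Appointment",
                              "Lab Appointment", "Waitlisted"),
                  stringsAsFactors = FALSE)

# Hypothetical day 2 report: P19 dropped, P5 added, P80 changed status
new <- data.frame(patient = c("P1", "P80", "P5", "P26"),
                  status  = c("Waitlisted", "Waitlisted",
                              "Lab Appointment", "Waitlisted"),
                  stringsAsFactors = FALSE)

# Full outer join on patient; a status missing on one side becomes NA
both <- merge(old, new, by = "patient", all = TRUE,
              suffixes = c("_old", "_new"))

# Label each patient by what happened between the two reports
both$change <- ifelse(is.na(both$status_old), "added",
               ifelse(is.na(both$status_new), "removed",
               ifelse(both$status_old != both$status_new,
                      "changed", "same")))

# Keep only the patients whose situation differs between the two days
changes <- both[both$change != "same", ]
print(changes)
```

Here `changes` should contain three rows: P19 as removed, P5 as added, and P80 as changed.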
Answer 1 (score: 0)
Regarding your first question:
df1 <- data.frame(P = c("P1","P80","P19","P26"), Status=c("Waitlisted","Lab Appointment", "Lab Appointment", "Waitlisted"))
df2 <- data.frame(P = c("P20","P1","P80","P19","P5","P26"), Status=c("Waitlisted","Waitlisted","Lab Appointment","Lab Appointment","Lab Appointment", "Waitlisted"))
df2[!(paste(df2$P, df2$Status) %in% paste(df1$P, df1$Status)),] # new patients (rows in df2 that are not in df1)
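Rows of the later report that are missing from the earlier one are the new patients; swapping the two data frames in the same expression should give the removed ones. A minimal sketch of both directions follows. Here `df3` holds the third report from the question (the one without P19), and `key()` is a small helper introduced only for this example, not part of any package.

```r
df1 <- data.frame(P = c("P1", "P80", "P19", "P26"),
                  Status = c("Waitlisted", "Lab Appointment",
                             "Lab Appointment", "Waitlisted"),
                  stringsAsFactors = FALSE)
df2 <- data.frame(P = c("P20", "P1", "P80", "P19", "P5", "P26"),
                  Status = c("Waitlisted", "Waitlisted", "Lab Appointment",
                             "Lab Appointment", "Lab Appointment", "Waitlisted"),
                  stringsAsFactors = FALSE)
# Third report from the question: P19 is no longer listed
df3 <- data.frame(P = c("P1", "P80", "P26"),
                  Status = c("Waitlisted", "Lab Appointment", "Waitlisted"),
                  stringsAsFactors = FALSE)

# Hypothetical helper: one "patient status" string per row
key <- function(d) paste(d$P, d$Status)

# In the later report but not in the earlier one
new_patients <- df2[!(key(df2) %in% key(df1)), ]

# In the earlier report but not in the later one
removed_patients <- df1[!(key(df1) %in% key(df3)), ]

print(new_patients)
print(removed_patients)
```

With this data, `new_patients` should hold P20 and P5, and `removed_patients` should hold P19. Note that a patient whose status changed appears in both results, once with the old status and once with the new one.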