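I have extracted some data from a denormalized MS Access database (it comes from a legacy system that was never queried before). Because the tables contain multiple repeated foreign keys, the data looks like the table below after the join.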
--------------------------------------------------------------------------------------------------------------------------
|PtID|ID. NO |Name |DOB |Sex|AdmDate |FK|Diagnosis |DiagCode|
|------------------------------------------------------------------------------------------------------------------------|
|8989|0099999G|John Smith|11/28/1930|M |3/6/2018 11:22 |8989|Atrial fibrillation and flutter |I48 |
|8989|0099999G|John Smith|11/28/1930|M |3/6/2018 11:22 |8989|Bacterial pneumonia, unspecified |J15.9 |
|8989|0099999G|John Smith|11/28/1930|M |3/6/2018 11:22 |8989|Intracardiac thrombosis |I51.3 |
--------------------------------------------------------------------------------------------------------------------------
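I am trying to unite() the Diagnosis and DiagCode columns, and that part works. The problem appears when I then use the spread() function in the code below.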
TRY <- MGW %>%
  gather(Diagnosis, DiagCode, -(PtID:DiagCode)) %>%
  group_by(PtID) %>%
  unite(temp, Diagnosis, DiagCode) %>%
  mutate(rn = paste0("Diagnosis", row_number())) %>%
  spread(rn, temp)
For some reason, Diagnosis2 is filled in on the row below, so the duplicate rows are not collapsed into one:
---------------------------------------------------------------------------------------------------------------------------------------------------------------------------
|PtID|ID. NO |Name |DOB |Sex|AdmDate |FK|Diagnosis1 |Diagnosis2 |Diagnosis3 |
|-------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
|8989|0099999G|John Smith|11/28/1930|M |3/6/2018 11:22 |8989|Atrial fibrillation and flutter_I48| |Intracardiac thrombosis_I51.3|
|8989|0099999G|John Smith|11/28/1930|M |3/6/2018 11:22 |8989| |Bacterial pneumonia, unspecified_J15.9| |
---------------------------------------------------------------------------------------------------------------------------------------------------------------------------
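To make the problem easier to reproduce, here is a minimal sketch that rebuilds the three sample rows and runs the unite/spread steps on them. Some assumptions: the column `IDNo` is a stand-in name for `ID. NO`, all values are typed as plain text or numbers rather than real date types, and the gather() step is left out because the sample data is already in long form. On this toy data, where every non-diagnosis column is identical across the three rows, the result should be a single row per patient.

```r
library(dplyr)
library(tidyr)

# Sample rows retyped from the first table; IDNo is an assumed name for "ID. NO".
MGW <- tibble(
  PtID      = 8989,
  IDNo      = "0099999G",
  Name      = "John Smith",
  DOB       = "11/28/1930",
  Sex       = "M",
  AdmDate   = "3/6/2018 11:22",
  FK        = 8989,
  Diagnosis = c("Atrial fibrillation and flutter",
                "Bacterial pneumonia, unspecified",
                "Intracardiac thrombosis"),
  DiagCode  = c("I48", "J15.9", "I51.3")
)

wide <- MGW %>%
  unite(temp, Diagnosis, DiagCode) %>%                 # "Diagnosis_DiagCode" in one column
  group_by(PtID) %>%
  mutate(rn = paste0("Diagnosis", row_number())) %>%   # number the diagnoses per patient
  ungroup() %>%
  spread(rn, temp)                                     # one column per diagnosis number

print(wide)
```

If the real data still splits into several rows with this pipeline, one possible explanation (only a guess, not confirmed here) is that some other column differs between the rows in a way that is not visible in the printout, for example an AdmDate value with hidden seconds, since spread() only merges rows whose remaining columns are exactly equal.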
I would appreciate any help. Thanks in advance.