Error when reshaping data

Time: 2017-10-16 20:54:04

Tags: r

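My data is very wide, so I am reshaping it to long format in order to analyze it. The data is not hierarchical, and it is complicated enough that giving you a working example is hard, so I know this will be difficult to answer.

In any case, I need to reshape it three times in a row.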

longdata = reshape(widedata,direction="long",varying=Issue,idvar = "Issue.ID")
longdata = reshape(longdata,direction="long",varying=Resolution)
longdata= reshape(longdata,direction="long", varying=Equipment)

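The first two calls work. The vector for the third reshape is set up exactly the same way as the first two, so the problem is not something odd about that vector: I can change the order of the calls and the error is still thrown on whichever reshape comes third.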

Error in `row.names<-.data.frame`(`*tmp*`, value = paste(ids, times[i],  : 
duplicate 'row.names' are not allowed  

I tried removing the row names like this:

longdata = reshape(widedata,direction="long",varying=Issue,idvar = "Issue.ID")
rownames(longdata) <- NULL
longdata = reshape(longdata,direction="long",varying=Resolution)
rownames(longdata) <- NULL
longdata= reshape(longdata,direction="long", varying=Equipment)  

But I still get the same error. What do I need to do to make this work?

Edit:

I will try to provide some sample data. This may make the post very long, sorry.

Issue.ID = c("CBICR1Q2201704000", "CBICR1Q2201704001", 
"CBICR1Q2201704002", "CBICR1Q2201704003", "CBICR1Q2201704004", 
"CBICR1Q2201704005", "CBICR1Q2201704006", "CBICR1Q2201704007", 
"CBICR1Q2201704008", "CBICR1Q2201704009", "CBICR1Q2201704010", 
"CBICR1Q2201704011", "CBICR1Q2201704012", "CBICR1Q2201704013", 
"CBICR1Q2201704014", "CBICR1Q2201704015", "CBICR1Q2201704016", 
"CBICR1Q2201704017", "CBICR1Q2201704018", "CBICR1Q2201704019")
Issue.1 = c("Difficulty receiving products in general", 
"Supplier compliance issues", "Supplier fraud, waste, or abuse", 
"Difficulty receiving products in general", "Difficulty receiving products in general", 
"Supplier fraud, waste, or abuse", "Supplier service issues", 
"Problems repairing due to service issues ", "Problems repairing due to service issues ", 
"Other", "Billing, coverage, coordination of benefits", "Problems repairing due to service issues ", 
"Difficulty receiving products in general", "Difficulty receiving products in general", 
"Low quantity/quality", "Difficulty receiving products in general", 
"Difficulty receiving products in general", "Supplier service issues", 
"Problems repairing due to service issues ", "Problems repairing due to service issues ")
Issue.2 = c("Supplier compliance issues", "Billing, coverage, coordination of benefits", 
"Supplier service issues", "Supplier service issues", "Low quantity/quality", NA, "DMEPOS information issues", "Supplier fraud, waste, or abuse", 
"Supplier compliance issues", NA, "DMEPOS information issues", 
"Supplier compliance issues", "Supplier compliance issues", "Supplier service issues", 
"Supplier service issues", "Supplier service issues", "Supplier service issues", 
"DMEPOS information issues", NA, "Supplier compliance issues")

Equipment.1 = c("Oxygen Supplies/Equipment", 
"Continuous Positive Airway Pressure (CPAP) / Respiratory Assist Device (RAD)", 
"Nebulizers", "Lifts", "Oxygen Supplies/Equipment", "Walking Aids", 
"Power Mobility Devices (PMDs) other than scooter", "Power Mobility Devices (PMDs) other than scooter", 
"Continuous Positive Airway Pressure (CPAP) / Respiratory Assist Device (RAD)", 
"Continuous Positive Airway Pressure (CPAP) / Respiratory Assist Device (RAD)", 
"Continuous Positive Airway Pressure (CPAP) / Respiratory Assist Device (RAD)", 
"Walking Aids", "Hospital beds", "Power Mobility Devices (PMDs) other than scooter", 
"Oxygen Supplies/Equipment", "Hospital beds", "Oxygen Supplies/Equipment", 
"Continuous Positive Airway Pressure (CPAP) / Respiratory Assist Device (RAD)", 
"Power Mobility Devices (PMDs) other than scooter", "Power Mobility Devices (PMDs) other than scooter"
)
Equipment.2 = c(NA_character_, NA_character_, NA_character_, 
NA_character_, NA_character_, NA_character_, NA_character_, 
NA_character_, NA_character_, NA_character_, NA_character_, 
NA_character_, NA_character_, NA_character_, NA_character_, 
NA_character_, NA_character_, NA_character_, NA_character_, 
NA_character_)

Resolution.1 = c("Current supplier resolved the issue", 
"Current supplier resolved the issue", "Current supplier resolved the issue", 
"Supplier educated about inquiry\n", "Beneficiary educated about inquiry ", 
"Supplier educated about inquiry\n", "Beneficiary educated about DMEPOS\n", 
"Beneficiary educated about inquiry ", "Beneficiary educated about inquiry ", 
"Beneficiary educated about inquiry ", "Beneficiary educated about suppliers", 
"The case unresolved ", "The case unresolved ", "Beneficiary educated about DMEPOS\n", 
"Current supplier resolved the issue", "Current supplier resolved the issue", 
"Beneficiary educated about DMEPOS\n", "Beneficiary educated about suppliers", 
"New supplier found ", "Beneficiary educated about suppliers"
)
Resolution.2 = c(NA, NA, NA, "Current supplier resolved the issue", 
NA, "Reimbursement or refund ", "Supplier educated about DMEPOS_x000D_\n", 
"Beneficiary educated about suppliers", "Beneficiary educated about DMEPOS\n", 
"Current supplier resolved the issue", "New supplier found ", 
"Beneficiary educated about DMEPOS\n", NA, "Beneficiary educated about suppliers", 
"Beneficiary educated about inquiry ", "Supplier educated about inquiry_x000D_\n", 
"Beneficiary educated about inquiry ", "New supplier found ", 
NA, "Supplier educated about inquiry\n")

widedata<-data.frame(Issue.ID,Issue.1,Issue.2,Resolution.1,Resolution.2,Equipment.1,Equipment.2)
Issue <- c("Issue.1","Issue.2")
Equipment <- c("Equipment.1","Equipment.2")
Resolution <- c("Resolution.1","Resolution.2")

1 answer:

Answer 0 (score: 0)

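I think we can do this with melt from the data.table package. It looks to me as though Issue, Equipment, and Resolution belong together, so we define the measure columns with regex patterns so that everything is grouped correctly. The value names simply rename the melted columns.

Doing this, we end up with one column each for Issue, Equipment, and Resolution, and each of Issue.1, Issue.2, etc. becomes a row.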

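library(data.table)

# Convert widedata to a data.table by reference
setDT(widedata)

# One pattern per group of columns; value.name gives the melted column names
df1 <- melt(widedata, id.vars = "Issue.ID",
            measure.vars = patterns("^Issue\\.\\d+", "^Equipment.*", "^Resolution.*"),
            value.name = c("Issue", "Equipment", "Resolution"))[order(Issue.ID)]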

head(df1)

            Issue.ID variable                                       Issue                                                                    Equipment                          Resolution
1: CBICR1Q2201704000        1    Difficulty receiving products in general                                                    Oxygen Supplies/Equipment Current supplier resolved the issue
2: CBICR1Q2201704000        2                  Supplier compliance issues                                                                           NA                                  NA
3: CBICR1Q2201704001        1                  Supplier compliance issues Continuous Positive Airway Pressure (CPAP) / Respiratory Assist Device (RAD) Current supplier resolved the issue
4: CBICR1Q2201704001        2 Billing, coverage, coordination of benefits                                                                           NA                                  NA
5: CBICR1Q2201704002        1             Supplier fraud, waste, or abuse                                                                   Nebulizers Current supplier resolved the issue
6: CBICR1Q2201704002        2                     Supplier service issues                                                                           NA                                  NA
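
To make the pairing of columns easier to see, here is a minimal sketch on a made-up two-row table that uses the same naming scheme as widedata. The object names toy and toy_long and the short values such as "i1" are assumptions for illustration only, not part of the original data.

```r
library(data.table)

# Hypothetical two-row table mirroring the column naming of widedata
toy <- data.table(
  Issue.ID     = c("A", "B"),
  Issue.1      = c("i1", "i2"),
  Issue.2      = c("i3", NA),
  Equipment.1  = c("e1", "e2"),
  Equipment.2  = c(NA_character_, NA_character_),
  Resolution.1 = c("r1", "r2"),
  Resolution.2 = c(NA, "r3")
)

# Each pattern selects one group of columns; columns sharing a suffix
# (.1, .2) end up in the same output row
toy_long <- melt(
  toy,
  id.vars      = "Issue.ID",
  measure.vars = patterns("^Issue\\.\\d+$", "^Equipment\\.", "^Resolution\\."),
  value.name   = c("Issue", "Equipment", "Resolution")
)

# One row per Issue.ID and suffix
nrow(toy_long)
names(toy_long)
```

The variable column records which suffix a row came from, so row "B" with variable 2 carries Issue.2, Equipment.2, and Resolution.2 for that ID.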

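Also note that melt was originally a reshape2 function, but data.table has implemented a version with more features; in this case we are taking advantage of its ability to define the measure columns with patterns.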
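
For anyone who wants to stay with base reshape, the error in the question appears to come from the default idvar = "id". The second call creates an id column that repeats once per time point, and the third call then reuses it to build row names. The following is a sketch under that assumption, again on a made-up two-row data frame. The names issue_n, res_row, res_n, eq_row, and eq_n are invented for this example.

```r
# Hypothetical two-row data frame using the question's column naming
toy <- data.frame(
  Issue.ID     = c("A", "B"),
  Issue.1      = c("i1", "i2"),
  Issue.2      = c("i3", NA),
  Resolution.1 = c("r1", "r2"),
  Resolution.2 = c(NA, "r3"),
  Equipment.1  = c("e1", "e2"),
  Equipment.2  = c(NA_character_, NA_character_),
  stringsAsFactors = FALSE
)

# With the defaults, the second call adds an "id" column that is repeated
# for every time point, so it no longer identifies rows uniquely
step1 <- reshape(toy, direction = "long",
                 varying = c("Issue.1", "Issue.2"), idvar = "Issue.ID")
step2 <- reshape(step1, direction = "long",
                 varying = c("Resolution.1", "Resolution.2"))
has_dup_id <- anyDuplicated(step2$id) > 0

# Possible workaround: give every call its own idvar and timevar names
s1 <- reshape(toy, direction = "long",
              varying = c("Issue.1", "Issue.2"),
              idvar = "Issue.ID", timevar = "issue_n")
s2 <- reshape(s1, direction = "long",
              varying = c("Resolution.1", "Resolution.2"),
              idvar = "res_row", timevar = "res_n")
s3 <- reshape(s2, direction = "long",
              varying = c("Equipment.1", "Equipment.2"),
              idvar = "eq_row", timevar = "eq_n")

nrow(s3)
```

Note that chaining three reshapes doubles the row count at each step, so every Issue is crossed with every Resolution and every Equipment. The melt approach instead keeps columns with the same suffix together in one row, which may be closer to what the analysis needs.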