Spreading data grouped by ID but with differing observations

Time: 2020-05-20 20:07:30

Tags: r dplyr spread

I have the following data:

drugData <- data.frame(caseID=c(9, 9, 10, 11, 12, 12, 12, 12, 13, 45, 45, 225),
                       Drug=c("Cocaine", "Cocaine", "DPT", "LSD", "Cocaine", "LSD", "Heroin", "Heroin", "LSD", "DPT", "DPT", "Heroin"),
                       County=c("A", "A", "B", "C", "D", "D", "D", "D", "E", "F", "F", "G"),
                       Date=c(2009, 2009, 2009, 2009, 2011, 2011, 2011, 2011, 2010, 2010, 2010, 2005))

The rows sharing a caseID make up one case, which may contain observations of the same drug throughout or of different drugs. I would like these data in wide format, with one row per case.

I tried using the dplyr spread function, but I can't quite get it to work. Thanks!

1 Answer:

Answer 0: (score: 0)

After creating a sequence column based on 'caseID', we can convert the data to wide format:

library(dplyr)
library(tidyr)
library(stringr)
library(data.table)  # provides rowid()
drugData %>%
    mutate(nm = str_c('Drug', rowid(caseID))) %>%
    pivot_wider(names_from = nm, values_from = Drug)
#A tibble: 7 x 7
#  caseID County  Date Drug1   Drug2   Drug3  Drug4
#   <dbl> <fct>  <dbl> <fct>   <fct>   <fct>  <fct>
#1      9 A       2009 Cocaine Cocaine <NA>   <NA>
#2     10 B       2009 DPT     <NA>    <NA>   <NA>
#3     11 C       2009 LSD     <NA>    <NA>   <NA>
#4     12 D       2011 Cocaine LSD     Heroin Heroin
#5     13 E       2010 LSD     <NA>    <NA>   <NA>
#6     45 F       2010 DPT     DPT     <NA>   <NA>
#7    225 G       2005 Heroin  <NA>    <NA>   <NA>

Or use spread (not recommended, since spread has been superseded by pivot_wider):

drugData %>%
    mutate(nm = str_c('Drug', rowid(caseID))) %>%
    spread(nm, Drug)

Or use data.table.
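The data.table variant is not spelled out here. The following is only a minimal sketch, assuming the same sequence-column idea is carried over to `dcast()`. The names `dt` and `wide` are illustrative and do not come from the answer.

```r
library(data.table)

# Same example data as in the question
drugData <- data.frame(
  caseID = c(9, 9, 10, 11, 12, 12, 12, 12, 13, 45, 45, 225),
  Drug   = c("Cocaine", "Cocaine", "DPT", "LSD", "Cocaine", "LSD",
             "Heroin", "Heroin", "LSD", "DPT", "DPT", "Heroin"),
  County = c("A", "A", "B", "C", "D", "D", "D", "D", "E", "F", "F", "G"),
  Date   = c(2009, 2009, 2009, 2009, 2011, 2011, 2011, 2011, 2010, 2010, 2010, 2005)
)

# Work on a data.table copy so the original data.frame is left untouched
dt <- as.data.table(drugData)

# Sequence column: Drug1, Drug2, ... numbered within each caseID
dt[, nm := paste0("Drug", rowid(caseID))]

# Cast to wide format: one row per case, one column per sequence label
wide <- dcast(dt, caseID + County + Date ~ nm, value.var = "Drug")
print(wide)
```

Creating the `nm` column first, rather than computing it inside the formula, keeps the `dcast()` call simple and makes the intermediate step easy to inspect.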