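Posting a second question because my first one was marked as a duplicate. I apologize if there is already a question that addresses this specific problem.
I start with a data frame like this: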
dat<-data.frame(
ID=c(100,101,101,101,102,103),
DEGREE=c("BA","BA","MS","PHD","BA","BA"),
YEAR=c(1980,1990, 1992, 1996, 2000, 2004))
> dat
ID DEGREE YEAR
100 BA 1980
101 BA 1990
101 MS 1992
101 PHD 1996
102 BA 2000
103 BA 2004
ID 101 earned a BA in 1990, an MS in 1992, and a PhD in 1996.
I want to reshape this data frame into a wide format that ends up looking like this:
ID DEGREE_1 DEGREE_2 DEGREE_3 YEAR_DEGREE_1 YEAR_DEGREE_2 YEAR_DEGREE_3
100 BA 1980
101 BA MS PHD 1990 1992 1996
102 BA 2000
103 BA 2004
With help from the answers to the original question, I tried the following code to create the new data frame:
dat$DEGREE<-as.character(dat$DEGREE)
dat %>% group_by(ID) %>%
mutate(DegreeNum = paste("Degree", row_number(), sep = "_"))%>%
mutate(DegreeYear = paste("YearDegree", row_number(), sep = "_"))%>%
spread(DegreeNum, DEGREE, fill = "")%>%
spread(DegreeYear,YEAR,fill="")%>%
as.data.frame()
ID Degree_1 Degree_2 Degree_3 YearDegree_1 YearDegree_2 YearDegree_3
100 BA 1980
101 PHD 1996
101 MS 1992
101 BA 1990
102 BA 2000
103 BA 2004
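That is as far as I have been able to get, but I can't figure out how to reshape it so that everything for ID 101 ends up on a single row. Any help would be appreciated.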
Answer 0 (score: 0)
This isn't hard with the tidyverse (dplyr here, combined with base R's reshape())...
library(dplyr)  # provides %>%, select(), group_by(), mutate(), row_number(), inner_join()
df<-data.frame(ID=c(100,101,101,101,102,103),
DEGREE=c("BA","BA","MS","PHD","BA","BA"),
YEAR=c(1980,1990, 1992, 1996, 2000, 2004),
stringsAsFactors=FALSE)
df1 <- df %>% select(-3) %>% group_by(ID) %>% mutate(i=row_number()) %>%
as.data.frame() %>%
reshape(direction="wide",idvar="ID",v.names="DEGREE",timevar="i",sep="_")
df1[is.na(df1)] <- ""
df2 <- df %>% select(-2) %>% group_by(ID) %>% mutate(i=row_number()) %>%
as.data.frame() %>%
reshape(direction="wide",idvar="ID",v.names="YEAR",timevar="i",sep="_")
df2[is.na(df2)] <- ""
inner_join(df1,df2,"ID")
# ID DEGREE_1 DEGREE_2 DEGREE_3 YEAR_1 YEAR_2 YEAR_3
#1 100 BA 1980
#2 101 BA MS PHD 1990 1992 1996
#3 102 BA 2000
#4 103 BA 2004
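Because reshape() accepts more than one column in v.names, the two reshapes and the join above can probably be collapsed into a single call. The following is a minimal sketch in base R only, not the answer's own method: it swaps the dplyr row numbering for ave(), and the names `i`, `n`, and `wide` are chosen here purely for illustration. It also leaves missing cells as NA instead of "", which keeps the YEAR columns numeric.

```r
# Same example data as in the answer
df <- data.frame(ID = c(100, 101, 101, 101, 102, 103),
                 DEGREE = c("BA", "BA", "MS", "PHD", "BA", "BA"),
                 YEAR = c(1980, 1990, 1992, 1996, 2000, 2004),
                 stringsAsFactors = FALSE)

# Number each ID's rows 1, 2, 3, ... in their original order
# (base R stand-in for group_by(ID) %>% mutate(i = row_number()))
df$i <- ave(seq_along(df$ID), df$ID, FUN = seq_along)

# Widen both value columns in one reshape() call
wide <- reshape(df, direction = "wide", idvar = "ID",
                timevar = "i", v.names = c("DEGREE", "YEAR"), sep = "_")

# reshape() interleaves the columns (DEGREE_1, YEAR_1, DEGREE_2, ...),
# so reorder them to put all DEGREE_* columns before all YEAR_* columns
n <- max(df$i)
wide <- wide[, c("ID",
                 paste("DEGREE", seq_len(n), sep = "_"),
                 paste("YEAR", seq_len(n), sep = "_"))]
rownames(wide) <- NULL
wide
```

If blank strings are preferred for display, `wide[is.na(wide)] <- ""` can be applied afterwards, as in the answer, at the cost of turning the affected YEAR columns into character.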