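How can I efficiently convert rows to columns?
This question has been asked before here, but that solution does not work on a larger data frame. I tried to build larger sample data, but it is not quite like my real data: my real data does not repeat over and over, and it is 6 columns by 4,000 rows. The solution shown below works on the mock example but not on my actual data.
The repetition I mean is the situation shown below, where several rows differ in only one column (here, the "description_9" column). I want to condense the data so that, instead of several rows that differ in a single column, that column is split across multiple columns. As far as I know, this could take up to 5 columns.
Any help would be greatly appreciated! Even help making a more realistic example would be great!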
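To make the target shape concrete, here is a minimal base R sketch on a made-up three-row data frame. The names `toy`, `key`, and `wide` are illustrative only and are not from my real data; the point is just that rows which agree on every key column collapse into one row, with `description_9` spread across numbered columns.

```r
# Hypothetical toy input: rows 1 and 2 differ only in description_9.
toy <- data.frame(
  sub_category  = c("disease A", "disease A", "disease B"),
  code_orange   = c("x24", "x24", "j897"),
  description_9 = c("mucous / mucous", "non maternal fetus", "hello"),
  stringsAsFactors = FALSE
)

# Number the rows within each group of otherwise-identical rows.
key <- c("sub_category", "code_orange")
toy$time <- ave(seq_len(nrow(toy)), toy[key], FUN = seq_along)

# One row per group; description_9 spreads into description_91, description_92.
wide <- reshape(toy, direction = "wide", idvar = key,
                timevar = "time", v.names = "description_9", sep = "")
print(wide)
```

Groups with fewer rows than the largest group get `NA` in the unused columns, which is why the later step replaces `NA` with blanks.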
#fake data
condition_group <- c("nervous system", "nervous system",
"nervous system","circulatory system problems",
"circulatory system problems")
sub_category <- c("disease A", "disease A",
"disease A","disease B", "disease C")
code_red <- c(1,1,1,2,3)
code_orange <- c("x24", "x24","x24", "j897", "23a")
description_9 <- c("mucous / mucous", "non maternal fetus", "third", "hello", "NA")
description_10 <- c("NA", "NA", "NA", "NA", "blue")
df <- cbind.data.frame(condition_group, sub_category,
code_red, code_orange, description_9, description_10)
print(df)
#reshape data
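df$fake_id <- seq_len(nrow(df))
# id columns: everything except the row counter and the column being spread
idcns <- names(df)[!names(df) %in% c('fake_id', 'description_9')]
# number the rows within each id group, then spread description_9 across columns
reformatted_data <- reshape(
  transform(df, fake_id = NULL,
            time = ave(df$fake_id, df[idcns], FUN = seq_along)),
  direction = 'wide', idvar = idcns, sep = '')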
#change na to blanks
df2 <- reformatted_data
df2 <- sapply(df2, as.character) # since your values are `factor`
df2[is.na(df2)] <- ""
print(df2)
# this doesn't work with lots of data
larger_data <- rbind(df, df, df,df,df,df,df,df,df,df,df,df,df,df,df,df,df,df,df,df,df,df,df,df,df,df,df,df,df,df,df,df,df,df,df,df)
larger_data <-rbind(larger_data, larger_data, larger_data, larger_data,
larger_data, larger_data, larger_data, larger_data, larger_data, larger_data, larger_data, larger_data,larger_data, larger_data, larger_data, larger_data,larger_data, larger_data, larger_data, larger_data)
df_larger <- rbind(larger_data, larger_data, larger_data, larger_data,larger_data, larger_data, larger_data, larger_data, larger_data)
#reshape data
df_larger$fake_id <- seq_len(nrow(df_larger))
idcns <- names(df_larger)[!names(df_larger) %in% c('fake_id', 'description_9')]
reformatted_data <- reshape(
  transform(df_larger, fake_id = NULL,
            time = ave(df_larger$fake_id, df_larger[idcns], FUN = seq_along)),
  direction = 'wide', idvar = idcns, sep = '')
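
One thing that may explain why stacked copies behave differently from non-repeating data: as far as I understand `reshape`, the wide format gets one `description_9` column per distinct `time` value, so the largest group of otherwise-identical rows sets the number of new columns. Below is a minimal sketch for checking that count before reshaping. `count_wide_cols`, `small`, and `stacked` are hypothetical names used only for illustration; none of them is part of base R or of my real data.

```r
# Hypothetical helper: how many spread columns would a wide reshape create?
# It equals the size of the largest group of rows that agree on all key columns.
count_wide_cols <- function(data, key_cols) {
  group_pos <- ave(seq_len(nrow(data)), data[key_cols], FUN = seq_along)
  max(group_pos)
}

small <- data.frame(
  id   = c("a", "a", "a", "b"),
  desc = c("x", "y", "z", "w"),
  stringsAsFactors = FALSE
)
stacked <- rbind(small, small, small)  # stacking copies inflates every group

count_wide_cols(small, "id")    # 3
count_wide_cols(stacked, "id")  # 9
```

If that reading is right, data built by stacking copies of `df` would ask for thousands of `description_9` columns, whereas real data with at most 5 differing rows per group should need only 5.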