For example, this is my data:
age gender education previous_comp_exp tutorial_time
19 Male Undergraduate casual gamer 37.93436
qID time_taken time_to_first_interaction num_of_interactions answered_correctly
sor0 26.27869 14.82285 4 TRUE
sor5 23.53426 7.422562 8 TRUE
sor4 24.84502 10.41148 4 TRUE
I want to convert it to this:
age gender education previous_comp_exp tutorial_time qID.1 time_taken.1 time_to_first_interaction.1 num_of_interactions.1 answered_correctly.1 qID.2 time_taken.2 time_to_first_interaction.2 num_of_interactions.2 answered_correctly.2 qID.3 time_taken.3 time_to_first_interaction.3 num_of_interactions.3 answered_correctly.3
19 Male Undergraduate casual gamer 37.93436 sor0 26.27869 14.82285 4 TRUE sor5 23.53426 7.422562 8 TRUE sor4 24.84502 10.41148 4 TRUE
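In short, I want line 3 (the second header, starting with qID) to be repeated once for each row below it and appended to the header. Then I want each line below line 3 to be moved onto the first data line, under its matching set of repeated columns.
Any ideas on where to start?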
Answer 0 (score: 1)
Here is one option with dcast, after subsetting from the 3rd row onward and setting the column names from the 2nd row:
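library(data.table)
# rows 3 onward hold the per-question data; row 2 supplies their column names
df2 <- setNames(df1[3:nrow(df1),], unlist(df1[2,]))
# constant grouping column so that everything is cast into a single row
setDT(df2)[, ind := 1]
# cast to wide: one column per variable and row index (qID.1, qID.2, ...)
r1 <- dcast(df2, ind ~ rowid(ind),
value.var = setdiff(names(df2), 'ind'), sep='.')[, ind := NULL][]
# reorder the columns by numeric suffix and prepend the first row of df1
res <- data.table(df1[1,], r1[, order(as.numeric(sub(".*\\.", "", names(r1)))), with = FALSE])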
res
# age gender education previous_comp_exp tutorial_time qID.1 time_taken.1 time_to_first_interaction.1 num_of_interactions.1 answered_correctly.1 qID.2
#1: 19 Male Undergraduate casual gamer 37.93436 sor0 26.27869 14.82285 4 TRUE sor5
# time_taken.2 time_to_first_interaction.2 num_of_interactions.2 answered_correctly.2 qID.3 time_taken.3 time_to_first_interaction.3 num_of_interactions.3
#1: 23.53426 7.422562 8 TRUE sor4 24.84502 10.41148 4
# answered_correctly.3
#1: TRUE
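
For comparison, the same reshaping can be sketched in base R, without data.table. This is only a minimal sketch under assumptions: the question does not show how df1 is stored, so the construction of df1 below (all columns character, row 1 the participant values, row 2 the second header, rows 3 onward the per-question rows) is a guess at the layout the answer relies on, and the names hdr, body, flat and res_base are made up for illustration.

```r
# Assumed layout of df1 (not shown in the question): every column is character,
# row 1 = participant values, row 2 = second header, rows 3+ = question rows.
df1 <- data.frame(
  age               = c("19", "qID", "sor0", "sor5", "sor4"),
  gender            = c("Male", "time_taken", "26.27869", "23.53426", "24.84502"),
  education         = c("Undergraduate", "time_to_first_interaction",
                        "14.82285", "7.422562", "10.41148"),
  previous_comp_exp = c("casual gamer", "num_of_interactions", "4", "8", "4"),
  tutorial_time     = c("37.93436", "answered_correctly", "TRUE", "TRUE", "TRUE"),
  stringsAsFactors  = FALSE
)

hdr  <- unlist(df1[2, ], use.names = FALSE)   # second header (qID, time_taken, ...)
body <- as.matrix(df1[3:nrow(df1), ])         # per-question rows as a character matrix

# Transposing first makes as.vector() walk the original rows one after another
flat <- as.vector(t(body))

# Suffix each header name with the index of the row the value came from
names(flat) <- paste(rep(hdr, times = nrow(body)),
                     rep(seq_len(nrow(body)), each = length(hdr)),
                     sep = ".")

# A 1-row data frame from the named vector, placed next to row 1 of df1
res_base <- cbind(df1[1, ], as.data.frame(t(flat), stringsAsFactors = FALSE))
res_base
```

Because everything passes through a character matrix, all the new columns come out as character; something like type.convert() could be applied afterwards if numeric and logical columns are needed.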