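I have a data frame that I am trying to collapse from multiple rows into a single row. The full dataset is quite large, but I am starting with a small subset. Here I want to turn 2 rows into 1, with the information from the second row placed after the information from the first row.
The original problem was that I had a column of data that needed to be "flattened" so that I could use the individual pieces of data. The column is in JSON format: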
"[{\"task\":\"T0\",\"task_label\":\"Did any birds visit the feeding platform or bird feeders?\",\"value\":\"**Yes**—but there were no displacements. Next, enter all of the birds you see at the feeders. \"},{\"task\":\"T1\",\"value\":[{\"choice\":\"EUROPEANSTARLING\",\"answers\":{\"WHATISTHELARGESTNUMBEROFINDIVIDUALSTHATYOUSAWSIMULTANEOUSLY\":\"4\"},\"filters\":{}},{\"choice\":\"MOURNINGDOVE\",\"answers\":{\"WHATISTHELARGESTNUMBEROFINDIVIDUALSTHATYOUSAWSIMULTANEOUSLY\":\"2\"},\"filters\":{}}]},{\"task\":\"T6\",\"task_label\":\"Is it actively precipitating (rain or snow)?\",\"value\":[\"Yes.\"]}]"
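So I used code written by another programmer to flatten that column by task. Next, I want to roll the result back up so that each classification has exactly one row of information.
So far I have merged tasks T0 and T4, but I still need to merge the result with another task, T5. To do that, I need to reduce the data from the T0/T4 merge to one row per classification. For now I am working with a small part of the data, and I have a table that basically looks like this: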
x <- data.frame("subject_ids" = c(19232716, 19232716), "classification_id" = c(120545061,120545061), "task_index.x" = c(1,1),
"task.x" = c("TO","TO"), "value" = c("Displacement","Displacement"), "task_index.y"=c(2,5), "task.y"= c("T4, T4","T4"),
"total.species"=c("2,2","1"), "choice" = c("MOURNINGDOVE, COMMONGRACKLE","MOURNINGDOVE"), "S_T"=c("Target,Target","Target,Source"))
But I want it to look like this:
y <- data.frame("subject_ids" = c(19232716), "classification_id" = c(120545061), "task_index.x" = c(1),
"task.x" = c("TO"), "value" = c("Displacement"), "task_index.y"=c(2), "task.y"= "T4, T4",
"total.species"=c("2,2"), "choice" = c("MOURNINGDOVE, COMMONGRACKLE"), "S_T"=c("Target,Target"),
"task_index.y"=c(5), "task.y"= "T4",
"total.species"=c("1"), "choice" = c("MOURNINGDOVE"), "S_T"=c("Target,Source"))
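
To make the target shape concrete, here is a minimal base R sketch of one way to get from `x` to a single wide row. It is only an illustration under a few assumptions: `classification_id` identifies the rows that belong together, the first five columns are constant within a classification, and the helper column `row_in_group` and the `_1`/`_2` suffixes are names made up for this example (the `y` above would instead get automatically deduplicated names from `data.frame`).

```r
# Same two-row example as `x` above, rebuilt so the sketch runs on its own.
x <- data.frame(
  subject_ids = c(19232716, 19232716),
  classification_id = c(120545061, 120545061),
  task_index.x = c(1, 1),
  task.x = c("TO", "TO"),
  value = c("Displacement", "Displacement"),
  task_index.y = c(2, 5),
  task.y = c("T4, T4", "T4"),
  total.species = c("2,2", "1"),
  choice = c("MOURNINGDOVE, COMMONGRACKLE", "MOURNINGDOVE"),
  S_T = c("Target,Target", "Target,Source"),
  stringsAsFactors = FALSE
)

# Columns that differ between the rows of one classification.
vary_cols <- c("task_index.y", "task.y", "total.species", "choice", "S_T")

# Hypothetical helper column: position of each row within its classification (1, 2, ...).
x$row_in_group <- ave(seq_len(nrow(x)), x$classification_id, FUN = seq_along)

# Spread the varying columns side by side, one block of columns per original row.
wide <- reshape(
  x,
  idvar = "classification_id",
  timevar = "row_in_group",
  v.names = vary_cols,
  direction = "wide",
  sep = "_"
)

print(names(wide))
```

The result has one row per `classification_id`: the constant columns come first, followed by the block of varying columns for the first original row (suffix `_1`) and then the block for the second (suffix `_2`). If some classifications have more rows than others, the missing blocks would be filled with `NA`.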