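I'm new to R. I've done a lot of searching and testing to find an elegant answer to this problem. I've tried reshape, t, melt, and so on. I've also been struggling with renaming the variables. I'm stuck with a data frame like the one below. The first row holds the time at which a question was shown (next to Question1), and the second row holds the time at which the answer was recorded.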
Time Logs
446.6204 Question1
452.7516 4
452.7516 Question2
458.1999 3
458.1999 Question3
460.2342 5
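I'd like to put everything on a single row and name the variables using the values in "Logs". Luckily for me, the pattern is constant, so working with slices might do the job.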
Respondent TimeQ1 Question1 TimeA1 TimeQ2 Question2 TimeA2 TimeQ3 Question3 TimeA3
Respondent1 446.6204 4 452.7516 452.7516 3 458.1999 458.1999 5 460.2342
Thanks for your help!
Answer 0 (score: 0)
I added a column for the respondent and added data for multiple respondents. Here is the sample data set:
DF <- structure(list(Respondent = structure(c(1L, 1L, 1L, 1L, 1L, 1L,
2L, 2L, 2L, 2L, 2L, 2L, 3L, 3L, 3L, 3L, 3L, 3L), .Label = c("Respondent 1",
"Respondent 2", "Respondent 3"), class = "factor"), Time = c(446.6204,
452.7516, 452.7516, 458.1999, 458.1999, 460.2342, 535.94448,
543.30192, 543.30192, 549.83988, 549.83988, 552.28104, 443.2204,
449.3516, 449.3516, 454.7999, 454.7999, 456.8342), Logs = structure(c(6L,
4L, 7L, 3L, 8L, 5L, 6L, 5L, 7L, 2L, 8L, 3L, 6L, 1L, 7L, 4L, 8L,
5L), .Label = c("1", "2", "3", "4", "5", "Question1", "Question2",
"Question3"), class = "factor")), .Names = c("Respondent", "Time",
"Logs"), row.names = c(NA, -18L), class = "data.frame")
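I don't think putting all the data on one row is your best option. If you have a lot of questions, that row will get very long.
Here is the format I suggested earlier (which I still think is better):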
newDF <- data.frame(respondent = DF$Respondent[grep("Question", DF$Logs)],
                    question = as.character(DF$Logs[grep("Question", DF$Logs)]),
                    questionTime = DF$Time[grep("Question", DF$Logs)],
                    responseValue = DF$Logs[-grep("Question", DF$Logs)],
                    responseTime = DF$Time[-grep("Question", DF$Logs)])
newDF
# respondent question questionTime responseValue responseTime
# Respondent 1 Question1 446.6204 4 452.7516
# Respondent 1 Question2 452.7516 3 458.1999
# Respondent 1 Question3 458.1999 5 460.2342
# Respondent 2 Question1 535.9445 5 543.3019
# Respondent 2 Question2 543.3019 2 549.8399
# Respondent 2 Question3 549.8399 3 552.2810
# Respondent 3 Question1 443.2204 1 449.3516
# Respondent 3 Question2 449.3516 4 454.7999
# Respondent 3 Question3 454.7999 5 456.8342
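With the additional respondent column, you can use something like dcast to get the data from the table above into the shape you're looking for. Here are the steps: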
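library(reshape2)  # dcast comes from the reshape2 package
# Question times, one column per question, renamed to TimeQ1, TimeQ2, ...
qTime <- dcast(newDF, respondent ~ question, value.var = "questionTime")
names(qTime)[2:length(names(qTime))] <- paste0("TimeQ", seq(1,length(names(qTime))-1,1) )
# Response values; these columns keep the names Question1, Question2, ...
rValue <- dcast(newDF, respondent ~ question, value.var = "responseValue")
# Response times, renamed to TimeA1, TimeA2, ...
rTime <- dcast(newDF, respondent ~ question, value.var = "responseTime")
names(rTime)[2:length(names(rTime))] <- paste0("TimeA", seq(1,length(names(rTime))-1,1) )
# Combine the three tables, dropping the repeated respondent columns
finalDF <- cbind(qTime, rValue[,-1], rTime[,-1])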
finalDF
# respondent TimeQ1 TimeQ2 TimeQ3 Question1 Question2 Question3 TimeA1 TimeA2 TimeA3
# Respondent 1 446.6204 452.7516 458.1999 4 3 5 452.7516 458.1999 460.2342
# Respondent 2 535.9445 543.3019 549.8399 5 2 3 543.3019 549.8399 552.2810
# Respondent 3 443.2204 449.3516 454.7999 1 4 5 449.3516 454.7999 456.8342
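If you really want the exact layout you asked for, you'll have to fiddle with the column order, but in general this should do it.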
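One possible way to handle the column order, as a minimal sketch rather than the only approach: it assumes the columns follow the TimeQ<i>, Question<i>, TimeA<i> naming produced above. So that it runs on its own, it rebuilds a small stand-in for finalDF from the printed output (the response values are plain numbers here, whereas dcast may return them as factor or character).

```r
# Stand-in for finalDF, with values copied from the printed output above
finalDF <- data.frame(
  respondent = c("Respondent 1", "Respondent 2", "Respondent 3"),
  TimeQ1 = c(446.6204, 535.9445, 443.2204),
  TimeQ2 = c(452.7516, 543.3019, 449.3516),
  TimeQ3 = c(458.1999, 549.8399, 454.7999),
  Question1 = c(4, 5, 1),
  Question2 = c(3, 2, 4),
  Question3 = c(5, 3, 5),
  TimeA1 = c(452.7516, 543.3019, 449.3516),
  TimeA2 = c(458.1999, 549.8399, 454.7999),
  TimeA3 = c(460.2342, 552.2810, 456.8342)
)

# Number of questions: every column except respondent, three columns per question
n <- (ncol(finalDF) - 1) / 3

# rbind() stacks the three name vectors as rows; as.vector() then reads the
# matrix column by column, which interleaves them question by question
ordered <- as.vector(rbind(paste0("TimeQ", seq_len(n)),
                           paste0("Question", seq_len(n)),
                           paste0("TimeA", seq_len(n))))

finalDF <- finalDF[, c("respondent", ordered)]
names(finalDF)
```

Because the order is built from the column names rather than from fixed positions, the same lines should keep working if more questions are added, as long as the naming pattern stays the same.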