Reshaping a messy data frame

Time: 2013-01-10 16:30:09

Tags: r reshape

*Edited in response to comments

I have a dataset that I am trying to prepare for analysis:

raw<-data.frame(
  name=c("Place 1", "Place 2", "Place 3", "Place 4"),
  x.1.Jan.12=c(1, NA, 0.5, NA),
  Jan.time=c("0900", NA, "0930", NA),
  x.15.Jan.12=c(NA, 0.7, NA, NA),
  Jan.time=c(NA, "1030", NA, NA),
  x.3.Feb.12=c(0.8, 0.6, 0.4, NA),
  Feb.time=c("0715", "0800", "0830", NA),
  x.8.Feb.12=c(NA, NA, 0.65, 0.33),
  Feb.time=c(NA, NA, "?", "1123")
  )

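The data should be very simple: the place where a result was recorded, the date of the result, and the time it was collected. As you can see, the date has been used to name the variable containing the results. Each 'time' variable relates to the column immediately before it, so the first 'Jan.time' variable holds the times for the results in 'x.1.Jan.12'.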

I want to restructure the data into four variables: name, date, time, and value. I'm fairly sure reshape2 can do this, and I have already melted the data:

library(reshape2)
mDat <- melt(raw, id.vars = "name")

I can't work out the next step, possibly because of the weird variable names.
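Part of the weirdness probably comes from data.frame() itself: with its default check.names = TRUE, repeated column names are made unique, so the second 'Jan.time' is not stored under that name. A minimal sketch of that behaviour, where `demo` and `demo_dup` are made-up objects used only for illustration:

```r
# Two columns are deliberately given the same name, as in `raw`
demo <- data.frame(name = "Place 1", Jan.time = "0900", Jan.time = "1030")
names(demo)      # the second duplicate gets a ".1" suffix

# With check.names = FALSE the duplicate name is kept as typed
demo_dup <- data.frame(name = "Place 1", Jan.time = "0900", Jan.time = "1030",
                       check.names = FALSE)
names(demo_dup)
```

So after melting, the time columns would show up as 'Jan.time', 'Jan.time.1', 'Feb.time' and 'Feb.time.1' rather than as repeated names.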

The result I want is:

outData<-data.frame(
  name=c("Place 1", "Place 2", "Place 3", "Place 4", "Place 1", "Place 2", "Place 3", "Place 4", "Place 1", "Place 2", "Place 3", "Place 4", "Place 1", "Place 2", "Place 3", "Place 4"),
  date=c("1-Jan-12", "1-Jan-12", "1-Jan-12", "1-Jan-12", "15-Jan-12", "15-Jan-12", "15-Jan-12", "15-Jan-12", "3-Feb-12", "3-Feb-12", "3-Feb-12", "3-Feb-12", "8-Feb-12", "8-Feb-12", "8-Feb-12", "8-Feb-12"),
  value=c(1, NA, 0.5, NA, NA, 0.7, NA, NA, 0.8, 0.6, 0.4, NA, NA, NA, 0.65, 0.33),
  time=c("0900", NA, "0930", NA, NA, "1030", NA, NA, "0715", "0800", "0830", NA, NA, NA, "?", "1123")
)

2 answers:

Answer 0: (score: 1)

One option is to use melt() from "reshape2" on different subsets of the data.frame. The subsets can be extracted with grep().

library(reshape2)
temp <- cbind(
    setNames(melt(raw[c(1, grep("time", names(raw)))], id.vars="name"), 
             c("name", "mon.time", "time")),
    setNames(melt(raw[grep("time", names(raw), invert = TRUE)], id.vars="name"),
             c("name", "date", "result")))
temp[, c("name", "result", "time", "date")]
#       name result time        date
# 1  Place 1   1.00 0900  x.1.Jan.12
# 2  Place 2     NA <NA>  x.1.Jan.12
# 3  Place 3   0.50 0930  x.1.Jan.12
# 4  Place 4     NA <NA>  x.1.Jan.12
# 5  Place 1     NA <NA> x.15.Jan.12
# 6  Place 2   0.70 1030 x.15.Jan.12
# 7  Place 3     NA <NA> x.15.Jan.12
# 8  Place 4     NA <NA> x.15.Jan.12
# 9  Place 1   0.80 0715  x.3.Feb.12
# 10 Place 2   0.60 0800  x.3.Feb.12
# 11 Place 3   0.40 0830  x.3.Feb.12
# 12 Place 4     NA <NA>  x.3.Feb.12
# 13 Place 1     NA <NA>  x.8.Feb.12
# 14 Place 2     NA <NA>  x.8.Feb.12
# 15 Place 3   0.65    ?  x.8.Feb.12
# 16 Place 4   0.33 1123  x.8.Feb.12
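The date column above still carries the original column names ('x.1.Jan.12') rather than the '1-Jan-12' form in the desired output. A minimal sketch of one way to tidy it with base R string functions; `date_raw` and `date_clean` are illustrative names, and the same calls could presumably be applied to as.character(temp$date):

```r
# Values as they appear in the melted `date` column
date_raw <- c("x.1.Jan.12", "x.15.Jan.12", "x.3.Feb.12", "x.8.Feb.12")

# Drop the leading "x." and then turn the remaining dots into hyphens
date_clean <- gsub(".", "-", sub("^x\\.", "", date_raw), fixed = TRUE)
date_clean

# Optional: parse to Date. "%b" matches month abbreviations in the
# current locale, so this assumes an English locale.
as.Date(date_clean, format = "%d-%b-%y")
```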

Answer 1: (score: 0)

A new day often helps. I have managed to come up with a non-reshape solution, but it uses a horrible for loop:

subList<-list()
for(i in seq(2,8,2)){
  temp<-raw[c(1, i, i+1)]
  temp$date<-rep(names(temp)[2], nrow(temp))
  names(temp)<-c("name", "result", "time", "date")
  subList[[i/2]]<-temp
}

solution1<-do.call("rbind", subList)
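If the explicit loop is the main objection, the same idea can be sketched with lapply(), which avoids pre-allocating a list and the i/2 indexing. This is only one possible variant: `value_cols`, `pieces` and `solution2` are names introduced here, and stringsAsFactors = FALSE is an added assumption so that the time column stays character.

```r
raw <- data.frame(
  name = c("Place 1", "Place 2", "Place 3", "Place 4"),
  x.1.Jan.12 = c(1, NA, 0.5, NA),
  Jan.time = c("0900", NA, "0930", NA),
  x.15.Jan.12 = c(NA, 0.7, NA, NA),
  Jan.time = c(NA, "1030", NA, NA),
  x.3.Feb.12 = c(0.8, 0.6, 0.4, NA),
  Feb.time = c("0715", "0800", "0830", NA),
  x.8.Feb.12 = c(NA, NA, 0.65, 0.33),
  Feb.time = c(NA, NA, "?", "1123"),
  stringsAsFactors = FALSE
)

# Result columns sit at even positions; each is followed by its time column
value_cols <- seq(2, ncol(raw), by = 2)

# Build one four-row piece per result/time pair, addressing columns by position
pieces <- lapply(value_cols, function(i) {
  data.frame(name = raw$name,
             result = raw[[i]],
             time = raw[[i + 1]],
             date = names(raw)[i],
             stringsAsFactors = FALSE)
})

# Stack the pieces into a single long data frame
solution2 <- do.call(rbind, pieces)
```

Like the loop, this relies on the columns being laid out as strict result/time pairs after the name column.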