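Complex reshaping in R: strings, numbers, and dates

Time: 2018-08-28 15:18:35

Tags: r tidyr reshape2

I have a wide dataset in which each row (one person) holds up to three observations, one for each of up to three different dates. Each observation consists of a date, a description, and a number of minutes. A person can submit as many observations as they like, so the same person may appear in several rows, each with further observations.

The test data is here: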

library(RCurl)
# Download the CSV as text, then parse it into a data frame
fwt <- getURL("https://raw.githubusercontent.com/bac3917/Cauldron/master/fwt.csv")
fwt <- read.csv(text = fwt)

Convert the columns to the correct types:

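library(lubridate)
fwt$date1 <- as.Date(fwt$date1, format = '%m/%d/%Y')
fwt$date2 <- as.Date(fwt$date2, format = '%m/%d/%Y')
fwt$date3 <- as.Date(fwt$date3, format = '%m/%d/%Y')
# condense dataset; 3 sets of columns into 1
cols <- names(fwt) %in% c("naecy1_1", "naecy1_2", "naecy1_3", "naecy1_4", "naecy1_5", "naecy1_6",
                          "naecy2_1", "naecy2_2", "naecy2_3", "naecy2_4", "naecy2_5", "naecy2_6",
                          "naecy3_1", "naecy3_2", "naecy3_3", "naecy3_4", "naecy3_5", "naecy3_6")

# convert all minute columns to numeric; as.character() first so that
# factor columns are parsed by value rather than by level code
fwt[cols] <- lapply(fwt[cols], function(x) as.numeric(as.character(x)))
# replace missing minutes with 0 (is.na(cols) is never TRUE, so it selected nothing)
fwt[cols][is.na(fwt[cols])] <- 0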
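
The conversion step can be tried on its own with a small made-up data frame. This is only a sketch: the data frame `toy` and its values are hypothetical and not taken from fwt.csv, and it assumes the minute columns may have been read in as text or factors.

```r
# Hypothetical toy data standing in for two of the naecy columns
toy <- data.frame(
  name     = c("A", "B", "C"),
  naecy1_1 = c("10", "", "30"),
  naecy1_2 = c("5", "15", NA),
  stringsAsFactors = TRUE
)

# Logical index of the columns to convert
num_cols <- grepl("^naecy", names(toy))

# Go through as.character() so factor values are parsed, not their level codes
toy[num_cols] <- lapply(toy[num_cols], function(x) as.numeric(as.character(x)))

# Replace missing values in those columns only
toy[num_cols][is.na(toy[num_cols])] <- 0
```

Indexing with `is.na(toy[num_cols])` tests the cell values, so empty and missing entries in the selected columns become 0 while the `name` column is left alone.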

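Essentially, the three sets of date/description/minutes columns need to be stacked into long format. I would like the reshaped data to look like this: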

Name   Date  NAECY1  NAECY2  NAECY3  NAECY4  NAECY5  NAECY6

I have tried reshape2 and tidyr, but could not work it out. Any ideas?

Thanks...

1 Answer:

Answer 0: (score: 1)

Here is a quick solution:

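library(magrittr)  # provides the %>% pipe used below

# Column-name templates; %d is replaced by the observation set number (1, 2 or 3)
cols <- c("date%d", "descr%d", "naecy%d_1", "naecy%d_2", "naecy%d_3", "naecy%d_4", "naecy%d_5", "naecy%d_6")
cols_renamed <- c("Name", "Date", "Descr", "NAECY1", "NAECY2", "NAECY3", "NAECY4", "NAECY5", "NAECY6")

# Pull out each set of columns, give it common names, then stack the three sets
new_fwt <- lapply(1:3, function(i) {
  df <- fwt[, c("name", sprintf(cols, i))]
  colnames(df) <- cols_renamed
  df
}) %>% do.call(rbind, .)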
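
To see what the stacking does, here is a minimal base R sketch on a made-up wide table. The data frame `wide` and its values are hypothetical, and it uses only two observation sets and one minute column to stay short. Dropping rows with a missing date is an extra step that the answer above does not perform; it assumes an empty set shows up as an NA date.

```r
# Hypothetical wide data: two observation sets per person (not the real fwt.csv)
wide <- data.frame(
  name     = c("A", "B"),
  date1    = as.Date(c("2018-01-01", "2018-01-02")),
  naecy1_1 = c(10, 20),
  date2    = as.Date(c("2018-02-01", NA)),
  naecy2_1 = c(30, NA)
)

# Column-name templates; %d is the observation set number
templates <- c("date%d", "naecy%d_1")

# Extract each set under common names, then stack the sets row-wise
long <- do.call(rbind, lapply(1:2, function(i) {
  block <- wide[, c("name", sprintf(templates, i))]
  names(block) <- c("Name", "Date", "NAECY1")
  block
}))

# Drop rows for sets a person did not fill in (assumed to have an NA date)
long <- long[!is.na(long$Date), ]
```

With the real data the loop would run over 1:3 and the templates would also cover the description and all six minute columns.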