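How to copy attributes from one data frame to another, or reassign attributes to a newly reshaped data frame - R

Time: 2019-03-25 20:30:09

Tags: r attr purrr reshape2 tidyselect

After reshaping my data, I would like to reassign the attributes that were dropped. The same approach might also apply to copying attributes from one data frame to another, or to putting attributes back in place after a mutate and similar operations.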

 library(reshape2)

 df <- data.frame(id = c(1,2,3,4,5), 
                  time = c(11, 22,33,44,55),
                  c  = c(1,2,3,5,5),
                  d = c(4,2,5,4,NA))

attr(df$id,"label")<- "label"
attr(df$time,"label")<- "label2"
attr(df$c,"label")<- "something here"
attr(df$d,"label")<- "count of something"
str(df)

'data.frame':   5 obs. of  4 variables:
 $ id  : num  1 2 3 4 5
  ..- attr(*, "label")= chr "label"
 $ time: num  11 22 33 44 55
  ..- attr(*, "label")= chr "label2"
 $ c   : num  1 2 3 5 5
  ..- attr(*, "label")= chr "something here"
 $ d   : num  4 2 5 4 NA
  ..- attr(*, "label")= chr "count of something"

Reshaping from long to wide:

dfwide<- recast(df,id~variable +time, 
            id.var = c("id","time"))

This gives the usual message about dropped attributes:

   Warning message:
     attributes are not identical across measure variables; they will be dropped 

 str(dfwide)
'data.frame':   5 obs. of  11 variables:
 $ id  : num  1 2 3 4 5
 $ c_11: num  1 NA NA NA NA
 $ c_22: num  NA 2 NA NA NA
 $ c_33: num  NA NA 3 NA NA
 $ c_44: num  NA NA NA 5 NA
 $ c_55: num  NA NA NA NA 5
 $ d_11: num  4 NA NA NA NA
 $ d_22: num  NA 2 NA NA NA
 $ d_33: num  NA NA 5 NA NA
 $ d_44: num  NA NA NA 4 NA
 $ d_55: num  NA NA NA NA NA

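Using mostattributes I can copy attributes between data frames, but with many column names to iterate over, I can't figure out how to map this efficiently instead of assigning the attributes one column at a time.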

 mostattributes(dfwide$c_11)<-attributes(df$c)
 mostattributes(dfwide$c_22)<-attributes(df$c)
 > str(dfwide)
 'data.frame':  5 obs. of  11 variables:
  $ id  : num  1 2 3 4 5
  $ c_11: num  1 NA NA NA NA
  ..- attr(*, "label")= chr "something here"
  $ c_22: num  NA 2 NA NA NA
  ..- attr(*, "label")= chr "something here"
  $ c_33: num  NA NA 3 NA NA

I tried to automate this, but it failed (all the c columns should share one label, and all the d columns should share another):

library(tibble)  # enframe()
library(dplyr)   # slice(), pull(), %>%

#extract arguments
dlist<-enframe(names(df))%>%
   slice(-1,-2)%>%
   pull(., value)
 dlist

 dlistw<-enframe(names(dfwide))%>%
  slice(-1)%>%
  pull(., value)
 dlistw

#function
mostatt<- function(var1, var2) {
  mostattributes(dfwide[[var1]])<<-attributes(df[[var2]])
}

mapply(mostatt,dlistw,dlist)
str(dfwide)

'data.frame':   5 obs. of  11 variables:
 $ id  : num  1 2 3 4 5
 $ c_11: num  1 NA NA NA NA
  ..- attr(*, "label")= chr "something here"
 $ c_22: num  NA 2 NA NA NA
  ..- attr(*, "label")= chr "count of something"
 $ c_33: num  NA NA 3 NA NA
  ..- attr(*, "label")= chr "something here"
 $ c_44: num  NA NA NA 5 NA
  ..- attr(*, "label")= chr "count of something"
 $ c_55: num  NA NA NA NA 5
  ..- attr(*, "label")= chr "something here"
 $ d_11: num  4 NA NA NA NA
  ..- attr(*, "label")= chr "count of something"
 $ d_22: num  NA 2 NA NA NA
  ..- attr(*, "label")= chr "something here"
 $ d_33: num  NA NA 5 NA NA
  ..- attr(*, "label")= chr "count of something"
 $ d_44: num  NA NA NA 4 NA
  ..- attr(*, "label")= chr "something here"
 $ d_55: num  NA NA NA NA NA
  ..- attr(*, "label")= chr "count of something"
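The alternating labels in the output above are what you would expect if mapply() recycles the shorter vector: dlist holds 2 names and dlistw holds 10, so the sources get paired as c, d, c, d, and so on. The sketch below illustrates this with the column names typed in by hand (so it runs without reshape2) and shows one possible way to build a source vector of matching length. The names recycled and matched are illustrative only, and the sub() pattern assumes the wide columns are named <variable>_<time>.

```r
# Column names as they appear in the question
dlist  <- c("c", "d")
dlistw <- c("c_11", "c_22", "c_33", "c_44", "c_55",
            "d_11", "d_22", "d_33", "d_44", "d_55")

# mapply() recycles the shorter vector, so the source names alternate
recycled <- mapply(function(wide, src) src, dlistw, dlist)

# Derive each source name from the wide name instead:
# drop the trailing "_<time>" part (assumed naming scheme), "c_11" -> "c"
matched <- sub("_[^_]*$", "", dlistw)

# Side-by-side view of the wrong and the intended pairing
print(data.frame(wide = dlistw,
                 recycled = unname(recycled),
                 matched = matched))
```

With the question's objects, passing matched as the second vector, as in mapply(mostatt, dlistw, matched), should then pair every wide column with its own source column.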

I thought tidyselect's starts_with might be worth a try, but I'm not sure how to incorporate it. Any suggestions would be greatly appreciated. Thanks!

1 Answer:

Answer 0 (score: 1)

Here is one option:

for(i in (setdiff(colnames(df), "id"))){
  for(x in colnames(dfwide)[(grepl(i, colnames(dfwide)))])
      mostattributes(dfwide[[x]]) <- attributes(df[[i]])
}
mostattributes(dfwide$id) <- attributes(df$id) 

Because the name d is contained in id, I need to reassign the attributes of id at the end. This becomes simpler if you rename d to e.
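Another way to avoid the collision with id, without renaming anything, is to match on a prefix rather than anywhere in the name, which is the same idea as starts_with from the question. This is a minimal sketch and not the answerer's code: it uses base R's startsWith(), small hand-built stand-ins for df and dfwide, and assumes the wide columns are named <variable>_<time>.

```r
# Small stand-ins for df and dfwide from the question (no reshape2 needed)
df <- data.frame(id = c(1, 2), time = c(11, 22), c = c(1, 2), d = c(4, 2))
attr(df$id, "label") <- "label"
attr(df$c, "label")  <- "something here"
attr(df$d, "label")  <- "count of something"

dfwide <- data.frame(id = c(1, 2),
                     c_11 = c(1, NA), c_22 = c(NA, 2),
                     d_11 = c(4, NA), d_22 = c(NA, 2))

# Unanchored matching is what catches "id" when searching for "d"
grepl("d", "id")

for (i in setdiff(colnames(df), "id")) {
  # Prefix match on "<variable>_" (assumed naming scheme), so "d_" cannot hit "id"
  targets <- colnames(dfwide)[startsWith(colnames(dfwide), paste0(i, "_"))]
  for (x in targets) {
    mostattributes(dfwide[[x]]) <- attributes(df[[i]])
  }
}

# id keeps the same name in both data frames, so copy its attributes directly
mostattributes(dfwide$id) <- attributes(df$id)

str(dfwide)
```

Because time has no columns of the form time_<...> in the wide data, its loop iteration simply finds no targets and does nothing.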