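After reshaping my data, I want to reassign the attributes that were dropped. The same need may come up when copying attributes from one data frame to another, or when restoring attributes after a mutate or similar step that drops them.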
library(reshape2)  # recast()
library(tibble)    # enframe()
library(dplyr)     # %>%, slice(), pull()
df <- data.frame(id = c(1,2,3,4,5),
time = c(11, 22,33,44,55),
c = c(1,2,3,5,5),
d = c(4,2,5,4,NA))
attr(df$id,"label")<- "label"
attr(df$time,"label")<- "label2"
attr(df$c,"label")<- "something here"
attr(df$d,"label")<- "count of something"
str(df)
'data.frame': 5 obs. of 4 variables:
$ id : num 1 2 3 4 5
..- attr(*, "label")= chr "label"
$ time: num 11 22 33 44 55
..- attr(*, "label")= chr "label2"
$ c : num 1 2 3 5 5
..- attr(*, "label")= chr "something here"
$ d : num 4 2 5 4 NA
..- attr(*, "label")= chr "count of something"
Reshaping from long to wide:
dfwide<- recast(df,id~variable +time,
id.var = c("id","time"))
This produces the usual attribute-loss warning:
Warning message:
attributes are not identical across measure variables; they will be dropped
str(dfwide)
'data.frame': 5 obs. of 11 variables:
$ id : num 1 2 3 4 5
$ c_11: num 1 NA NA NA NA
$ c_22: num NA 2 NA NA NA
$ c_33: num NA NA 3 NA NA
$ c_44: num NA NA NA 5 NA
$ c_55: num NA NA NA NA 5
$ d_11: num 4 NA NA NA NA
$ d_22: num NA 2 NA NA NA
$ d_33: num NA NA 5 NA NA
$ d_44: num NA NA NA 4 NA
$ d_55: num NA NA NA NA NA
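Using mostattributes() I can copy attributes between data frames, but with many column names to iterate over I can't work out how to map them efficiently instead of assigning them one by one.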
mostattributes(dfwide$c_11)<-attributes(df$c)
mostattributes(dfwide$c_22)<-attributes(df$c)
str(dfwide)
'data.frame': 5 obs. of 11 variables:
$ id : num 1 2 3 4 5
$ c_11: num 1 NA NA NA NA
..- attr(*, "label")= chr "something here"
$ c_22: num NA 2 NA NA NA
..- attr(*, "label")= chr "something here"
$ c_33: num NA NA 3 NA NA
I tried to automate this but failed (every c column should get the same label, and every d column should get the same label):
#extract arguments
dlist<-enframe(names(df))%>%
slice(-1,-2)%>%
pull(., value)
dlist
dlistw<-enframe(names(dfwide))%>%
slice(-1)%>%
pull(., value)
dlistw
#function
mostatt<- function(var1, var2) {
mostattributes(dfwide[[var1]])<<-attributes(df[[var2]])
}
mapply(mostatt,dlistw,dlist)
str(dfwide)
'data.frame': 5 obs. of 11 variables:
$ id : num 1 2 3 4 5
$ c_11: num 1 NA NA NA NA
..- attr(*, "label")= chr "something here"
$ c_22: num NA 2 NA NA NA
..- attr(*, "label")= chr "count of something"
$ c_33: num NA NA 3 NA NA
..- attr(*, "label")= chr "something here"
$ c_44: num NA NA NA 5 NA
..- attr(*, "label")= chr "count of something"
$ c_55: num NA NA NA NA 5
..- attr(*, "label")= chr "something here"
$ d_11: num 4 NA NA NA NA
..- attr(*, "label")= chr "count of something"
$ d_22: num NA 2 NA NA NA
..- attr(*, "label")= chr "something here"
$ d_33: num NA NA 5 NA NA
..- attr(*, "label")= chr "count of something"
$ d_44: num NA NA NA 4 NA
..- attr(*, "label")= chr "something here"
$ d_55: num NA NA NA NA NA
..- attr(*, "label")= chr "count of something"
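The alternating labels above are consistent with how mapply() recycles its arguments: dlist holds two names and dlistw holds ten, so the shorter vector is reused in the order c, d, c, d and so on. A minimal sketch of that pairing, assuming the two vectors contain exactly the names shown in the output above:

```r
# Names as they appear in the question (typed out here so the sketch is self-contained)
dlist  <- c("c", "d")
dlistw <- c("c_11", "c_22", "c_33", "c_44", "c_55",
            "d_11", "d_22", "d_33", "d_44", "d_55")

# Show which source column mapply() pairs with each wide column.
# dlist is recycled, so the pairing depends on position, not on the name prefix.
pairs <- mapply(function(wide, src) paste(wide, "<-", src), dlistw, dlist)
unname(pairs)
```

The pairing is positional, which is why c_22 receives the label of d and d_11 ends up correct only by coincidence.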
I think tidyselect's starts_with() might be worth a try, but I'm not sure how to incorporate it. Any suggestions would be much appreciated. Thanks!
Answer 0 (score: 1)
Here is one option:
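# copy each source column's attributes to every wide column whose name matches it
for (i in setdiff(colnames(df), "id")) {
  for (x in colnames(dfwide)[grepl(i, colnames(dfwide))]) {
    mostattributes(dfwide[[x]]) <- attributes(df[[i]])
  }
}
# "d" also matches "id", so restore id's own attributes last
mostattributes(dfwide$id) <- attributes(df$id)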
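Because the pattern d also matches the column name id, the loop gives id the attributes of d, so I need to rewrite id at the end.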
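One way to avoid the collision with id is to derive the source column from the part of each wide name before the separator, instead of using a substring match. The following is only a sketch: copy_attrs_by_prefix is a hypothetical helper, not part of any package, and it assumes the wide names follow the variable_time scheme produced by recast() above and that the original variable names contain no underscore. A small stand-in for dfwide is built by hand so the example runs without reshape2.

```r
# Source data with labels, a shortened version of df from the question
df <- data.frame(id = 1:3, c = c(1, 2, 3), d = c(4, 2, NA))
attr(df$id, "label") <- "label"
attr(df$c,  "label") <- "something here"
attr(df$d,  "label") <- "count of something"

# Hand-built stand-in for the reshaped data, same naming scheme: <variable>_<time>
dfwide <- data.frame(id = 1:3,
                     c_11 = c(1, NA, NA), c_22 = c(NA, 2, NA),
                     d_11 = c(4, NA, NA), d_22 = c(NA, 2, NA))

# Hypothetical helper: for every wide column, strip everything from the first
# separator onward to get the source name, then copy that column's attributes.
copy_attrs_by_prefix <- function(wide, long, sep = "_") {
  for (x in names(wide)) {
    src <- sub(paste0(sep, ".*$"), "", x)  # "c_11" -> "c"; "id" stays "id"
    if (src %in% names(long)) {
      mostattributes(wide[[x]]) <- attributes(long[[src]])
    }
  }
  wide
}

dfwide <- copy_attrs_by_prefix(dfwide, df)
str(dfwide)
```

Because the match is on the exact prefix, id is paired only with id, so no separate fix-up step is needed afterwards.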
If you change the {{1}} of d to e, the loop becomes simpler.