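I have a data.frame named RawHM. For each row, I want to evaluate the sets of columns defined by the entries of the list AllList and check whether the row has enough non-NA observations (at least 2) within a set to keep that row's values for the set. If not, the row's entries for that column set should be replaced with NA.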
AllList:
> dput(AllList)
structure(list(EGI = c("OO", "PP", "QQ"), Ref = c("RR", "SS",
"TT")), .Names = c("EGI", "Ref"))
RawHM:
> dput(head(RawHM,10))
structure(list(OO = c(2.26128283268031, NA, NA, NA, 3.1189673217816,
2.68131772865193, 1.50542478607416, NA, NA, NA), PP = c(NA, 2.86537733048028,
2.02969026818987, NA, 2.54112005565494, 3.01623803266379, 1.73909499803785,
2.49712237003491, NA, 1.67635525591635), QQ = c(NA, NA, 1.91968060122123,
NA, NA, 2.63463138625395, NA, NA, NA, NA), RR = c(NA, NA, NA,
NA, NA, 1.01488582084669, 1.01944283768403, NA, 1.06329113924051,
NA), SS = c(0.950310559006211, 0.924124326404927, 1.07886334610473,
0.951793999929161, 0.847931452310888, 0.879173290937997, 0.882126364182319,
NA, NA, 0.713085668766746), TT = c(NA, NA, 1.09812749411644,
NA, 0.9994646420402, 1.21090641120118, 1.25090285854196, NA,
NA, NA)), .Names = c("OO", "PP", "QQ", "RR", "SS", "TT"), row.names = c(1L,
2L, 15L, 16L, 23L, 24L, 25L, 30L, 36L, 40L), class = "data.frame")
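I tried writing a function:
func <- function(x) {
  unlist(lapply(AllList, function(y) {
    if (length(na.omit(x[unlist(y)])) < 2) {
      rep(NA, length(unlist(y)))
    } else {
      x[unlist(y)]
    }
  }))
}
Then:
output <- t(apply(RawHM, 1, func))
This works in principle, but it does not preserve the colnames, which I would like to be the same as in the RawHM data.frame. I would rather avoid renaming the columns afterwards.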
> dput(head(output,10))
structure(c(NA, NA, NA, NA, 3.1189673217816, 2.68131772865193,
1.50542478607416, NA, NA, NA, NA, NA, 2.02969026818987, NA, 2.54112005565494,
3.01623803266379, 1.73909499803785, NA, NA, NA, NA, NA, 1.91968060122123,
NA, NA, 2.63463138625395, NA, NA, NA, NA, NA, NA, NA, NA, NA,
1.01488582084669, 1.01944283768403, NA, NA, NA, NA, NA, 1.07886334610473,
NA, 0.847931452310888, 0.879173290937997, 0.882126364182319,
NA, NA, NA, NA, NA, 1.09812749411644, NA, 0.9994646420402, 1.21090641120118,
1.25090285854196, NA, NA, NA), .Dim = c(10L, 6L), .Dimnames = list(
c("1", "2", "15", "16", "23", "24", "25", "30", "36", "40"
), NULL))
Any help is very welcome :-) Regards, Mads
Answer 0 (score: 0)
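func is a very odd function... funky, even!
When you use apply, your data is converted from a data.frame to a matrix, so func receives each row as a named vector. Your function behaves differently when it is given a data.frame row rather than a matrix row: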
func(RawHM[1,])
   EGI.OO    EGI.PP    EGI.QQ    Ref.RR    Ref.SS    Ref.TT
2.2612828        NA        NA        NA 0.9503106        NA
func(as.matrix(RawHM)[1,])
EGI1 EGI2 EGI3 Ref1 Ref2 Ref3
  NA   NA   NA   NA   NA   NA
Note that you get different results and different names!
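One likely reason for the difference can be sketched with a small example. The names row_df and row_vec are illustrative only, and the values are row 1 of RawHM, rounded. On a data.frame, na.omit() drops whole rows and length() counts columns, so the `< 2` test never fires. On a vector, the same expression counts the non-NA values.

```r
# One row of the EGI column set, once as a data.frame and once as a named vector.
row_df  <- data.frame(OO = 2.26, PP = NA_real_, QQ = NA_real_)
row_vec <- unlist(row_df)

# data.frame: na.omit() removes the row, length() still reports 3 columns.
length(na.omit(row_df))
# vector: na.omit() removes the NA elements, leaving 1 value.
length(na.omit(row_vec))

# Counting non-NA values directly gives the same answer (1) for both inputs.
sum(!is.na(row_df))
sum(!is.na(row_vec))
```

Counting with sum(!is.na(...)) avoids depending on whether the function receives a data.frame or a vector.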
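In any case, the names problem stems from the fact that the NAs you generate have no names, so the rows return inconsistently named results and apply drops the names. To fix that, here is a modification: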
func2 <- function(x) {
  unlist(lapply(AllList, function(y) {
    if (length(na.omit(x[unlist(y)])) < 2) {
      sapply(y, function(z) NA)  # named NAs, so every row returns the same names
    } else {
      x[unlist(y)]
    }
  }))
}
t(apply(RawHM, 1, func2))
     EGI.OO   EGI.PP   EGI.QQ   Ref.RR    Ref.SS    Ref.TT
1        NA       NA       NA       NA        NA        NA
2        NA       NA       NA       NA        NA        NA
15       NA 2.029690 1.919681       NA 1.0788633 1.0981275
16       NA       NA       NA       NA        NA        NA
23 3.118967 2.541120       NA       NA 0.8479315 0.9994646
24 2.681318 3.016238 2.634631 1.014886 0.8791733 1.2109064
25 1.505425 1.739095       NA 1.019443 0.8821264 1.2509029
30       NA       NA       NA       NA        NA        NA
36       NA       NA       NA       NA        NA        NA
40       NA       NA       NA       NA        NA        NA
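The column names in this output carry the list name as a prefix (EGI.OO rather than OO), so they still differ from those of RawHM. If the original names and the data.frame class should be kept, one possible alternative is to skip apply and blank out each column set in place. The following is only a sketch: mask_sparse_sets, min_obs and RawHM_small are hypothetical names introduced here, and RawHM_small holds rows 1, 15 and 24 of RawHM with rounded values.

```r
AllList <- list(EGI = c("OO", "PP", "QQ"), Ref = c("RR", "SS", "TT"))

# Abbreviated copy of RawHM (rows 1, 15 and 24, values rounded).
RawHM_small <- data.frame(
  OO = c(2.26, NA, 2.68),
  PP = c(NA, 2.03, 3.02),
  QQ = c(NA, 1.92, 2.63),
  RR = c(NA, NA, 1.01),
  SS = c(0.95, 1.08, 0.88),
  TT = c(NA, 1.10, 1.21),
  row.names = c("1", "15", "24")
)

# For each column set, find the rows with fewer than min_obs non-NA values
# and set those rows to NA within that set only.
mask_sparse_sets <- function(df, sets, min_obs = 2) {
  for (cols in sets) {
    too_few <- rowSums(!is.na(df[cols])) < min_obs
    df[too_few, cols] <- NA
  }
  df
}

result <- mask_sparse_sets(RawHM_small, AllList)
colnames(result)  # same column names as RawHM_small
```

Because the data.frame is modified by assignment rather than rebuilt from per-row results, no renaming is needed afterwards. Row 1 has a single observation in each set and becomes all NA, while rows 15 and 24 are left unchanged.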