I would like to take the following data frame:
  loc1        loc2          loc3
1 <NA>        Subcortical   Basal ganglia
2 Cortical    Subcortical   Basal ganglia
3 <NA>        Subcortical   <NA>
and shift the contents to the left, so that the NA values end up at the end of each row:
  loc1          loc2            loc3
1 Subcortical   Basal ganglia   <NA>
2 Cortical      Subcortical     Basal ganglia
3 Subcortical   <NA>            <NA>
I have tried ifelse statements such as the following, but this gets too complicated:
test$loc1 <- ifelse(is.na(test$loc1) & !is.na(test$loc2), "Subcortical", "Cortical")
test$loc2 <- ifelse(test$loc1=="Subcortical", NA, "Subcortical")
test$loc2 <- ifelse(is.na(test$loc2) & !is.na(test$loc3), "Basal ganglia", "Subcortical")
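I have also tried unite and unite_ from the tidyr package, but I could not find an elegant answer. In my research I was unable to find this answered anywhere, but if I missed it, I am happy to be pointed to an existing answer. Thanks, all.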
Answer 0: (score: 1)
We can use apply with MARGIN=1 (as @David Arenburg showed in the comments) to loop over the rows:
# t() is needed because apply() over rows returns each row's result as a column
df1[] <- t(apply(df1, 1, function(x) x[order(is.na(x))]))
or
df1[] <- t( apply(df1, 1, function(x) c(x[!is.na(x)], x[is.na(x)])))
df1
# loc1 loc2 loc3
#1 Subcortical Basal ganglia <NA>
#2 Cortical Subcortical Basal ganglia
#3 Subcortical <NA> <NA>
Data:
df1 <- structure(list(loc1 = c(NA, "Cortical", NA),
    loc2 = c("Subcortical", "Subcortical", "Subcortical"),
    loc3 = c("Basal ganglia", "Basal ganglia", NA)),
    .Names = c("loc1", "loc2", "loc3"),
    class = "data.frame", row.names = c("1", "2", "3"))
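To see why the order(is.na(x)) trick works, here is a small sketch on a single row and then on the whole data frame. It uses only base R. The names x, idx, shifted, m and res are illustrative and not part of the answer above.

```r
# One row of the example data as a named character vector (illustrative)
x <- c(loc1 = NA, loc2 = "Subcortical", loc3 = "Basal ganglia")

# is.na() flags the missing slots; order() sorts FALSE before TRUE and
# leaves ties in their original order, so non-NA values keep their sequence
idx <- order(is.na(x))      # positions of non-NA values first, then NA
shifted <- unname(x[idx])   # non-NA values moved left, NA moved to the end

# apply() over rows returns each row's result as a COLUMN of a matrix,
# so the matrix has to be transposed before it is written back
df1 <- data.frame(
  loc1 = c(NA, "Cortical", NA),
  loc2 = c("Subcortical", "Subcortical", "Subcortical"),
  loc3 = c("Basal ganglia", "Basal ganglia", NA),
  stringsAsFactors = FALSE
)
m <- apply(df1, 1, function(x) x[order(is.na(x))])
res <- df1
res[] <- t(m)
print(res)
```

Because the example has three rows and three columns, forgetting the transpose would not raise an error; it would quietly write the values back in the wrong orientation.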
Answer 1: (score: 1)
You can try the naLast function from my GitHub-only "SOfun" package.
Usage would be:
library(SOfun)
naLast(df1)
# loc1 loc2 loc3
# 1 "Subcortical" "Basal ganglia" NA
# 2 "Cortical" "Subcortical" "Basal ganglia"
# 3 "Subcortical" NA NA
... or, applying the same concept by column instead ...
naLast(df1, by = "col")
# loc1 loc2 loc3
# 1 "Cortical" "Subcortical" "Basal ganglia"
# 2 NA "Subcortical" "Basal ganglia"
# 3 NA "Subcortical" NA
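For readers who would rather not install a GitHub-only package, the column-wise result can be approximated in base R. This is a sketch of the same idea, not SOfun's implementation, and by_col is an illustrative name.

```r
df1 <- data.frame(
  loc1 = c(NA, "Cortical", NA),
  loc2 = c("Subcortical", "Subcortical", "Subcortical"),
  loc3 = c("Basal ganglia", "Basal ganglia", NA),
  stringsAsFactors = FALSE
)

# Apply the "non-NA first" reordering within each column rather than each row
by_col <- sapply(df1, function(x) x[order(is.na(x))])
print(by_col)
```

Here sapply() already returns one matrix column per input column, so no transposition is needed, unlike the row-wise apply() approach.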
Answer 2: (score: 0)
I think (untested) that
library(plyr)
out <- alply(dta, 1, function(x) x[!is.na(x)])
using the 'plyr' package would work.
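It looks like you are trying to do something to each row of the data frame (call it dta), so it is best to treat it as an array and operate along its first margin (hence the 1 as the second argument). You then want to extract, in order, each element that is not NA. Is that right?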
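The idea in this answer can be sketched in base R without plyr. One caveat the sketch makes explicit: simply dropping the NA values leaves rows of different lengths, so each row has to be padded back to the original width before it fits into a data frame again. The names kept, padded and out are illustrative, and dta is assumed to hold the example data from the question.

```r
dta <- data.frame(
  loc1 = c(NA, "Cortical", NA),
  loc2 = c("Subcortical", "Subcortical", "Subcortical"),
  loc3 = c("Basal ganglia", "Basal ganglia", NA),
  stringsAsFactors = FALSE
)
width <- ncol(dta)

# Step 1: for each row, keep the non-NA values in their original order
kept <- lapply(seq_len(nrow(dta)), function(i) {
  row <- unlist(dta[i, ], use.names = FALSE)
  row[!is.na(row)]
})

# Step 2: pad each row with NA on the right, back to the original width.
# vapply() returns one column per row, hence the transpose.
padded <- t(vapply(
  kept,
  function(v) c(v, rep(NA_character_, width - length(v))),
  character(width)
))

# Step 3: write the padded matrix back, keeping the column names
out <- dta
out[] <- padded
print(out)
```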