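I want to collapse the following data frame so that each row contains the column name of a data point and the data point itself.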
non.MML X2.MML X3.MML X4.MML X5.MML X6.7.MML
-13.994 NA NA NA NA NA
NA -13.992 NA NA NA NA
NA NA -13.984 NA NA NA
NA NA NA -13.983 NA NA
NA NA NA NA -13.962 NA
NA NA NA NA NA -13.907
NA NA -1.2 NA NA NA
NA NA NA -14.2 NA NA
NA NA NA NA -11.01 NA
NA NA NA NA NA -17.23
This is what I want:
name score
non.MML -13.994
X2.MML -13.992
X3.MML -13.984
X4.MML -13.983
X5.MML -13.962
X6.7.MML -13.907
X3.MML -1.2
X4.MML -14.2
X5.MML -11.01
X6.7.MML -17.23
I tried the following, which gets me close to what I want:
mydata <- data.frame(x=unlist(mydata))
But I get:
x
non.MML1 -13.994
X2.MML1 -13.992
X3.MML1 -13.984
X4.MML1 -13.983
X5.MML1 -13.962
X6.7.MML1 -13.907
X3.MML2 -1.2
X4.MML2 -14.2
X5.MML2 -11.01
X6.7.MML2 -17.23
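Notice that the name at the start of each row has a number appended to it, because the names would otherwise be duplicated. What is the best way to get the output I want?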
Answer 0 (score: 1)
Use melt from reshape2:
reshape2::melt(df, na.rm = TRUE, variable.name = "name", value.name = "score")
# name score
#1 non.MML -13.994
#12 X2.MML -13.992
#23 X3.MML -13.984
#27 X3.MML -1.200
#34 X4.MML -13.983
#38 X4.MML -14.200
#45 X5.MML -13.962
#49 X5.MML -11.010
#56 X6.7.MML -13.907
#60 X6.7.MML -17.230
Or use the base R stack function:
setNames(na.omit(stack(df)), c("score", "name"))
# score name
#1 -13.994 non.MML
#12 -13.992 X2.MML
#23 -13.984 X3.MML
#27 -1.200 X3.MML
#34 -13.983 X4.MML
#38 -14.200 X4.MML
#45 -13.962 X5.MML
#49 -11.010 X5.MML
#56 -13.907 X6.7.MML
#60 -17.230 X6.7.MML
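
For anyone who wants to try the stack approach end to end, here is a minimal base R sketch. It rebuilds the example data under the assumption that the frame has 10 rows and 6 columns with exactly one non-NA value per row (which is what the row names in the output above suggest). The helper names cols, col_idx, values, m, long, pos and by_row are only illustrative and do not come from the question. The last step is optional: stack groups the result by column, so if the original row order matters (as in the desired output in the question), one way to recover it is to index the non-NA cells directly.

```r
# Rebuild the example data: 10 rows, 6 columns, one non-NA value per row.
cols <- c("non.MML", "X2.MML", "X3.MML", "X4.MML", "X5.MML", "X6.7.MML")
col_idx <- c(1, 2, 3, 4, 5, 6, 3, 4, 5, 6)  # column that holds each row's value
values <- c(-13.994, -13.992, -13.984, -13.983, -13.962, -13.907,
            -1.2, -14.2, -11.01, -17.23)

m <- matrix(NA_real_, nrow = length(values), ncol = length(cols),
            dimnames = list(NULL, cols))
m[cbind(seq_along(values), col_idx)] <- values
df <- as.data.frame(m)

# stack() returns the columns `values` and `ind`; drop the NA rows,
# then rename and reorder the columns. The result is grouped by column.
long <- na.omit(stack(df))
long <- data.frame(name = as.character(long$ind), score = long$values)

# Optional: keep the original row order instead of grouping by column.
pos <- which(!is.na(m), arr.ind = TRUE)          # (row, col) of each non-NA cell
pos <- pos[order(pos[, "row"]), , drop = FALSE]  # sort the cells by row
by_row <- data.frame(name = cols[pos[, "col"]], score = m[pos])
print(by_row)
```

Building a fresh data.frame from the stacked columns also resets the row names to 1 through 10, so the leftover indices such as 12 or 23 no longer appear.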