I'm having trouble parsing the data in a data frame. I'm using the igraph package to analyze modularity per day. I can get as far as this:
Day_1 <- c("18", "34", NA, "25",NA, NA,NA,"39","18","29")
Freq_1 <- c(0.6369427, 1.0615711, NA, 0.6369427,NA, NA,NA ,2.1231423, 0.7500849 ,2.7600849)
Day_2 <- c("22", NA,"17", "14", "5" , "14", NA , "6", NA, NA)
Freq_2 <- c(2.5693731,NA, 23.0729702, 6.4234327, 1.8499486, 6.4234327,NA, 0.8221994, NA, NA)
Day_3 <- c("16", NA , NA , "9", "53", "17", "9" ,"1", "25", NA)
Freq_3 <- c(0.1573317, NA, NA, 5.6324733,0.9125236,13.2158590, 5.6324733,18.7853996 , 0.3461296, NA)
Day_4 <- c("74", "39", "14","35", "75", "59", "27", "47", "54", NA)
Freq_4 <- c(0.1461988, 0.2192982, 0.2923977, 0.3654971, 0.4385965, 0.5116959, 0.8040936, 0.8771930, 0.9502924, NA)
test <- data.frame(Day_1,Freq_1,Day_2,Freq_2,Day_3,Freq_3,Day_4,Freq_4)
test
> test
   Day_1    Freq_1 Day_2     Freq_2 Day_3     Freq_3 Day_4    Freq_4
1     18 0.6369427    22  2.5693731    16  0.1573317    74 0.1461988
2     34 1.0615711  <NA>         NA  <NA>         NA    39 0.2192982
3   <NA>        NA    17 23.0729702  <NA>         NA    14 0.2923977
4     25 0.6369427    14  6.4234327     9  5.6324733    35 0.3654971
5   <NA>        NA     5  1.8499486    53  0.9125236    75 0.4385965
6   <NA>        NA    14  6.4234327    17 13.2158590    59 0.5116959
7   <NA>        NA  <NA>         NA     9  5.6324733    27 0.8040936
8     39 2.1231423     6  0.8221994     1 18.7853996    47 0.8771930
9     18 0.7500849  <NA>         NA    25  0.3461296    54 0.9502924
10    29 2.7600849  <NA>         NA  <NA>         NA  <NA>        NA
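I need to generate links between these values so that I can visualize them with a Sankey diagram. For example:
Format: source = Day_X, target = Day_X+1, value = Freq_X+1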
source <- c("Cluster 18A","Cluster 22B","Cluster 16C","Cluster 34A","Cluster 17B", "Cluster 25A","Cluster 14B","Cluster 9C","Cluster 5B", "Cluster 53C","Cluster 14B","Cluster 17C","Cluster 9C","Cluster 39A" ,"Cluster 6B","Cluster 1C", "Cluster 18A", "Cluster 25C")
target <- c("Cluster 22B", "Cluster 16C","Cluster 74D","Cluster 39D","Cluster 14D","Cluster 14B","Cluster 9C","Cluster 35D", "Cluster 53C","Cluster 75D","Cluster 17C","Cluster 59D","Cluster 27D","Cluster 6B", "Cluster 1C","Cluster 47D", "Cluster 25C","Cluster 54D")
value <- c(2.5693731,0.1573317,0.1461988,0.2192982,0.2923977,6.4234327,5.6324733, 0.3654971,0.9125236,0.4385965,13.2158590,0.5116959,0.8040936,0.8221994,18.7853996, 0.8771930,0.3461296 , 0.9502924)
> test2 <- data.frame(source,target,value)
> test2
        source      target      value
1  Cluster 18A Cluster 22B  2.5693731
2  Cluster 22B Cluster 16C  0.1573317
3  Cluster 16C Cluster 74D  0.1461988
4  Cluster 34A Cluster 39D  0.2192982
5  Cluster 17B Cluster 14D  0.2923977
6  Cluster 25A Cluster 14B  6.4234327
7  Cluster 14B  Cluster 9C  5.6324733
8   Cluster 9C Cluster 35D  0.3654971
9   Cluster 5B Cluster 53C  0.9125236
10 Cluster 53C Cluster 75D  0.4385965
11 Cluster 14B Cluster 17C 13.2158590
12 Cluster 17C Cluster 59D  0.5116959
13  Cluster 9C Cluster 27D  0.8040936
14 Cluster 39A  Cluster 6B  0.8221994
15  Cluster 6B  Cluster 1C 18.7853996
16  Cluster 1C Cluster 47D  0.8771930
17 Cluster 18A Cluster 25C  0.3461296
18 Cluster 25C Cluster 54D  0.9502924
I wrote this function, but it has some errors that I can't resolve.
pased_data <- do.call(rbind, lapply(1:nrow(test), function(row){
  node <- test[row,]
  nomes <- names(node)
  node <- node[!is.na(node)]
  col_pares <- seq(4, length(node), 2)
  col_impares <- seq(1, length(node), 2)
  values <- as.numeric(node[col_pares])
  pairs <- paste0(as.character(node[col_impares]),
                  as.character(nomes[col_impares]))
  colsource <- pairs[1:(length(pairs)-1)]
  coltarget <- pairs[2:(length(pairs))]
  data <- data.frame(source = colsource,
                     target = coltarget,
                     value = values)
  })
)
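Errors:
The function only works for a 4-day analysis;
When the NA for day 2 is removed, Day_3 becomes Day_2;
If a row has data for only one day, I don't know how to drop that row (row 10 in test).
I need to generate a data frame equal to test2, but for any number of days (depending on the analysis) and without these errors. :)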
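For reference, here is a rough base R sketch of the transformation described above. It is not a definitive implementation. It assumes the columns always come in Day_k / Freq_k pairs, that the columns are already in day order, and that there are at most 26 days, so the letter suffix can be taken from LETTERS. The names make_links and prefix are made up for illustration. The idea is to find the non-NA day positions before building labels, so that each label keeps its real day letter, and to skip rows with fewer than two days.

```r
# Example data from the question.
Day_1 <- c("18", "34", NA, "25", NA, NA, NA, "39", "18", "29")
Freq_1 <- c(0.6369427, 1.0615711, NA, 0.6369427, NA, NA, NA, 2.1231423, 0.7500849, 2.7600849)
Day_2 <- c("22", NA, "17", "14", "5", "14", NA, "6", NA, NA)
Freq_2 <- c(2.5693731, NA, 23.0729702, 6.4234327, 1.8499486, 6.4234327, NA, 0.8221994, NA, NA)
Day_3 <- c("16", NA, NA, "9", "53", "17", "9", "1", "25", NA)
Freq_3 <- c(0.1573317, NA, NA, 5.6324733, 0.9125236, 13.2158590, 5.6324733, 18.7853996, 0.3461296, NA)
Day_4 <- c("74", "39", "14", "35", "75", "59", "27", "47", "54", NA)
Freq_4 <- c(0.1461988, 0.2192982, 0.2923977, 0.3654971, 0.4385965, 0.5116959, 0.8040936, 0.8771930, 0.9502924, NA)
test <- data.frame(Day_1, Freq_1, Day_2, Freq_2, Day_3, Freq_3, Day_4, Freq_4)

# Hypothetical helper: builds source/target/value links for any number of days.
make_links <- function(df, prefix = "Cluster ") {
  day_cols  <- grep("^Day_", names(df), value = TRUE)  # Day columns, in column order
  freq_cols <- sub("^Day_", "Freq_", day_cols)         # assumed matching Freq columns
  suffix    <- LETTERS[seq_along(day_cols)]            # assumption: at most 26 days

  rows <- lapply(seq_len(nrow(df)), function(i) {
    # as.character() also covers the case where the Day columns are factors
    days  <- vapply(day_cols, function(cn) as.character(df[[cn]][i]),
                    character(1), USE.NAMES = FALSE)
    freqs <- vapply(freq_cols, function(cn) as.numeric(df[[cn]][i]),
                    numeric(1), USE.NAMES = FALSE)
    keep <- which(!is.na(days))          # original day positions that have data
    if (length(keep) < 2) return(NULL)   # fewer than two days: no link to build
    labels <- paste0(prefix, days[keep], suffix[keep])
    data.frame(source = labels[-length(labels)],
               target = labels[-1],
               value  = freqs[keep[-1]],  # Freq of the target day
               stringsAsFactors = FALSE)
  })
  do.call(rbind, Filter(Negate(is.null), rows))
}

links <- make_links(test)
links
```

Because the letter is looked up by the original column position (suffix[keep]), dropping an NA no longer shifts Day_3 into Day_2, and a row such as row 10 with a single day simply contributes nothing.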