riverplot package in R: error about the edges column names

Time: 2016-12-04 23:10:12

Tags: r sankey-diagram riverplot

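I am trying to make a Sankey diagram in R with the riverplot package, but I get an error message about the column names in the edges data frame.

I installed the readr and riverplot packages and then ran this: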

> my_data <- read_csv("~/RProjects/my_data.csv")
>
> edges = rep(my_data, col.names = c("N1","N2","Value"))
>
> nodes = data.frame(ID = unique(c(edges$N1, edges$N2)))
>
> river <- makeRiver(nodes, edges)
>
> return(plot(river))

But at the second-to-last command, which creates the riverplot object "river", I get this error:

Error in checkedges(edges, nodes$ID)
  edges must have the columns N1, N2 and Value

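The original CSV already contains these column headers, so I am not sure what I am doing wrong. I am new to R, so please bear with me if I have missed something obvious!

dput on the data read from my CSV file gives: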

structure(list(N1 = c("Cambridge", "Cambridge", "Cambridge", 
"Cambridge", "Cambridge", "South Cambs", "South Cambs", "South Cambs", 
"South Cambs", "South Cambs", "Rest of East", "Rest of East", 
"Rest of East", "Rest of East", "Rest of East", "Rest of UK", 
"Rest of UK", "Rest of UK", "Rest of UK", "Rest of UK", "Abroad", 
"Abroad", "Abroad", "Abroad", "Abroad"), N2 = c("Cambridge", 
"South Cambs", "Rest of East", "Rest of UK", "Abroad", "Cambridge", 
"South Cambs", "Rest of East", "Rest of UK", "Abroad", "Cambridge", 
"South Cambs", "Rest of East", "Rest of UK", "Abroad", "Cambridge", 
"South Cambs", "Rest of East", "Rest of UK", "Abroad", "Cambridge", 
"South Cambs", "Rest of East", "Rest of UK", "Abroad"), Value = c(106068L, 
1616L, 2779L, 13500L, 5670L, 2593L, 138263L, 2975L, 4742L, 1641L, 
2555L, 3433L, 0L, 0L, 0L, 6981L, 3802L, 0L, 0L, 0L, 5670L, 1641L, 
0L, 0L, 0L)), class = c("tbl_df", "tbl", "data.frame"), row.names = c(NA, 
-25L), .Names = c("N1", "N2", "Value"), spec = structure(list(
    cols = structure(list(N1 = structure(list(), class = c("collector_character", 
    "collector")), N2 = structure(list(), class = c("collector_character", 
    "collector")), Value = structure(list(), class = c("collector_integer", 
    "collector"))), .Names = c("N1", "N2", "Value")), default = structure(list(), class = c("collector_guess", 
    "collector"))), .Names = c("cols", "default"), class = "col_spec"))

str(edges) gives:

Classes ‘tbl_df’, ‘tbl’ and 'data.frame':   25 obs. of  3 variables:
 $ N1   : chr  "Cambridge" "Cambridge" "Cambridge" "Cambridge" ...
 $ N2   : chr  "Cambridge" "South Cambs" "Rest of East" "Rest of UK" ...
 $ Value: int  106068 1616 2779 13500 5670 2593 138263 2975 4742 1641 ...
 - attr(*, "spec")=List of 2
  ..$ cols   :List of 3
  .. ..$ N1   : list()
  .. .. ..- attr(*, "class")= chr  "collector_character" "collector"
  .. ..$ N2   : list()
  .. .. ..- attr(*, "class")= chr  "collector_character" "collector"
  .. ..$ Value: list()
  .. .. ..- attr(*, "class")= chr  "collector_integer" "collector"
  ..$ default: list()
  .. ..- attr(*, "class")= chr  "collector_guess" "collector"
  ..- attr(*, "class")= chr "col_spec"

1 answer:

Answer 0 (score: 0)

I think the problem is that you left out the required ID column, which confuses the command.

edges = rep(my_data, col.names = c("N1","N2","Value"))
edges    <- data.frame(edges)          # make sure edges is a plain data.frame
edges$ID <- seq_len(nrow(edges))       # one ID per edge (1 to 25 for this data)

nodes = data.frame(ID = unique(c(edges$N1, edges$N2)))

river <- makeRiver(nodes, edges)

The code above gets rid of the error message. Note that it then raises an unrelated warning about duplicated edge information.
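It may also be the data.frame(edges) line, rather than the ID column alone, that makes the difference. The base-R sketch below shows that rep() applied to a data frame keeps the names but drops the class, so colnames() on the result is NULL. The small df is a made-up stand-in for my_data, and the sketch assumes edges really is the output of rep() (the str() output in the question looks like the original tibble instead), so treat it as a diagnostic idea rather than a confirmed cause.

```r
# Hypothetical stand-in for my_data (not the real CSV contents)
df <- data.frame(N1 = c("A", "B"), N2 = c("B", "A"), Value = c(1L, 2L))

# rep() on a data frame returns a plain named list: names kept, class dropped
edges <- rep(df)

is.data.frame(edges)   # FALSE
names(edges)           # "N1" "N2" "Value"
colnames(edges)        # NULL, so a check based on colnames() would fail

# Converting back restores a real data frame with column names
edges_df <- data.frame(edges)
is.data.frame(edges_df)  # TRUE
colnames(edges_df)       # "N1" "N2" "Value"
```

If this is what is happening, using my_data directly (converted with data.frame() or as.data.frame()) instead of going through rep() would be the simpler route.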

Warning message:
In checkedges(edges, nodes$ID) :
  duplicated edge information, removing 10 edges
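
The figure of 10 removed edges is consistent with one possible reading of the warning: that an edge from A to B and an edge from B to A are counted as the same edge. That is an assumption about how checkedges works, not something confirmed from the riverplot source. A base-R sketch that rebuilds the 25 place pairs from the question and counts direction-free duplicates:

```r
places <- c("Cambridge", "South Cambs", "Rest of East", "Rest of UK", "Abroad")

# All 25 ordered (N1, N2) pairs, matching the combinations in the question's data
edges <- expand.grid(N2 = places, N1 = places, stringsAsFactors = FALSE)

# Direction-free key: "A | B" is the same key for A -> B and B -> A
key <- paste(pmin(edges$N1, edges$N2), pmax(edges$N1, edges$N2), sep = " | ")

# Pairs that repeat an earlier pair once direction is ignored
n_removed <- sum(duplicated(key))
n_removed  # 10
```

Under that assumption, 15 of the 25 rows survive: the 5 self-pairs plus 10 unordered pairs of distinct places.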