I have two data frames that look like this:
y1 <- c(1, 0, 0)
y2 <- c(0, 1, 0)
y3 <- c(0, 0, 1)
df1 <- data.frame(y1, y2, y3, row.names = c("x1", "x2", "x3"))
y1 <- c(1, 0, 0)
y2 <- c(1, 0, 0)
y3 <- c(1, 0, 0)
df2 <- data.frame(y1, y2, y3, row.names = c("z1", "z2", "z3"))
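I would like to plot the relationships in these data frames so that the x, y and z values appear in columns, with lines connecting them. Here is a rough example of what I am trying to achieve:
I considered using a ggplot2 scatter plot with a categorical variable on the x axis to produce the columns, but I couldn't figure out how to draw connecting lines between the points. I also looked at network plots with ggnet2, but couldn't find any example where the nodes are fixed in columns.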
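Edit:
My real use case has roughly 20 x points, 120 y points and 200 z points, so ideally the solution would scale fairly easily.
I tried the following solution using the sankeyNetwork diagram from the networkD3 package:
library(networkD3)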
Nodes <- data.frame(name = c("x1", "x2", "x3", "y1", "y2", "y3",
                             "z1", "z2", "z3"),
                    group = c("1", "1", "1", "2", "2", "2", "3", "3", "3"))
# source and target are zero-based indices into Nodes.
Links <- data.frame(source = c(0, 1, 2, 3, 4, 5),
                    target = c(3, 4, 5, 6, 6, 6),
                    value = rep(1, 6))
sankeyNetwork(Links = Links, Nodes = Nodes, Source = "source",
              Target = "target", Value = "value", NodeGroup = "group",
              NodeID = "name", sinksRight = FALSE)
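The result is roughly right, but probably not ideal. Also, there doesn't seem to be a clear way to force z2 and z3 to appear on the right side of the page alongside z1 without going into the underlying JavaScript, which I don't know how to do (see "d3 sankey charts - manually position node along x axis").
Is there a better solution, or a way to improve this one?
Thanks!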
Answer 0 (score: 2)
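Here is one possible solution that uses geom_segment to draw the connecting edges. I don't know how well it adapts to datasets larger or more complex than your example. I suspect igraph or ggraph would handle this more elegantly and scale better.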
# Start with two data.frames: one for node positions,
# and one for edges you want to draw between nodes.
pos_dat = data.frame(node_id=paste(rep(c("x", "y", "z"), each=3),
                                   rep(c("1", "2", "3"), times=3),
                                   sep=""),
                     type=rep(c("x", "y", "z"), each=3),
                     xpos=rep(c(1, 2, 3), each=3),
                     ypos=rep(c(1, 2, 3), times=3))
# node_id type xpos ypos
# 1 x1 x 1 1
# 2 x2 x 1 2
# 3 x3 x 1 3
# 4 y1 y 2 1
# 5 y2 y 2 2
# 6 y3 y 2 3
# 7 z1 z 3 1
# 8 z2 z 3 2
# 9 z3 z 3 3
edge_dat = data.frame(start=c("x1", "x2", "x3", "y1", "y2", "y3"),
                      end=c("y1", "y2", "y3", "z1", "z1", "z1"))
# start end
# 1 x1 y1
# 2 x2 y2
# 3 x3 y3
# 4 y1 z1
# 5 y2 z1
# 6 y3 z1
# Use two successive merges to join node x,y positions
# for each edge you want to draw.
tmp_dat = merge(edge_dat, pos_dat, by.x="start", by.y="node_id")
seg_dat = merge(tmp_dat, pos_dat, by.x="end", by.y="node_id")
# Remove unneeded columns and change column names for convenience.
seg_dat$type.x = NULL
seg_dat$type.y = NULL
names(seg_dat) = c("end", "start", "x", "y", "xend", "yend")
seg_dat
# end start x y xend yend
# 1 y1 x1 1 1 2 1
# 2 y2 x2 1 2 2 2
# 3 y3 x3 1 3 2 3
# 4 z1 y1 2 1 3 1
# 5 z1 y2 2 2 3 1
# 6 z1 y3 2 3 3 1
# Finally, draw the plot.
library(ggplot2)
p = ggplot() +
  geom_segment(data=seg_dat, aes(x=x, y=y, xend=xend, yend=yend),
               colour="grey50") +
  geom_point(data=pos_dat, aes(x=xpos, y=ypos, colour=type), size=10) +
  geom_text(data=pos_dat, aes(x=xpos, y=ypos, label=node_id)) +
  scale_colour_manual(values=c(x="steelblue", y="darkorange", z="olivedrab3"))
ggsave("plot.png", plot=p, height=4, width=6, dpi=150)
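The node positions and edge list above are typed by hand, which would get tedious with a few hundred nodes. As a rough sketch, assuming df1 and df2 are 0/1 tables laid out as in the question (rows x and columns y in df1, rows z and columns y in df2), both could probably be derived from the data frames directly. The helper names adjacency_to_edges and make_column are made up for illustration and are not part of any package.

```r
# Data frames as defined in the question.
df1 = data.frame(y1 = c(1, 0, 0), y2 = c(0, 1, 0), y3 = c(0, 0, 1),
                 row.names = c("x1", "x2", "x3"))
df2 = data.frame(y1 = c(1, 0, 0), y2 = c(1, 0, 0), y3 = c(1, 0, 0),
                 row.names = c("z1", "z2", "z3"))

# Hypothetical helper: list every (row name, column name) pair whose cell is 1.
adjacency_to_edges = function(df) {
  idx = which(as.matrix(df) == 1, arr.ind = TRUE)
  data.frame(row = rownames(df)[idx[, "row"]],
             col = colnames(df)[idx[, "col"]],
             stringsAsFactors = FALSE)
}

# Hypothetical helper: place one group of nodes in a single plot column.
# For groups of very different sizes, seq(0, 1, length.out = length(ids))
# could replace seq_along(ids) so each column spans the same height.
make_column = function(ids, type, xpos) {
  data.frame(node_id = ids, type = type, xpos = xpos,
             ypos = seq_along(ids), stringsAsFactors = FALSE)
}

e1 = adjacency_to_edges(df1)  # rows are x, columns are y
e2 = adjacency_to_edges(df2)  # rows are z, columns are y

# Edges run x -> y (from df1) and y -> z (from df2).
edge_dat = rbind(data.frame(start = e1$row, end = e1$col,
                            stringsAsFactors = FALSE),
                 data.frame(start = e2$col, end = e2$row,
                            stringsAsFactors = FALSE))

pos_dat = rbind(make_column(rownames(df1), "x", 1),
                make_column(colnames(df1), "y", 2),
                make_column(rownames(df2), "z", 3))
```

With the example data this should give the same edge_dat and pos_dat as the hand-written versions, so the merge and plotting steps can be reused unchanged. I have not tried it at the 20/120/200 scale mentioned in the question; at that size the point size and labels would likely need adjusting.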