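I am trying to generate a dynamic visual animation of timestamped transactions, where each transaction represents one person's contribution to an artifact/file. To do this I am using the R packages networkDynamic, network and ndtv.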
The transactions (unlike the examples in the networkDynamic package vignette) have "real" timestamps. I want to wrap the rendering process in ...
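I think I have managed to create the first slice at the start of the week of the first event by using lubridate's floor_date. I have not yet looked at the last issue (labels), because unfortunately I am having trouble determining the correct slice parameters for my dataset.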
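Please find a reproducible example for RStudio below. The example includes three lists named slice.par: one works, the other two do not. Simply hard-coding a parameter configuration that works (only) with this specific example is not my goal, firstly because my real dataset is much larger (so "playing around with parameters" would take a lot of time), and secondly because I would like a function that works for many different datasets.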
if (!require("pacman")) install.packages("pacman")
library("pacman")
pacman::p_load(network, networkDynamic, ndtv, lubridate)
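# Convert a Unix timestamp (seconds since the epoch) to POSIXct in UTC.
UtilNumericAsDate <- function(nuUnixTimestamp) {
  return(as.POSIXct(nuUnixTimestamp, origin = "1970-01-01 00:00.00 UTC", tz = "UTC"))
}
# Convert a timestamp to seconds since the epoch.
# Character input is parsed as UTC so that it round-trips with UtilNumericAsDate.
UtilDateAsNumeric <- function(oTimestamp) {
  return(as.numeric(as.POSIXct(oTimestamp, tz = "UTC")))
}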
stTransac <- "
'contributorId', 'artifactId', 'weight', 'instantId'
'A', 'a1', '1', '2003-06-01 23:09:40'
'A', 'a2', '1', '2004-02-27 11:48:41'
'A', 'a1', '2', '2006-06-25 20:36:49'
'A', 'a3', '1', '2007-01-28 00:35:31'
'A', 'a3', '2', '2007-04-25 16:03:57'
'A', 'a3', '3', '2007-07-19 19:43:49'
'B', 'a1', '1', '2008-02-06 12:37:56'
'C', 'a3', '1', '2008-04-07 02:27:36'
'C', 'a2', '1', '2008-06-01 02:15:35'
'C', 'a2', '2', '2008-10-05 02:32:45'
'B', 'a1', '2', '2009-06-22 01:57:45'
'C', 'a4', '1', '2009-09-15 02:56:33'
'C', 'a5', '1', '2010-06-30 19:42:25'
'C', 'a6', '1', '2011-06-12 23:58:17'
'B', 'a3', '1', '2013-08-30 19:34:28'
'C', 'a1', '1', '2014-10-23 20:49:54'
'C', 'a1', '2', '2014-10-24 16:46:07'
'A', 'a2', '2', '2015-09-26 16:58:17'
'A', 'a7', '1', '2015-10-04 17:40:12'
'A', 'a8', '1', '2015-12-02 10:55:47'
"
dfTransac <- read.csv(text = stTransac, sep = "," , quote = '\'' , strip.white = TRUE, stringsAsFactors = FALSE)
dfEdges <- unique(dfTransac[,1:2])
veUniqueContributors <- unique(dfEdges[[1]])
veUniqueArtifacts <- unique(dfEdges[[2]])
nuNrUniqueContributors <- length(veUniqueContributors)
nuNrUniqueArtifacts <- length(veUniqueArtifacts)
net <- network.initialize(0, directed = TRUE, bipartite = length(veUniqueContributors))
add.vertices.networkDynamic(net, nuNrUniqueContributors, vertex.pid = veUniqueContributors)
add.vertices.networkDynamic(net, nuNrUniqueArtifacts, vertex.pid = veUniqueArtifacts)
net %v% "vertex.names" <- c(veUniqueContributors, veUniqueArtifacts)
net %v% "vertex.type" <- c(rep("p", length(veUniqueContributors)), rep("a", length(veUniqueArtifacts)))
net %v% "vertex.col" <- c(rep("blue", length(veUniqueContributors)), rep("gray", length(veUniqueArtifacts)))
net %v% "vertex.sides" <- c(rep(8, length(veUniqueContributors)), rep(4, length(veUniqueArtifacts)))
net %v% "vertex.rot" <- c(rep(0, length(veUniqueContributors)), rep(45, length(veUniqueArtifacts)))
net %v% "vertex.lwd" <- c(rep(1, length(veUniqueContributors)), rep(0, length(veUniqueArtifacts)))
net %v% "vertex.cex" <- c(rep(2, length(veUniqueContributors)), rep(1, length(veUniqueArtifacts)))
set.network.attribute(net,'vertex.pid','vertex.names')
set.network.attribute(net,'edge.pid','edge.names')
add.edges.networkDynamic(net,
                         tail = get.vertex.id(net, dfEdges[[1]]),
                         head = get.vertex.id(net, dfEdges[[2]]),
                         edge.pid = paste0(dfEdges[[1]], "->", dfEdges[[2]]))
activate.edges(net,
               e = get.edge.id(net, paste0(dfTransac[["contributorId"]], "->", dfTransac[["artifactId"]])),
               at = UtilDateAsNumeric(dfTransac$instantId))
activate.edge.attribute(net,
                        prefix = "weight",
                        value = dfTransac$weight,
                        e = get.edge.id(net, paste0(dfTransac[["contributorId"]], "->", dfTransac[["artifactId"]])),
                        at = UtilDateAsNumeric(dfTransac$instantId))
reconcile.vertex.activity(net = net, mode = "encompass.edges", edge.active.default = FALSE)
nuStart <- range(get.change.times(net, ignore.inf = FALSE))[1]
nuEnd <- range(get.change.times(net, ignore.inf = FALSE))[2]
nuWeekStart <- UtilDateAsNumeric(floor_date(UtilNumericAsDate(nuStart), "week"))
nuWeekEnd <- UtilDateAsNumeric(ceiling_date(UtilNumericAsDate(nuEnd), "week"))
# This doesn't work: "Monthly" slices, 5 year aggregation
# Error: Attribute 'vertex.sides' had illegal missing values for vertex.sides or was not present in plot.network.default.
slice.par <- list(start = nuWeekStart,
                  end = nuWeekEnd,
                  interval = 1*60*60*24*7*4.5,
                  aggregate.dur = 1*60*60*24*7*52*5,
                  rule = "any")
# This doesn't work either: "Bimonthly" slices, "Bimonthly" aggregation
# Error: Attribute 'weight' had illegal missing values for edge.lwd or was not present in plot.network.default.
slice.par <- list(start = nuWeekStart,
                  end = nuWeekEnd,
                  interval = 1*60*60*24*7*4.5*2,
                  aggregate.dur = 1*60*60*24*7*4.5*2,
                  rule = "any")
# This works: "Bimonthly" slices, 5 year aggregation
slice.par <- list(start = nuWeekStart,
                  end = nuWeekEnd,
                  interval = 1*60*60*24*7*4.5*2,
                  aggregate.dur = 1*60*60*24*7*52*5,
                  rule = "any")
compute.animation(net, animation.mode = "kamadakawai", slice.par = slice.par, default.dist = 10)
render.d3movie(net,
               slice.par = slice.par,
               displaylabels = TRUE,
               output.mode = "htmlWidget",
               usearrows = TRUE,
               vertex.col = 'vertex.col',
               vertex.sides = 'vertex.sides',
               vertex.cex = 'vertex.cex',
               vertex.rot = 'vertex.rot',
               edge.lwd = 'weight',
               render.par = list(tween.frames = 10, show.time = TRUE))
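How can I derive suitable slice parameters from the dataset, so that the rendering process does not choke on individual slices with missing attributes or edges, without simply increasing the aggregation duration?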
Answer 0 (score: 1)
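As you have already established, there is a bug in the render.d3movie function. It tries to look up values for "empty" slices (time ranges that contain no active vertices); see the bug report at Github. (I actually could not reproduce the error with the code above, but it is definitely a bug. Thanks for reporting it.)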
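"How can I derive suitable slice parameters from the dataset, so that the rendering process does not choke on individual slices with missing attributes or edges, without simply increasing the aggregation duration?"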
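Until the bug is fixed (hopefully soon), you can:
a) use render.animation instead;
b) choose the slice parameters so that there are no slices in which the network has no active vertices. You can use the timeline function to see where the slices will fall. For example, to show the activity spells of the nodes (blue) and edges (purple) together with the slice bins (vertical gray bars):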
timeline(net,slice.par=slice.par,main='timeline plot of activity spells')
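c) probably the best solution: adjust the vertex activity so that there is always at least one active vertex in every time slice. In this case, some vertices were assigned very short durations by reconcile.vertex.activity because they are only attached to edges of very short duration. Using a different rule might avoid this, or you could set vertices to stay active once they have appeared (if that makes sense for your data).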
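For option b), one way to spot problem slices before rendering is to check the slice bins against the activity spells in plain R. The following is only a sketch: find_empty_slices is a hypothetical helper, not part of ndtv, and it assumes that each slice covers the half-open range from its start to start + aggregate.dur and that slice starts run from slice.par$start to slice.par$end in steps of slice.par$interval.

```r
# Hypothetical helper (not part of ndtv): flag slices that contain no activity.
# Assumption: slice i covers [start_i, start_i + aggregate.dur), with starts
# spaced by slice.par$interval between slice.par$start and slice.par$end.
find_empty_slices <- function(onsets, termini, slice.par) {
  starts <- seq(from = slice.par$start, to = slice.par$end,
                by = slice.par$interval)
  ends <- starts + slice.par$aggregate.dur
  # Spells with onset == terminus are instantaneous events (created with `at =`).
  is_point <- onsets == termini
  empty <- vapply(seq_along(starts), function(i) {
    hit_point <- is_point & onsets >= starts[i] & onsets < ends[i]
    hit_span <- !is_point & onsets < ends[i] & termini > starts[i]
    !any(hit_point | hit_span)
  }, logical(1))
  data.frame(start = starts, end = ends, empty = empty)
}

# Toy data: three instantaneous events at t = 0, 10 and 35.
events <- c(0, 10, 35)
toy.par <- list(start = 0, end = 40, interval = 10,
                aggregate.dur = 10, rule = "any")
res <- find_empty_slices(events, events, toy.par)
print(res[res$empty, ])
```

For the network in the question, the onset and terminus columns returned by get.vertex.activity(net, as.spellList = TRUE) should be usable as input. Any slice flagged as empty is a candidate for adjusting start, end, interval or the vertex activity.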
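Some other notes:
You may also want to set the slice.par$rule value to earliest instead of any, so that when multiple possible values are encountered while collapsing the dynamic weight attribute on the edges, it knows which one to pick.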
Using the networkDynamic utility function to build the network and passing in stTransac would probably be more compact, and may be faster when loading large datasets.
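As a rough illustration of that last note, the transactions can be reshaped into an edge spell table (onset, terminus, tail, head, plus extra attribute columns) and handed to the networkDynamic constructor in one call. This is a sketch under assumptions: the toy data frame stands in for dfTransac, the names vertex.names, edge.spells and net2 are made up for the example, contributor and artifact ids are assumed not to collide, and the bipartite setup and vertex attributes from the question are left out.

```r
# Toy transactions in the same shape as dfTransac in the question.
toy <- data.frame(
  contributorId = c("A", "A", "B"),
  artifactId = c("a1", "a2", "a1"),
  weight = c(1, 1, 1),
  instantId = c("2003-06-01 23:09:40", "2004-02-27 11:48:41",
                "2008-02-06 12:37:56"),
  stringsAsFactors = FALSE
)

# Map names to integer vertex ids, contributors first, then artifacts.
vertex.names <- c(unique(toy$contributorId), unique(toy$artifactId))
times <- as.numeric(as.POSIXct(toy$instantId, tz = "UTC"))

# One row per transaction; onset == terminus marks an instantaneous event.
edge.spells <- data.frame(
  onset = times,
  terminus = times,
  tail = match(toy$contributorId, vertex.names),
  head = match(toy$artifactId, vertex.names),
  weight = toy$weight
)

# Build the dynamic network only if the package is available.
if (requireNamespace("networkDynamic", quietly = TRUE)) {
  net2 <- networkDynamic::networkDynamic(
    edge.spells = edge.spells,
    create.TEAs = TRUE,          # load extra columns as dynamic attributes
    edge.TEA.names = "weight"
  )
}
```

The vertex names and display attributes would still need to be attached afterwards, as in the question's code.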