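I need to represent the before/after change in a set of social network metrics. The idea is that each point is made up of an x and a y coordinate, where the x coordinate is the mean for one social role and the lines represent the standard deviation.
For example: at the "before" moment we have 4 social actors of the "public institution" type, and at the "after" moment we have 6 actors (some are the same and others are new, but that does not matter, because I am trying to describe the structure rather than the individual nodes). The mean and the deviation are computed from that sample, and what I want the plot to show is which roles "increase" or "decrease" on the different metrics.
Currently, my data looks like this (suggestions for restructuring it are welcome, but I think it can be handled in this form).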
time category code Clossenness
1 PI PI1 0,658
1 PI PI2 0,568
1 PI PI3 0,581
1 PI PI4 0,595
1 PI PI5 0,556
1 PrI PrI1 0,658
1 PrI PrI2 0,543
1 NGO's NGO1 0,568
1 NGO's NGO2 0,581
2 PI PI1 0,611
2 PI PI6 0,600
2 PI PI7 0,485
2 PI PI8 0,569
2 PI PI9 0,579
2 PI PI10 0,635
2 PI PI11 0,623
2 PI PI12 0,623
2 PI PI13 0,673
2 PrI PrI1 0,673
2 PrI PrI3 0,600
2 NGO's NGO1 0,750
2 NGO's NGO3 0,508
2 NGO's NGO4 0,524
structure(list(structure(c(1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L,
2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L), .Label = c("1",
"2"), class = "factor"), timecategory = structure(c(2L, 2L, 2L,
2L, 2L, 3L, 3L, 1L, 1L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 3L,
3L, 1L, 1L, 1L), .Label = c("NGO's", "PI", "PrI"), class = "factor"),
code = structure(c(5L, 10L, 11L, 12L, 13L, 18L, 19L, 1L,
2L, 5L, 14L, 15L, 16L, 17L, 6L, 7L, 8L, 9L, 18L, 20L, 1L,
3L, 4L), .Label = c("NGO1", "NGO2", "NGO3", "NGO4", "PI1",
"PI10", "PI11", "PI12", "PI13", "PI2", "PI3", "PI4", "PI5",
"PI6", "PI7", "PI8", "PI9", "PrI1", "PrI2", "PrI3"), class = "factor"),
Clossenness = structure(c(15L, 6L, 9L, 10L, 5L, 15L, 4L,
6L, 9L, 12L, 11L, 1L, 7L, 8L, 14L, 13L, 13L, 16L, 16L, 11L,
17L, 2L, 3L), .Label = c("0,485", "0,508", "0,524", "0,543",
"0,556", "0,568", "0,569", "0,579", "0,581", "0,595", "0,600",
"0,611", "0,623", "0,635", "0,658", "0,673", "0,750"), class = "factor")), .Names = c("",
"time category", "code", "Clossenness"), row.names = c(NA, -23L
), class = "data.frame")
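A boxplot shows the information I need descriptively, but it makes the before/after changes harder to compare, because you have to read the boxplots in pairs. That is why I find the other kind of plot I proposed more suitable. The difficulty is that I do not know a direct way to build that plot from the same information.
Expected result: https://ibb.co/WsrDN7D Actual result: https://ibb.co/M6QWXLv
Answer 0 (score: 1)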
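With the functions group_by() and summarise() you can compute the mean of each category at each time point, and with spread() you can bring the two values together on the same row: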
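library(dplyr)  # group_by(), summarise(), %>%
library(tidyr)  # spread()

set.seed(1)
# Simulated example: 4 actors per category at each time point
df <- data.frame(
  time = rep(c('before', 'after'), each = 8),
  category = rep(rep(c('PI', 'NGO'), each = 4), times = 2),
  clossenness = rnorm(16, .6, .1)
) %>%
  group_by(time, category) %>%
  # Mean closeness per time point and category
  summarise(mean_clos = mean(clossenness)) %>%
  # One row per category, with 'after' and 'before' as columns
  spread(key = time, value = mean_clos)
df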
category after before
<fct> <dbl> <dbl>
1 NGO 0.630 0.595
2 PI 0.573 0.659
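The same summary can be applied to the table from the question. The following is only a sketch under a few assumptions that are not in the original answer: the data frame name `sna` is hypothetical, time 1 is taken to mean "before" and time 2 "after", and `pivot_wider()` (tidyr 1.0.0 or later) is used in place of `spread()`. The decimal commas are handled when reading the text, via `dec = ","`.

```r
library(dplyr)
library(tidyr)

# Table copied from the question. quote = "" keeps the apostrophe in "NGO's"
# from being read as a quote character; dec = "," parses "0,658" as 0.658.
sna <- read.table(text = "
time category code Clossenness
1 PI PI1 0,658
1 PI PI2 0,568
1 PI PI3 0,581
1 PI PI4 0,595
1 PI PI5 0,556
1 PrI PrI1 0,658
1 PrI PrI2 0,543
1 NGO's NGO1 0,568
1 NGO's NGO2 0,581
2 PI PI1 0,611
2 PI PI6 0,600
2 PI PI7 0,485
2 PI PI8 0,569
2 PI PI9 0,579
2 PI PI10 0,635
2 PI PI11 0,623
2 PI PI12 0,623
2 PI PI13 0,673
2 PrI PrI1 0,673
2 PrI PrI3 0,600
2 NGO's NGO1 0,750
2 NGO's NGO3 0,508
2 NGO's NGO4 0,524
", header = TRUE, dec = ",", quote = "", stringsAsFactors = FALSE)

sna_means <- sna %>%
  # Assumption: time 1 is the "before" moment and time 2 the "after" moment
  mutate(time = ifelse(time == 1, "before", "after")) %>%
  group_by(time, category) %>%
  summarise(mean_clos = mean(Clossenness), .groups = "drop") %>%
  # One row per category, with one column per time point
  pivot_wider(names_from = time, values_from = mean_clos)

print(sna_means)
```

The resulting `before` and `after` columns can be passed to the plotting code in the same way as the simulated `df`.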
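Then you can plot each (before, after) point with geom_label() or geom_point() and compare it with the identity line to see whether the value increased or decreased: a point above the line has a higher mean after than before, and a point below it has a lower one.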
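library(ggplot2)

df %>%
  ggplot(aes(x = before, y = after)) +
  # geom_point() +  # alternative: plain points instead of labels
  geom_label(aes(label = category)) +
  # Identity line y = x: above it the mean increased, below it decreased
  geom_abline(intercept = 0, slope = 1) +
  xlim(c(.5, .7)) + ylim(c(.5, .7))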
In this example, NGO increases while PI decreases.
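The question also asks for lines showing the standard deviation. One possible extension, offered as a sketch and not as part of the original answer, is to summarise the standard deviation alongside the mean and draw it as horizontal and vertical segments through each point. The names `sim`, `stats`, `avg` and `sdev` are made up for this illustration, the data are simulated in the same way as in the example, and `pivot_wider()` with two value columns assumes tidyr 1.0.0 or later.

```r
library(dplyr)
library(tidyr)
library(ggplot2)

set.seed(1)
# Simulated data, same layout as in the example (not the question's real data)
sim <- data.frame(
  time = rep(c("before", "after"), each = 8),
  category = rep(rep(c("PI", "NGO"), each = 4), times = 2),
  clossenness = rnorm(16, .6, .1)
)

stats <- sim %>%
  group_by(time, category) %>%
  summarise(avg = mean(clossenness), sdev = sd(clossenness), .groups = "drop") %>%
  # Produces avg_before, avg_after, sdev_before, sdev_after
  pivot_wider(names_from = time, values_from = c(avg, sdev))

p <- ggplot(stats, aes(x = avg_before, y = avg_after)) +
  geom_abline(intercept = 0, slope = 1, linetype = "dashed") +
  # Horizontal segment: mean +/- 1 sd of the "before" values
  geom_segment(aes(x = avg_before - sdev_before, xend = avg_before + sdev_before,
                   y = avg_after, yend = avg_after)) +
  # Vertical segment: mean +/- 1 sd of the "after" values
  geom_segment(aes(x = avg_before, xend = avg_before,
                   y = avg_after - sdev_after, yend = avg_after + sdev_after)) +
  geom_point(size = 3) +
  geom_text(aes(label = category), vjust = -1)

print(p)
```

Segments are used here instead of dedicated error-bar geoms only to keep the sketch independent of the ggplot2 version. With few actors per category the bars can be wide, so a segment that crosses the identity line suggests the change is small relative to the spread.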