Circlize chord diagram with multiple levels of data

Time: 2016-08-28 06:59:12

Tags: chord-diagram circlize

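I've found myself a bit stuck. I want to display the flows of trafficked species between regions with a chord diagram in circlize, but I can't work out how to plot it when columns 1 and 2 represent the "connection", column 3 is the "factor" of interest, and column 4 holds the values. I've included a sample of the data below (yes, I'm aware Indonesia is listed as a region). As you can see, each species is not unique to a particular region. I would like to produce a plot similar to the one linked below, but with "species" in place of "countries" within each region. Is this possible?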

import_region    export_region  species                flow
North America    Europe         Acanthosaura armata     0.0104
Southeast Asia   Europe         Acanthosaura armata     0.0022
Indonesia        Europe         Acanthosaura armata     0.1971
Indonesia        Europe         Acrochordus granulatus  0.7846
Southeast Asia   Europe         Acrochordus granulatus  0.1101
Indonesia        Europe         Acrochordus javanicus   2.00E-04
Southeast Asia   Europe         Acrochordus javanicus   0.0015
Indonesia        North America  Acrochordus javanicus   0.0024
East Asia        Europe         Acrochordus javanicus   0.0028
Indonesia        Europe         Ahaetulla prasina       4.00E-04
Southeast Asia   Europe         Ahaetulla prasina       4.00E-04
Southeast Asia   East Asia      Amyda cartilaginea      0.0027
Indonesia        East Asia      Amyda cartilaginea      5.00E-04
Indonesia        Europe         Amyda cartilaginea      0.004
Indonesia        Southeast Asia Amyda cartilaginea      0.0334
Europe           North America  Amyda cartilaginea      4.00E-04
Indonesia        North America  Amyda cartilaginea      0.1291
Southeast Asia   Southeast Asia Amyda cartilaginea      0.0283
Indonesia        West Asia      Amyda cartilaginea      0.7614
South Asia       Europe         Amyda cartilaginea      2.8484
Australasia      Europe         Apodora papuana         0.0368
Indonesia        North America  Apodora papuana         0.324
Indonesia        Europe         Apodora papuana         0.0691
Europe           Europe         Apodora papuana         0.0106
Indonesia        East Asia      Apodora papuana         0.0129
Europe           North America  Apodora papuana         0.0034
East Asia        East Asia      Apodora papuana         2.00E-04
Indonesia        Southeast Asia Apodora papuana         0.0045
East Asia        North America  Apodora papuans         0.0042

For an example of the kind of plot I'm after, see the following link: chord diagram

1 Answer:

Answer 0 (score: 3)

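In the circlize package, the chordDiagram() function only accepts a "from" column, a "to" column and an optional "value" column. In your case, however, the original data frame can be transformed into a three-column data frame of that shape.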

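In your example you want to distinguish, for instance, Acanthosaura_armata in North America from Acanthosaura_armata in Europe. One solution is to merge the region name and the species name, e.g. Acanthosaura_armata|North_America, to form a unique identifier for each sector. Below I demonstrate how to visualize this dataset with the circlize package.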

Read in the data. Note that I replaced the spaces with underscores.

df = read.table(textConnection(
"import_region    export_region  species                flow
North_America    Europe         Acanthosaura_armata     0.0104
Southeast_Asia   Europe         Acanthosaura_armata     0.0022
Indonesia        Europe         Acanthosaura_armata     0.1971
Indonesia        Europe         Acrochordus_granulatus  0.7846
Southeast_Asia   Europe         Acrochordus_granulatus  0.1101
Indonesia        Europe         Acrochordus_javanicus   2.00E-04
Southeast_Asia   Europe         Acrochordus_javanicus   0.0015
Indonesia        North_America  Acrochordus_javanicus   0.0024
East_Asia        Europe         Acrochordus_javanicus   0.0028
Indonesia        Europe         Ahaetulla_prasina       4.00E-04
Southeast_Asia   Europe         Ahaetulla_prasina       4.00E-04
Southeast_Asia   East_Asia      Amyda_cartilaginea      0.0027
Indonesia        East_Asia      Amyda_cartilaginea      5.00E-04
Indonesia        Europe         Amyda_cartilaginea      0.004
Indonesia        Southeast_Asia Amyda_cartilaginea      0.0334
Europe           North_America  Amyda_cartilaginea      4.00E-04
Indonesia        North_America  Amyda_cartilaginea      0.1291
Southeast_Asia   Southeast_Asia Amyda_cartilaginea      0.0283
Indonesia        West_Asia      Amyda_cartilaginea      0.7614
South_Asia       Europe         Amyda_cartilaginea      2.8484
Australasia      Europe         Apodora_papuana         0.0368
Indonesia        North_America  Apodora_papuana         0.324
Indonesia        Europe         Apodora_papuana         0.0691
Europe           Europe         Apodora_papuana         0.0106
Indonesia        East_Asia      Apodora_papuana         0.0129
Europe           North_America  Apodora_papuana         0.0034
East_Asia        East_Asia      Apodora_papuana         2.00E-04
Indonesia        Southeast_Asia Apodora_papuana         0.0045
East_Asia        North_America  Apodora_papuans         0.0042"),
header = TRUE, stringsAsFactors = FALSE)

I also removed the rows with very small values (a flow of 0.01 or less).

df = df[df[[4]] > 0.01, ]

Assign colors to species and regions.

library(circlize)
library(RColorBrewer)
all_species = unique(df[[3]])
color_species = structure(brewer.pal(length(all_species), "Set1"), names = all_species)
all_regions = unique(c(df[[1]], df[[2]]))
color_regions = structure(brewer.pal(length(all_regions), "Set2"), names = all_regions)
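
A caveat on the color assignment above: brewer.pal() is limited by the size of each palette (Set1 has 9 colors, Set2 has 8) and warns when asked for fewer than 3. The filtered sample fits within those limits, but the full dataset may not. Here is a minimal sketch of one possible workaround, assuming interpolated colors are acceptable. named_palette is a hypothetical helper written for this illustration; it is not part of circlize or RColorBrewer.

```r
library(RColorBrewer)

# Hypothetical helper: returns one named color per key. If there are more
# keys than the palette offers, the palette is interpolated instead.
named_palette = function(keys, palette = "Set1") {
    max_n = brewer.pal.info[palette, "maxcolors"]  # e.g. 9 for Set1
    base = brewer.pal(max_n, palette)              # always request the full palette
    n = length(keys)
    cols = if (n <= max_n) base[seq_len(n)] else colorRampPalette(base)(n)
    structure(cols, names = keys)
}

# Twelve made-up keys, more than Set1 provides
color_demo = named_palette(paste0("species_", 1:12), "Set1")
length(color_demo)
```

Interpolated colors become harder to tell apart as the number of keys grows, so with many species it may be better to filter or lump rare ones first.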

Group by species

First I will demonstrate how to group the chord diagram by species.

As mentioned above, we use species|region as the unique identifier.

df2 = data.frame(from = paste(df[[3]], df[[1]], sep = "|"),
                 to = paste(df[[3]], df[[2]], sep = "|"),
                 value = df[[4]], stringsAsFactors = FALSE)
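
To see what the merge does, here is a small sketch on two rows taken from the data above. The names toy, toy_links and sectors are only for illustration. Both rows connect Indonesia and Europe, but because the species differ they end up as four separate sectors rather than two.

```r
# Two rows from the sample data: same pair of regions, different species
toy = data.frame(
    import_region = c("Indonesia", "Indonesia"),
    export_region = c("Europe", "Europe"),
    species = c("Acanthosaura_armata", "Acrochordus_granulatus"),
    flow = c(0.1971, 0.7846),
    stringsAsFactors = FALSE)

# Same transformation as for df2: prefix each region with the species
toy_links = data.frame(
    from = paste(toy$species, toy$import_region, sep = "|"),
    to = paste(toy$species, toy$export_region, sep = "|"),
    value = toy$flow,
    stringsAsFactors = FALSE)

# Every distinct identifier becomes one sector of the chord diagram
sectors = unique(c(toy_links$from, toy_links$to))
sectors
```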

Next we set the order of all sectors, sorting first by species and then by region.

combined = unique(data.frame(regions = c(df[[1]], df[[2]]), 
    species = c(df[[3]], df[[3]]), stringsAsFactors = FALSE))
combined = combined[order(combined$species, combined$regions), ]
order = paste(combined$species, combined$regions, sep = "|")

We want the color of the links to be the same as the color of the regions.

grid.col = structure(color_regions[combined$regions], names = order)

Since the chord diagram is grouped by species, the gaps between species should be larger than the gaps within each species.

gap = rep(1, length(order))
gap[which(!duplicated(combined$species, fromLast = TRUE))] = 5
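
The !duplicated(..., fromLast = TRUE) idiom marks the last sector of each group, which is where the wide gap goes. A tiny worked example with made-up group labels (species_demo and gap_demo are names used only here):

```r
# Sectors already sorted by group: two of A, one of B, three of C
species_demo = c("A", "A", "B", "C", "C", "C")

# TRUE for the last occurrence of each label
is_last = !duplicated(species_demo, fromLast = TRUE)

gap_demo = rep(1, length(species_demo))  # narrow gap inside a group
gap_demo[is_last] = 5                    # wide gap after each group
gap_demo
# 1 5 5 1 1 5
```

This only works when the sectors are sorted so that each group is contiguous, which the order() call above takes care of.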

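With all the settings ready, we can now make the chord diagram.

In the code below we set preAllocateTracks so that the circular lines representing the species can be added afterwards.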

circos.par(gap.degree = gap)
chordDiagram(df2, order = order, annotationTrack = c("grid", "axis"),
    grid.col = grid.col, directional = TRUE,
    preAllocateTracks = list(
        track.height = 0.04,
        track.margin = c(0.05, 0)
    )
)

Add the circular lines that represent the species:

for(species in unique(combined$species)) {
    l = combined$species == species
    sn = paste(combined$species[l], combined$regions[l], sep = "|")
    highlight.sector(sn, track.index = 1, col = color_species[species], 
        text = species, niceFacing = TRUE)
}
circos.clear()

Legends for regions and species:

legend("bottomleft", pch = 15, col = color_regions, 
    legend = names(color_regions), cex = 0.6)
legend("bottomright", pch = 15, col = color_species, 
    legend = names(color_species), cex = 0.6)

The plot looks like this:

group_by_species

Group by regions

The code is similar, so I won't explain it and will just attach it to the post. The plot looks like this:

group_by_regions

## group by regions
df2 = data.frame(from = paste(df[[1]], df[[3]], sep = "|"),
                 to = paste(df[[2]], df[[3]], sep = "|"),
                 value = df[[4]], stringsAsFactors = FALSE)

combined = unique(data.frame(regions = c(df[[1]], df[[2]]), 
    species = c(df[[3]], df[[3]]), stringsAsFactors = FALSE))
combined = combined[order(combined$regions, combined$species), ]
order = paste(combined$regions, combined$species, sep = "|")
grid.col = structure(color_species[combined$species], names = order)

gap = rep(1, length(order))
# sectors are grouped by region here, so the wide gap follows the last sector of each region
gap[which(!duplicated(combined$regions, fromLast = TRUE))] = 5

circos.par(gap.degree = gap)
chordDiagram(df2, order = order, annotationTrack = c("grid", "axis"),
    grid.col = grid.col, directional = TRUE,
    preAllocateTracks = list(
        track.height = 0.04,
        track.margin = c(0.05, 0)
    )
)
for(region in unique(combined$regions)) {
    l = combined$regions == region
    sn = paste(combined$regions[l], combined$species[l], sep = "|")
    highlight.sector(sn, track.index = 1, col = color_regions[region], 
        text = region, niceFacing = TRUE)
}
circos.clear()

legend("bottomleft", pch = 15, col = color_regions, 
    legend = names(color_regions), cex = 0.6)
legend("bottomright", pch = 15, col = color_species, 
    legend = names(color_species), cex = 0.6)
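
Since the two groupings differ only in which column comes first, the sector order and gaps could also be produced by one function. The following is a rough sketch under that assumption. sector_layout is a hypothetical helper, not a circlize function, and it expects the same column layout as df (import region, export region, species, flow).

```r
# Hypothetical helper: sector order and gap widths for either grouping
sector_layout = function(df, group_by = c("species", "regions"),
                         small_gap = 1, big_gap = 5) {
    group_by = match.arg(group_by)
    other = setdiff(c("species", "regions"), group_by)
    # all distinct (region, species) pairs that appear on either side of a link
    combined = unique(data.frame(regions = c(df[[1]], df[[2]]),
        species = c(df[[3]], df[[3]]), stringsAsFactors = FALSE))
    combined = combined[order(combined[[group_by]], combined[[other]]), ]
    sectors = paste(combined[[group_by]], combined[[other]], sep = "|")
    # wide gap after the last sector of each group
    gap = rep(small_gap, length(sectors))
    gap[!duplicated(combined[[group_by]], fromLast = TRUE)] = big_gap
    list(combined = combined, order = sectors, gap = gap)
}

# Three rows from the sample data
toy = data.frame(
    import_region = c("Indonesia", "Southeast_Asia", "Indonesia"),
    export_region = c("Europe", "Europe", "North_America"),
    species = c("Acrochordus_granulatus", "Acrochordus_granulatus",
                "Apodora_papuana"),
    flow = c(0.7846, 0.1101, 0.324),
    stringsAsFactors = FALSE)

layout_regions = sector_layout(toy, "regions")
layout_regions$order
layout_regions$gap
```

The returned order and gap can then be passed to chordDiagram(order = ...) and circos.par(gap.degree = ...) as in the code above. The "from" and "to" identifiers have to be pasted in the same group-first order.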