我需要从两个基质开始绘制网络,代表产生的细菌物种和代谢物。我想要一个网络显示细菌和代谢物之间的共同发生和相关性(不是细菌和细菌和代谢物VS代谢物)。
以下是我的数据示例:
> mydata_bac
Taxon CD1 CD2
Actinomycetaceae;g__Actinomyces 0.072998825 0.031399459
Coriobacteriaceae;g__Atopobium 0.040946468 0.002703265
Corynebacteriaceae;g__Corynebacterium 0.002517201 0.006446247
Micrococcaceae;g__Rothia 0.001174694 0.002703265
Porphyromonadaceae;g__Porphyromonas 0.023326061 0.114368892
Prevotellaceae;g__Prevotella 0.252894781 0.102308172
Flavobacteriaceae;g__Capnocytophaga 0.001174694 0.029320025
Aerococcaceae;g__Abiotrophia 0.002013761 0.003327095
Carnobacteriaceae;g__Granulicatella 0.042960228 0.049490539
Gemellaceae;g__Gemella 0.027857023 0.067165731
Streptococcaceae;g__Streptococcus 0.220506796 0.182782283
ClostridialesFamilyXI.IncertaeSedis;g__ 0.000000000 0.000623830
ClostridialesFamilyXIII.IncertaeSedis;g__Mogibacterium 0.006880349 0.002495321
Lachnospiraceae;Other 0.000335627 0.000831774
Clostridia 0.004363148 0.002079434
Lachnospiraceae;g__Oribacterium 0.003524081 0.002079434
Peptostreptococcaceae;g__Peptostreptococcus 0.000167813 0.005198586
Veillonellaceae;Other 0.001342507 0.001455604
Veillonellaceae;g__Veillonella 0.047323376 0.082553545
Fusobacteriaceae;g__Fusobacterium 0.009229737 0.010813059
Fusobacteriaceae;g__Leptotrichia 0.092465179 0.076523186
Neisseriaceae;g__Neisseria 0.013592885 0.027656477
Pasteurellaceae;g__Haemophilus 0.014431952 0.092534831
SR1;c__;f__;g__ 0.000000000 0.002079434
TM7;c__TM7-3;f__;g__ 0.065782849 0.018299023
Erysipelotrichaceae;g__Bulleidia 0.007551603 0.004366812
Bacteroidia 0.000000000 0.000415887
Porphyromonadaceae;g__Tannerella 0.000671254 0.002079434
Flavobacteriaceae 0.002013761 0.001247661
Bacilli 0.002181574 0.002911208
Clostridia;f__;g__ 0.000671254 0.002703265
ClostridialesFamilyXIII.IncertaeSedis;g__Eubacterium 0.003020641 0.002079434
Lachnospiraceae;g__Moryella 0.003188454 0.000623830
Veillonellaceae;g__Selenomonas 0.004866588 0.021834061
Fusobacteriaceae 0.000335627 0.001871491
Campylobacteraceae;g__Campylobacter 0.001510321 0.001247661
Pasteurellaceae;g__Actinobacillus 0.002852828 0.000207943
Burkholderiaceae;g__Lautropia 0.000000000 0.002495321
Lactobacillaceae;g__Lactobacillus 0.000000000 0.000000000
Staphylococcaceae;g__Staphylococcus 0.000000000 0.000000000
> mydata_metab
Molecule CD1 CD2
1-chloro-decane 0.15 0.18
1-methyl-2-1-methylethyl-benzene 0.05 0.06
1-Octadecene 1.47 1.6
1.2.3-Trimethylbenzene 0.39 0.47
1.3-Bis1.1-dimethylethyl-benzene 0.24 0.06
1.3H-Isobenzofuranone 0.04 0.05
2-Butanone 0.01 0.03
2-ethyl-1-hexanol 1.44 1.34
2-Hexanol 0 0.02
2-Pentylfuran 0.42 0.56
2-pentylthiophene 8.08 7.21
25bis11dimethylethylphenol 0.15 0.27
2.6-dimethyl-4-heptanone 0.23 0
4-1.1-dimethylpropyl-phenol 0.39 0.3
4-1.1.3.3-tertamethylbutyl-phenol 1.6 2.04
4-Methoxybenzhydrol 0.02 0.02
4-methyl-1-pentene 0.02 0
4-methyl-2-hexanone 0.03 0.01
4-methyl-3-penten-2-one 0.67 0.27
aceticacidethylester 0.5 0.7
Benzene 0.13 0.1
Benzene-octyloxy 0.03 0.03
butanoicacid-2-methyloctylester 0.04 0.03
carbondisulfide 3.05 3.72
Caryophyllene 0.02 0.02
ethyl-alcohol 0.45 0.61
hexadecane 0.02 0.02
menthol 0.08 0.06
methyl-isobutyl-ketone 0.09 0.08
n.d.3-ion97 1.99 1.92
n.d.4-ion102 2.12 2.04
Nonadecane 0 0
Nonanal 0.82 0.71
Octanal 0.70 0.71
Pentadecane 0.03 0.03
phenol 0.12 0.09
Propyl-alcohol 0.20 0.22
Tetradecane 0.12 0.09
Toluene 0.25 0.15
Trichloromethane 0.15 0.12
这就是我的尝试:
> library(psych)
> library(igraph)
> mydata_bac <- read.csv(file="L5_filt.txt", header=T, row.names=1, sep="\t")
> mydata_metab <- read.csv(file="metab.txt", header=T, row.names=1, sep="\t")
> mydata_t <- t(as.matrix(mydata_bac))
> metab_t <- t(as.matrix(mydata_metab))
> cor.matrix <- corr.test(mydata_t,metab_t, method = "spearman")
> graph.f<-graph.adjacency(cor.matrix$r, weighted=TRUE, mode="upper")
> t.names <- colnames(cor.matrix)[as.numeric(V(graph.f)$name)]
> graph.f = simplify(graph.f)
> plot (graph.f, vertex.size=5, vertex.shape="circle", vertex.label.color="red", vertex.label=t.names, vertex.label.cex=0.9, edge.width=1, layout=layout.fruchterman.reingold)
但我有一个只有代谢物的网络...... 错误在哪里?
答案 0 :(得分:1)
好的,所以我想这里发生了什么,你所拥有的代谢物数量恰好与你所拥有的分类数相同。否则graph.adjacency
会抛出错误:它需要一个邻接的方阵,行 AND cor.matrix$r
的列应该包含图中的所有节点。
你的图表是bipartite:代谢物可以与分类群相连,分类群可以与代谢物相连,但代谢物从不与其他代谢物相连,而分类群从不与其他分类群相连。当您运行corr.test
时,您的行就是您的分类,而您的列就是您的代谢产品。
我不熟悉igraph
包:可能有比graph.adjacency
更合适的方法来处理二分图。代替这一点,我们可以强制我们的邻接矩阵cor.matrix
包含所有节点:
获得cor.matrix
后立即运行:
nNodes <- ncol(cor.matrix$r) + nrow(cor.matrix$r)
nodeNames <- c(colnames(cor.matrix$r), rownames(cor.matrix$r))
adj <- matrix(0, nNodes, nNodes, dimnames=list(nodeNames, nodeNames))
adj[(ncol(cor.matrix$r)+1):nrow(adj), 1:ncol(cor.matrix$r)] <- cor.matrix$r
adj[1:ncol(cor.matrix$r), (ncol(cor.matrix$r)+1):ncol(adj)] <- t(cor.matrix$r)
现在,您可以从graph.f
获取adj
,并运行其余代码:
graph.f <- graph.adjacency(adj, weighted=TRUE, mode="upper")