Plot keyword/word associations (findAssocs) in R using igraph on a tdm or dtm?

Asked: 2015-11-01 23:11:52

Tags: r plot igraph term-document-matrix

I want to create a network analysis plot of word associations for certain terms in R, but I don't know how to get beyond plotting the entire term-document matrix:

# Network analysis
library(igraph)
# load tdm data

# create matrix
Neg.m <- as.matrix(Ntdm_nonsparse)

# to boolean matrix
Neg.m[Neg.m>=1] <- 1

# to term adjacency matrix
# %*% is product of 2 matrices
Neg.m2 <- Neg.m %*% t(Neg.m)
Neg.m2[5:10,5:10]

# build graph with igraph ####
# build adjacency graph (graph_from_adjacency_matrix replaces the deprecated graph.adjacency)
Neg.g <- graph_from_adjacency_matrix(Neg.m2, weighted=TRUE, mode="undirected")
# remove loops
Neg.g <- simplify(Neg.g)
# set labels and degrees of vertices
V(Neg.g)$label <- V(Neg.g)$name
V(Neg.g)$degree <- degree(Neg.g)

# plot with the Fruchterman-Reingold layout
# (layout_with_fr replaces the deprecated layout.fruchterman.reingold)
layout1 <- layout_with_fr(Neg.g)
plot(Neg.g, layout=layout1, vertex.size=20, 
     vertex.label.color="darkred")

Is there any way to apply a word-association network analysis plot (and, more generally, a word-association bar plot) to the following findAssocs data?

findAssocs(Ntdm, "verizon", .06)
$verizon
           att       switched         switch       wireless         basket         09mbps         16mbps 
          0.16           0.13           0.11           0.11           0.10           0.09           0.09 
        32mbps           4gbs           5gbs        cheaper            ima         landry          nudge 
          0.09           0.09           0.09           0.09           0.09           0.09           0.09 
         sears           wink      collapsed      expensive         sprint          -fine           -law 
          0.09           0.09           0.08           0.08           0.08           0.07           0.07 
         11yrs            380            980         alltel        callled         candle           cdma 
          0.07           0.07           0.07           0.07           0.07           0.07           0.07 
       concert    consequence    de-evolving          dimas          doria          fluke           left 
          0.07           0.07           0.07           0.07           0.07           0.07           0.07 
        london           lulz        lyingly           niet        outfits     pocketbook           puny 
          0.07           0.07           0.07           0.07           0.07           0.07           0.07 
     recentely         redraw    reinvesting      reservoir    satellite's         shrimp   stratosphere 
          0.07           0.07           0.07           0.07           0.07           0.07           0.07 
     strighten       switchig      switching        undergo     wheelchair wireless-never          worth 
          0.07           0.07           0.07           0.07           0.07           0.07           0.07 
          yeap           1994            299       cheapest           com'          comin        crushes 
          0.07           0.06           0.06           0.06           0.06           0.06           0.06 
  hhahahahahah          mache          metro      metro-nyc        must've         rising       sabotage 
          0.06           0.06           0.06           0.06           0.06           0.06           0.06 
wholeheartedly 
          0.06 

In other words, I want to visualize the associations of one specific keyword with other keywords in R, but I don't know how.

1 Answer:

Answer 0: (score: 1)

Follow the examples in ?word_network_plot (package qdap).
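If you would rather stay with igraph, the following is a minimal sketch of one possible approach, not the qdap method mentioned above. It assumes that findAssocs returns a list whose `$verizon` element is a named numeric vector, as the output in the question suggests. The `assocs` vector below is a hand-copied subset of that output so the example runs without the original `Ntdm`; the names `assocs`, `keyword`, `edges` and `g` are illustrative only.

```r
library(igraph)

# Assumption: stands in for findAssocs(Ntdm, "verizon", .06)$verizon.
# Only a few of the values shown in the question are copied here.
assocs <- c(att = 0.16, switched = 0.13, switch = 0.11, wireless = 0.11,
            basket = 0.10, cheaper = 0.09, sprint = 0.08)
keyword <- "verizon"

# One edge per associated term, with the correlation stored as edge weight
edges <- data.frame(from = keyword, to = names(assocs),
                    weight = unname(assocs), stringsAsFactors = FALSE)
g <- graph_from_data_frame(edges, directed = FALSE)

# Network plot: keyword in the center, edge width scaled by correlation
# (the factor 30 is an arbitrary visual choice)
center_id <- which(V(g)$name == keyword)
plot(g, layout = layout_as_star(g, center = center_id),
     edge.width = E(g)$weight * 30, edge.label = E(g)$weight,
     vertex.size = 20, vertex.label.color = "darkred")

# Bar plot: same associations, sorted so the strongest term is on top
barplot(sort(assocs), horiz = TRUE, las = 1,
        xlab = "correlation", main = paste("Associations with", keyword))
```

With the real data, replacing the hard-coded vector with `assocs <- findAssocs(Ntdm, "verizon", .06)$verizon` should be enough, possibly after `head(assocs, 20)` to keep the plot readable.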