How do I set the order of the word clouds in the comparison.cloud function in R?

Time: 2018-05-26 19:07:54

Tags: r plot comparison word-cloud

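My df_all_cloud looks like the output below (I have omitted rows because it has more than 9,000 of them). I am trying to plot it with comparison.cloud, but the word clouds are not arranged in the order I want, and I would like to fix that.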

> df_all_cloud
                        RustSwing  RustNonSwing  NonRustSwing NonRustNonSwing
---                 5.656140e-189 8.507755e-190  0.000000e+00   5.896819e-234
--job                6.562524e+01 3.120469e-207  0.000000e+00   1.424488e-182
--young             5.025230e-167 5.965918e-231  0.000000e+00   4.230202e-261
-agallon            3.812178e-260  0.000000e+00  0.000000e+00    0.000000e+00
-anyth              3.420482e-288  0.000000e+00  0.000000e+00   2.253639e-269
-bid                 5.405453e+01 1.958225e-244  0.000000e+00   1.008145e-157
-call               3.812291e-180 7.005123e-169  1.821772e+01    1.615938e+02
-compet             1.402988e-208 2.679841e-252 6.754788e-313   3.976833e-153
-counti             5.407789e-271  0.000000e+00  0.000000e+00    0.000000e+00
-educ               2.593258e-225  0.000000e+00  0.000000e+00   2.194225e-227
-first              4.035172e-292  0.000000e+00  0.000000e+00    0.000000e+00
-hands--deck        9.869273e-182  0.000000e+00  0.000000e+00    0.000000e+00
-noth               3.852713e-262  1.750845e+02  0.000000e+00    0.000000e+00
-person             1.066348e-279  0.000000e+00  0.000000e+00   3.825194e-219
-profit             2.531847e-209  0.000000e+00  0.000000e+00    1.599742e-56
-risk               8.216369e-258 2.156436e-311  0.000000e+00   8.955837e-181
-school             1.397991e-247  0.000000e+00  0.000000e+00    0.000000e+00
-tax                7.680196e-240  0.000000e+00  0.000000e+00    0.000000e+00
-teach              1.208790e-234 2.679841e-252  0.000000e+00   5.611962e-289
-time               1.007829e-223 3.496924e-268 3.589878e-300    1.983659e-53
-year                5.850020e-16 2.098164e-163  3.602081e+01    2.728736e-99
-year-old           4.344255e-174 1.340573e-182  2.299337e-33   1.466326e-179
ââand               6.325284e-125 7.513159e-231  0.000000e+00   3.182851e-281
aâit                2.179026e-170  0.000000e+00  0.000000e+00   1.198857e-211
âand                9.424852e-115  0.000000e+00  0.000000e+00   1.393548e-234
abandon             1.295808e-229 3.477344e-183  7.437344e+01    2.573337e+02
abet                8.774570e-268  0.000000e+00  0.000000e+00    0.000000e+00
abid                 3.282298e+01  1.683042e-32 7.746201e-285    1.485670e-31
abil                 6.478317e-89  8.772561e+01 2.161801e-135    1.510418e+02
abject               8.304884e-36  0.000000e+00  0.000000e+00    0.000000e+00
abl                  1.628166e+02  6.417090e-80  5.391225e+01    9.292213e+01
abolish             9.635299e-315 1.067836e-101 1.886981e-274    5.767243e+01
abolitionist        2.442372e-209  0.000000e+00  0.000000e+00   2.263247e-223
abomin              8.081082e-252 1.316162e-266  0.000000e+00   1.984399e-235
abraham             3.031553e-260 3.747939e-204  5.448539e+01    7.281124e-08
abroad               4.419493e+01  1.758249e+02 2.030860e-265    8.407188e+02
absent              5.980674e-212  0.000000e+00  0.000000e+00    5.725856e+01

The code I wrote is as follows:

comparison.cloud(df_all_cloud, random.order=FALSE, 
             colors = c("hotpink2","orange2","limegreen","lightslateblue"),
             title.size=1.5, max.words=200)

The resulting plot is shown in the image below:

[image: comparison cloud produced by the code above]

I want the word clouds to be arranged in this order:

RustSwing / RustNonSwing

NonRustSwing / NonRustNonSwing

Is there an option I can change when plotting with comparison.cloud?

1 Answer:

Answer 0: (score: 0)

It depends on what class() df_all_cloud is. I simply reordered the columns of the data.frame, keeping its row names, with the following commands:

 library(dplyr)

 samp2 <- rownames(df_all_cloud) #grab rownames

 df_all_cloud <- df_all_cloud %>%
    select("RustNonSwing",
           "RustSwing",
           "NonRustSwing",
           "NonRustNonSwing") %>% #Reorder columns/remove rownames
    mutate_if(function(x) is.factor(x), 
              funs(as.numeric(as.factor(.))-1)) #Make sure things are numeric

 rownames(df_all_cloud) <- samp2 #Add the rownames back

 library(wordcloud)

 comparison.cloud(df_all_cloud, random.order=FALSE, 
             colors = c("hotpink2","orange2","limegreen","lightslateblue"),
             title.size=1.5, max.words=200)

[image: output of the code above]
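The same reordering can also be sketched in base R alone, which avoids having to save and restore the row names. The snippet below is a minimal illustration, not the answerer's code: `toy`, `new_order` and `toy_mat` are hypothetical names, and the numbers are invented stand-ins for df_all_cloud. It assumes that comparison.cloud takes its groups from the columns of the matrix in the order they appear, and that the `colors` vector is matched to the columns by position, so the colors may need to be reordered too if each group should keep its original color. Note also that `funs()`, used in the dplyr version above, has since been deprecated in dplyr.

```r
# Toy stand-in for df_all_cloud; the values are made up for illustration only
toy <- data.frame(
  RustSwing       = c(65.6, 0, 32.8),
  RustNonSwing    = c(0, 175.1, 0),
  NonRustSwing    = c(0, 0, 74.4),
  NonRustNonSwing = c(0, 0, 257.3),
  row.names = c("job", "noth", "abandon")
)

# Column order taken from the answer above
new_order <- c("RustNonSwing", "RustSwing", "NonRustSwing", "NonRustNonSwing")

# Base-R column indexing reorders the columns and keeps the row names (the terms);
# as.matrix() gives the numeric term matrix form that comparison.cloud works on
toy_mat <- as.matrix(toy[, new_order])

class(toy_mat)     # inspect what is being passed to the plotting function
colnames(toy_mat)  # columns now follow new_order
rownames(toy_mat)  # terms are still attached

# With the wordcloud package installed, the plotting call would then be:
# library(wordcloud)
# comparison.cloud(toy_mat, random.order = FALSE, title.size = 1.5, max.words = 200)
```

Checking `class()` first, as the answer suggests, shows whether the object is a data.frame or already a matrix; the bracket indexing used here works for both.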