Trying to plot word counts with geom_col. Can't seem to figure it out

Date: 2019-11-24 16:15:08

Tags: r

I'm trying to plot word counts with geom_col.

library(dplyr)
library(ggplot2)
library(tidytext)
df <- read_csv("C:/Data/Data.csv")

df %>% 
  count(word, sort=TRUE)
top_n(df, 15) %>%
  ggplot(df,mapping = aes(x = word, y = n)) +
  geom_col(fill="royalblue") +
  labs(x="Top Unique Words", y="Word Count")

The CSV file contains two columns of data:

users,word
user_gffast2,stop
user_gffast2,the
user_gffast2,along
user_gffast3,rain
user_gffast3,a
user_gffast3,the
user_gffast3,course
user_gffast4,stop
user_gffast4,the
user_gffast4,I
.
...etc.

I think the part I'm having trouble with is

ggplot(df_task4,mapping = aes(x = word, y = n))

The output is as follows:

# A tibble: 912 x 2
   word       n
   <chr>  <int>
 1 the      244
 2 I        96
 3 and      90
 4 a         76
 5 from      72
 6 is        70
 7 to        68
 8 i         60
 9 this      55
10 for       50
#
# ... with 902 more rows
> top_n(df, 15) %>%
+   ggplot(df,mapping = aes(x = word, y = n)) +
+   geom_col(fill="royalblue") +
+   labs(x="Top Unique Words", y="Word Count")
Selecting by word
Don't know how to automatically pick scale for object of type gg/ggplot. Defaulting to continuous.
Error: Aesthetics must be either length 1 or the same as the data (25): y
>

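Can I assign n to y, or is n treated as a plain variable? I need to somehow take the data, count all the words, then select the top 15 words and draw a geom_col with the top words on the x-axis and the word count on the y-axis.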
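
For reference, a minimal sketch of one way the pipeline could be wired up, not a confirmed fix. Three things stand out in the code above: the result of count() is printed but never saved, so df still has no n column; top_n(df, 15) is called without a wt argument, so it ranks by the last column (hence "Selecting by word"); and df is passed to ggplot() again even though the pipe already supplies the data. The sketch keeps everything in one chain. The small data frame is a hypothetical stand-in for Data.csv built from the sample rows shown, and top_words and p are names introduced here for illustration.

```r
library(dplyr)
library(ggplot2)

# Hypothetical stand-in for the CSV (the real file would be read with
# readr::read_csv(), which needs library(readr) in addition to the above).
df <- data.frame(
  users = c("user_gffast2", "user_gffast2", "user_gffast2",
            "user_gffast3", "user_gffast3", "user_gffast3", "user_gffast3",
            "user_gffast4", "user_gffast4", "user_gffast4"),
  word  = c("stop", "the", "along", "rain", "a", "the", "course",
            "stop", "the", "I"),
  stringsAsFactors = FALSE
)

# Count each word, then keep the 15 most frequent, ranking by the n column.
# top_n() keeps ties, so more than 15 rows can come back.
top_words <- df %>%
  count(word, sort = TRUE) %>%
  top_n(15, n)

# The counted table is the plot data, so n is available as the y aesthetic.
# reorder() sorts the bars by count instead of alphabetically.
p <- ggplot(top_words, aes(x = reorder(word, n), y = n)) +
  geom_col(fill = "royalblue") +
  labs(x = "Top Unique Words", y = "Word Count")

p
```

So n can be mapped to y, as long as the data frame given to ggplot() is the one returned by count(). If the word labels overlap on the x-axis, adding coord_flip() is one option for making the bars horizontal.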

0 answers:

No answers yet