Question

I am trying to plot word counts with geom_col.

library(dplyr)
library(ggplot2)
library(tidytext)
df <- read_csv("C:/Data/Data.csv")

df %>% 
  count(word, sort=TRUE)
top_n(df, 15) %>%
  ggplot(df,mapping = aes(x = word, y = n)) +
  geom_col(fill="royalblue") +
  labs(x="Top Unique Words", y="Word Count")

The CSV file contains two columns of data:

users,word
user_gffast2,stop
user_gffast2,the
user_gffast2,along
user_gffast3,rain
user_gffast3,a
user_gffast3,the
user_gffast3,course
user_gffast4,stop
user_gffast4,the
user_gffast4,I
...etc.

I think the part that is causing trouble is

ggplot(df_task4,mapping = aes(x = word, y = n))

The output is as follows:

# A tibble: 912 x 2
   word      n
   <chr> <int>
 1 the     244
 2 I        96
 3 and      90
 4 a        76
 5 from     72
 6 is       70
 7 to       68
 8 i        60
 9 this     55
10 for      50
# ... with 902 more rows
> top_n(df, 15) %>%
+   ggplot(df,mapping = aes(x = word, y = n)) +
+   geom_col(fill="royalblue") +
+   labs(x="Top Unique Words", y="Word Count")
Selecting by word
Don't know how to automatically pick scale for object of type gg/ggplot. Defaulting to continuous.
Error: Aesthetics must be either length 1 or the same as the data (25): y
>

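Can I map y to n, or is n treated as a plain variable? I need to somehow take the data, count all the words, then select the top 15 words and draw a geom_col with the top words on the x-axis and the total word count on the y-axis.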

I am trying to plot word counts with geom_col and can't seem to figure it out.
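
For reference, the steps described above (count the words, keep the top 15, plot with geom_col) can be sketched as one pipeline. In the code in the question, the result of count() is printed but never stored, so top_n(df, 15) runs on the original two-column data, which has no n column (hence the "Selecting by word" message). The sketch below is a minimal example under assumptions, not a definitive fix: the small inline data frame is a hypothetical stand-in for Data.csv, and top_words and p are illustrative names that do not appear in the original code.

```r
library(dplyr)
library(ggplot2)

# Hypothetical stand-in for the CSV in the question (same two columns)
df <- data.frame(
  users = c("user_gffast2", "user_gffast2", "user_gffast2",
            "user_gffast3", "user_gffast3", "user_gffast3", "user_gffast3",
            "user_gffast4", "user_gffast4", "user_gffast4"),
  word  = c("stop", "the", "along", "rain", "a", "the", "course",
            "stop", "the", "I"),
  stringsAsFactors = FALSE
)

# Keep the counted result and pipe it onward, so the column n exists
# in the data that reaches ggplot
top_words <- df %>%
  count(word, sort = TRUE) %>%  # one row per word, count stored in n
  top_n(15, wt = n)             # top 15 by n; ties can return more rows

# Pass the data once; reorder() sorts the bars by their count
p <- ggplot(top_words, aes(x = reorder(word, n), y = n)) +
  geom_col(fill = "royalblue") +
  labs(x = "Top Unique Words", y = "Word Count")

p
```

With the real file, replacing the inline data frame with the result of read_csv() should be the only change needed, assuming the column is really named word. Because top_words already contains n, aes(y = n) refers to that column rather than to anything outside the data.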

0 answers: