Question

I have a data frame of Facebook data that looks like this:

> names(postsBrands)
 [1] "from_id"        "from_name"      "message"        "created_time"   "type"          
 [6] "link"           "id"             "likes_count"    "comments_count" "shares_count"  
[11] "pages.likes"    "like.ratio"

I have already sorted it in descending order by like.ratio, which is a continuous variable.

> head(postsBrands$like.ratio, 5)
[1] 2.0874 1.7139 1.4141 1.2978 1.2148

I want to plot like.ratio as a function of id (which is unique for each observation), so that I can see the distribution of like.ratio.

id is a categorical variable with arbitrary unique values like these:

> head(postsBrands$id, 5)
[1] "227015499643_10153008578079644"  "123538507674613_843197669042023" "227015499643_10152811559719644" 
[4] "227015499643_10153032875784644"  "123538507674613_978571692171286"

I used this code to create a density plot, and it did not work:

ggplot(data = postsBrands, aes(x = id, y = like.ratio)) + geom_density()

Then I tried to create a line chart, and that did not work either:

ggplot(data = postsBrands, aes(x = id, y = like.ratio)) + geom_line()

However, when I used geom_point(), it returned this:

ggplot(data = postsBrands, aes(x = id, y = like.ratio)) + geom_point()

Scatterplot

So I think the problem has to do with how id is ordered on the plot's x axis, which is why ggplot does not draw like.ratio the way I want.
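
One way that suspicion could be checked is sketched below. It is only a sketch under assumptions: `posts` is a hypothetical stand-in for postsBrands, built from the five id and like.ratio values printed above, and the names `p_line` and `p_density` are made up for illustration. It relies on two ggplot2 behaviours: a character x variable is laid out alphabetically rather than in row order, and geom_line() treats each discrete x value as its own group unless told otherwise.

```r
library(ggplot2)

# Hypothetical stand-in for postsBrands: only the two columns used here,
# filled with the first five values printed in the question.
posts <- data.frame(
  id = c("227015499643_10153008578079644", "123538507674613_843197669042023",
         "227015499643_10152811559719644", "227015499643_10153032875784644",
         "123538507674613_978571692171286"),
  like.ratio = c(2.0874, 1.7139, 1.4141, 1.2978, 1.2148),
  stringsAsFactors = FALSE
)

# Convert id to a factor whose levels follow descending like.ratio,
# so the x axis keeps the sort order instead of alphabetical order.
posts$id <- reorder(posts$id, -posts$like.ratio)

# group = 1 makes geom_line() connect all points as a single series.
# The id labels are hidden because they are long and unreadable on an axis.
p_line <- ggplot(posts, aes(x = id, y = like.ratio, group = 1)) +
  geom_line() +
  theme(axis.text.x = element_blank())

# A density plot only needs the continuous variable on x; id is not used.
p_density <- ggplot(posts, aes(x = like.ratio)) +
  geom_density()
```

Printing `p_line` or `p_density` draws the corresponding plot. If the distribution itself is the goal, the density version may be the more direct of the two, since it does not depend on id at all.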

The data is available here:

Data frame converted to CSV

ggplot does not show the distribution of a continuous variable

0 answers: