Question

I am trying to plot education against vocabulary using the Vocabulary.txt data.

This is the code I used:

plot(jitter(education)~jitter(vocabulary),pch=23,xlim=c(0,30),ylim=c(0,30))

My plot looks like this: [image]

It doesn't look right. Could someone explain what I did wrong, and also what the jitter command actually does?

Answer 1

I think the two outputs below are roughly publication-ready.

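The first one uses base R and jitter, which adds a little noise to the data so that points with identical coordinates appear at different positions. In this case it is a good approach (since you are slightly modifying the data, do mention the jittering). If you have many points, you can combine this approach with some transparency.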
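To make the effect of jitter concrete, here is a minimal sketch on toy data. The vector `x` is a hypothetical example, not the Vocabulary data; it only illustrates how overlapping values get spread apart.

```r
# Hypothetical toy data: 12 observations, but only 3 distinct values,
# so a plain scatter plot would draw many points on top of each other
set.seed(1)                   # make the random noise reproducible
x <- rep(1:3, each = 4)

# jitter() adds a small amount of uniform random noise to each value
xj <- jitter(x)

length(unique(x))             # 3 distinct positions before jittering
length(unique(xj))            # the jittered values no longer coincide
max(abs(xj - x))              # the shift is small relative to the spacing of 1
```

With the defaults, the noise is scaled to the smallest gap between distinct values, so points move apart without wandering into a neighbouring value's position.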

First, we make the example reproducible:

df <- read.table("http://socserv.socsci.mcmaster.ca/jfox/Books/Applied-Regression-3E/datasets/Vocabulary.txt", header=TRUE)
plot(jitter(education)~jitter(vocabulary), df, pch=20, col="#00000011",
     xlim=range(vocabulary), ylim=range(education),
     xlab="vocabulary", ylab="education")


But fundamentally, you are probably trying to plot a contingency table, so the second one uses ggplot2:

library(ggplot2)
# creates a contingency table
tab.df <- as.data.frame(with(df, table(education, vocabulary)))
ggplot(tab.df) + aes(x=vocabulary, y=education, fill=Freq, label=Freq) + 
# colored tiles and labels (0s are omitted)
geom_tile() + geom_text(data=subset(tab.df, subset = Freq != 0), size=2) +
# cosmetics
scale_fill_gradient(low="white", high="red") + theme_linedraw()


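Plotting percentages (for both tiles and labels) might be a better choice, but your question is vague about your goal. If you want the first plot but à la ggplot2, you can still get it: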

ggplot(df) + aes(x=education, y=vocabulary) + geom_jitter(alpha=0.05)
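The percentage variant mentioned above could be sketched roughly as follows. This is only an illustration under assumptions: `toy` is a hypothetical stand-in for the Vocabulary data, `Pct` is a column name chosen here, and the percentages are taken over all observations (row-wise or column-wise percentages may suit your goal better).

```r
# Hypothetical stand-in data (NOT the real Vocabulary dataset)
set.seed(42)
toy <- data.frame(education  = sample(8:12, 200, replace = TRUE),
                  vocabulary = sample(0:10, 200, replace = TRUE))

# contingency table, converted to percentages of all observations
tab    <- with(toy, table(education, vocabulary))
pct.df <- as.data.frame(prop.table(tab) * 100, responseName = "Pct")

# same tile plot as before, but filled and labelled with percentages
if (requireNamespace("ggplot2", quietly = TRUE)) {
  library(ggplot2)
  p <- ggplot(pct.df) +
    aes(x = vocabulary, y = education, fill = Pct, label = round(Pct, 1)) +
    geom_tile() +
    geom_text(data = subset(pct.df, Pct != 0), size = 2) +  # skip empty cells
    scale_fill_gradient(low = "white", high = "red") +
    theme_linedraw()
  # print(p)  # uncomment to draw the plot
}
```

Passing `margin = 1` or `margin = 2` to `prop.table()` would instead give percentages within each education level or each vocabulary score.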

