Question

I am reading in a file like so:

genes<-read.table("goi.txt",header=TRUE, row.names=1)
control<-log2(1+(genes[,1]))
experiment<-log2(1+(genes[,2]))

and plotting the values as a simple scatter plot in ggplot:

ggplot(genes, aes(control, experiment)) +
    xlim(0, 20) + 
    ylim(0, 20) +
    geom_text(aes(control, experiment, label=row.names(genes)),size=3)

However, the points are placed incorrectly on my plot (see attached image).

Here's my data:

          control     expt
gfi1     0.189634  3.16574
Ripply3 13.752000 34.40630
atonal   2.527670  4.97132
sox2    16.584300 42.73240
tbx15    0.878446  3.13560
hes8     0.830370  8.17272
Tlx1     1.349330  7.33417
pou4f1   3.763400  9.44845
pou3f2   0.444326  2.92796
neurog1 13.943800 24.83100
sox3    17.275700 26.49240
isl2     3.841100 10.08640

As you can see, 'Ripply3' is clearly in the wrong position on the graph!

Am I doing something really stupid?


Answer 1

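The aes() function used by ggplot first looks inside the data frame you supply via data = genes. That's why you can (and should) specify variables by bare column names like control; ggplot automatically knows where to find the data.

But R's scoping rules are such that when a name is not found in the current environment, R looks in the parent environment, and so on up to the global environment, until it finds something with that name.

So aes(control, experiment) looks for variables named control and experiment in the data frame genes. It finds the original, untransformed control column, but of course there is no experiment column in genes (the column is called expt). So it keeps moving up through the enclosing environments until it reaches the global environment, where you defined the stand-alone variable experiment, and uses that. The result is that x is the raw control value while y is the log-transformed one, which is why Ripply3 ends up at roughly x = 13.75 instead of where you expect it.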
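The lookup order described above can be imitated in base R without ggplot2. This is only a sketch: it assumes that eval() with a data frame plus an enclosing environment behaves closely enough to what aes() does to illustrate the point, and the two-row data frame and the names x_used, y_used and env are made up for the illustration (the numbers are copied from the question).

```r
# Two rows of the question's data, rebuilt by hand for this illustration.
genes <- data.frame(
  control = c(13.752000, 0.189634),
  expt    = c(34.40630, 3.16574),
  row.names = c("Ripply3", "gfi1")
)

# Free-standing vectors, defined outside the data frame as in the question.
control    <- log2(1 + genes[, 1])
experiment <- log2(1 + genes[, 2])

# The environment to fall back on when a name is not a column of `genes`
# (this is the global environment when the script is run at top level).
env <- environment()

# eval() with a data frame checks its columns first, then `enclos`.
x_used <- eval(quote(control), genes, env)     # column exists: raw values win
y_used <- eval(quote(experiment), genes, env)  # no such column: outer vector is used

print(data.frame(gene = row.names(genes), x_used, y_used))
```

Here x_used holds the untransformed control values while y_used holds the log2-transformed ones, which matches the mixed-up coordinates seen in the plot.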

You meant to do something more like this:

genes$controlLog <- log2(1+(genes[,1]))
genes$exptLog <- log2(1+(genes[,2]))

followed by:

ggplot(genes, aes(controlLog, exptLog)) +
    xlim(0, 20) +
    ylim(0, 20) +
    geom_text(aes(controlLog, exptLog, label=row.names(genes)), size=3)
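As a small variation on the fix, the labels can be stored as a column too, so that nothing inside aes() has to be looked up outside the data frame. The sketch below is an assumption-laden example, not the answer's own code: the three-row data frame and the gene column are introduced only for illustration, and the plot is built only if ggplot2 happens to be installed.

```r
# Three rows of the question's data, rebuilt by hand for this illustration.
genes <- data.frame(
  control = c(0.189634, 13.752000, 16.584300),
  expt    = c(3.16574, 34.40630, 42.73240),
  row.names = c("gfi1", "Ripply3", "sox2")
)

# Keep transformed values and labels as columns of `genes`,
# referring to the source columns by name rather than by position.
genes$controlLog <- log2(1 + genes$control)
genes$exptLog    <- log2(1 + genes$expt)
genes$gene       <- row.names(genes)  # hypothetical label column

# Build the plot only when ggplot2 is available; call print(p) to draw it.
if (requireNamespace("ggplot2", quietly = TRUE)) {
  p <- ggplot2::ggplot(genes, ggplot2::aes(controlLog, exptLog, label = gene)) +
    ggplot2::xlim(0, 20) +
    ggplot2::ylim(0, 20) +
    ggplot2::geom_text(size = 3)
}
```

Because every name used in aes() is now a column of genes, a stray variable with the same name in the workspace cannot silently take its place.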

Misplaced points in ggplot
