ggplot2名称空间问题

时间:2013-03-20 15:25:52

标签: r ggplot2

所以我正在阅读this paper,关于如何回归x y并没有给你一个等价线来回归y~x(长话短说,你在每种情况下最小化不同的残差),我想我会看看我的数据是否有任何不同。不幸的是,我无法获得ggplot2的stat_smooth来回归y上的x。

这是我数据中100行的示例:

> dput(df)
structure(list(Sample1 = c(4.98960685567417, 3.47204999768371, 
-1.02623042506767, -3.32192809488736, 4.92376518190495, 5.18057364383907, 
3.37344268231229, 7.17221940674516, -1.53797027495265, 4.59078735278587, 
-3.32192809488736, 4.13323387842297, -3.32192809488736, 1.55320685650568, 
3.45928530433997, 6.52941436696241, 0.291721646202104, 1.34692630168789, 
2.588263571484, 7.93984265473926, -1.02623042506767, -3.32192809488736, 
5.97393529545249, 4.96272548511184, -3.32192809488736, 3.19245860152084, 
-0.181177668537811, 1.1702151401323, 7.19741957077731, -3.32192809488736, 
3.09679962523049, 6.89612648849508, 4.49710160528899, 2.90303096540262, 
2.44022334122376, 4.53409217014818, 4.7590708357255, 4.72481633684295, 
0.691615956272056, 3.49724548154186, 3.31795986458613, 5.83755830779617, 
4.36304299927157, -0.0295430166578704, 5.02251917128349, 7.38120720286448, 
2.00357119131309, 3.28216732938001, 4.78231786359764, -0.764299371525642, 
3.92619319211129, 0.0406883788967327, 2.85511116340049, 5.66764040870341, 
1.5770354481397, 6.94329764296121, 3.9308239558015, 5.74724483501107, 
5.88247371232242, 3.60547757169518, -3.32192809488736, -0.542695575918393, 
5.28191680995643, 2.77496647475118, -1.34650507208309, -2.74783703944563, 
6.41591061993452, 3.7539739823111, 2.49127670210767, 6.65016370221803, 
6.32032425410637, -3.32192809488736, 0.402556327534611, 2.34628748804301, 
0.89488462084739, 1.1702151401323, -3.32192809488736, 4.72748036766425, 
5.61681844161155, 2.03831731682591, -0.443475473403791, -2.74783703944563, 
-3.32192809488736, 4.10482914538061, 5.3663866285164, 2.15373368651261, 
-3.32192809488736, 5.45745992658215, -0.764299371525642, 1.69060972560112, 
0.171658206055067, 4.46553470954628, 4.93993527938277, 2.23086351589855, 
0.2329385503416, 6.14110973873979, -0.443475473403791, -3.32192809488736, 
4.16502606126426, 5.45906373012262), Sample2 = c(4.87035389289972, 
3.05631305243025, 0.82323632916493, -2.4296633454796, 4.91811337313532, 
5.03007694504678, 3.33859900165761, 6.54820168349871, -2.4296633454796, 
4.86188407042978, -3.32192809488736, 4.62717002820498, -1.67106299528127, 
-3.32192809488736, 2.98790308889454, 6.05226752694174, 0.596977520519267, 
1.99826710332906, 2.23966464285132, 8.01717854320643, -0.00159952300352469, 
-1.48670630366969, 5.43594909921552, 4.95640818811881, -3.32192809488736, 
4.90168009665292, 0.555561013584398, 2.27836717662796, 6.47511546712932, 
-3.32192809488736, 3.01870866264235, 2.08814992134377, 3.65522335205242, 
3.36281234142216, 1.68917367193269, 3.90649842473784, 4.95242469001903, 
4.78780929337046, 3.21768687367553, 3.85621590296538, 3.41005295130021, 
5.53250156756501, 3.7175760684724, 0.676405696816667, 5.14971539956163, 
6.98398467911331, 3.3567969981478, 3.88157621603906, 4.90168009665292, 
-0.0647853872737311, 4.55018561922281, 1.21774415322831, 2.60706104004096, 
5.87421631057342, 2.22652958465421, 7.02865467720292, 3.39838702576746, 
5.19424993242169, 5.32786325515541, 2.67632233586628, -2.80788789937207, 
0.172877949731557, 4.84264318258294, 2.7144567398541, -0.00159952300352469, 
-1.17646253367489, 6.33756586506636, 3.68433673855881, 2.62719159116047, 
6.51076754571708, 6.31214443849056, -1.88247779017497, 0.714537926272974, 
3.17045554987033, 0.32851270527504, 9.05953630113146, -2.80788789937207, 
4.36480327392371, 5.88366056411048, 1.63054822649657, 2.01364185790907, 
-1.04323142745032, -2.80788789937207, 2.88298417549747, 5.60661313436485, 
1.9024101424183, -0.516096487844633, 5.42017137768702, -0.921270583844273, 
2.36482913745982, 2.46888455306976, 4.55807157102875, 5.01105219152827, 
1.19094552584428, 0.987970258058337, 5.55106017924769, 2.15899824583894, 
-3.32192809488736, 3.92287721859118, 5.06185416274169)), .Names = c("Sample1", 
"Sample2"), class = "data.frame", row.names = c(NA, -100L))

当我尝试使用ggplot2绘图时,我可以轻松地将回归y放在x上的情节中:

> ggplot(df, aes(x=Sample1,y=Sample2)) + geom_point() + stat_smooth(formula = y~x, method = "lm")

但如果我尝试在公式中使用其他任何东西,我就会找不到对象错误:

> ggplot(df, aes(x=Sample1,y=Sample2)) + geom_point() + stat_smooth(formula = y~x, method = "lm") +  stat_smooth(formula = x~y, method = "lm")
Error in eval(expr, envir, enclos) : object 'y' not found

我尝试使用原始列名称(Sample1和Sample2)而不是x和y,并为stat_smooth提供了自己的aes,都给出了相同的错误。

我认为这是一个名称空间错误:公式正在以某种方式在错误的环境中进行评估,但我不知道为什么一个可以工作而不是另一个,或者如何解决问题。

任何想法都赞赏。

2 个答案:

答案 0 :(得分:2)

解决方法是在更顺畅的通话中使用其他aes()来电切换xy

ggplot(df, aes(x=Sample1,y=Sample2)) + 
    geom_point() + 
    stat_smooth(method="lm") + 
    stat_smooth(aes(x=Sample2,y=Sample1), method="lm")

答案 1 :(得分:2)

你需要考虑{I}在你想要绘制的情节的背景下做了什么。

错误很可能是因为在公式解析代码中评估stat_smooth(formula = x~y, method = "lm")时,在formula右侧命名的变量(错误情况下为~)是 ggplot 内部定义的数据集中的对应于绘图的x轴。 y必须隔离ggplot变量数据,并且只查找该数据对象中的变量。

现在回想一下你的情节。您在x轴上绘制了x,在y轴上绘制了xy的线性模型在相反的“轴”上具有数据向量。所以从语法的角度来看,你给出的定义没有意义。我想我想说的是,公式的右侧只能包含x ~ yx的函数,因为这是内部可用的所有内容。

如果您想这样做,请自己添加x的回归线,例如:

x ~ y

您可以使用mod <- lm(Sample1 ~ Sample2, data = df) pdat <- with(df, data.frame(Sample2= seq(min(Sample2), max(Sample2), length = 100))) pdat <- transform(pdat, Sample1 = predict(mod, newdata = pdat)) ggplot(df, aes(x=Sample1,y=Sample2)) + geom_point() + stat_smooth(formula = y~x, method = "lm") + geom_line(data = pdat, aes(x = Sample1, y = Sample2)) 添加置信区间,并使用geom_ribbon()参数interval = "confidence"进行计算。你需要修改

predict.lm

以适应这一点,但这仍然是读者的练习。