使用rpy2

时间:2016-08-10 14:29:13

标签: python r pandas rpy2

首先看看rpy2是否正常工作我运行了一个简单的模型(stats.lm):

import pandas as pd
from rpy2 import robjects as ro
from rpy2.robjects import pandas2ri
pandas2ri.activate()
from rpy2.robjects.packages import importr
stats = importr('stats')
R = ro.r
df = pd.DataFrame(data={'subject':['1','2','3','4','5','1','2','3','4','5'],'group':['1','1','1','2','2','1','1','1','2','2'],'session':['1','1','1','1','1','2','2','2','2','2'],'covar':['1', '2', '0', '2', '1', '1', '2', '0', '2', '1'],'result':[-6.77,6.11,5.67,-7.679,-0.0930,0.948,2.99,6.93,6.30,9.98]})

rdf=pandas2ri.py2ri(df)
result=stats.lm('result ~ group * session + covar',data=rdf)
print(R.summary(result).rx2('coefficients'))

工作正常:

                 Estimate Std. Error    t value  Pr(>|t|)
(Intercept)      5.323667   4.458438  1.1940654 0.2984217
group2          -3.729167   5.227982 -0.7133090 0.5150618
session2         1.952667   4.458438  0.4379710 0.6840198
covar1          -5.937500   5.107783 -1.1624418 0.3096835
covar2          -5.023500   5.107783 -0.9834992 0.3810438
group2:session2 10.073333   7.049410  1.4289612 0.2262206

我还检查了我的混合效果模型是否在R中正常工作:

df <- read.table(header=T, con <- textConnection('
covar group  result session subject
1     1  -6.770       1       1
2     1   6.110       1       2
0     1   5.670       1       3
2     2  -7.679       1       4
1     2  -0.093       1       5
1     1   0.948       2       1
2     1   2.990       2       2
0     1   6.930       2       3
2     2   6.300       2       4
1     2   9.980       2       5'))
close(con)

mixed <- aov(result ~ group*session + covar + Error(as.factor(subject)/session),data=df)
summary(mixed)

这似乎也有效:

Error: as.factor(subject)
          Df Sum Sq Mean Sq F value Pr(>F)
group      1   0.65    0.65   0.012  0.924
covar      1  16.68   16.68   0.301  0.638
Residuals  2 110.76   55.38               

Error: as.factor(subject):session
              Df Sum Sq Mean Sq F value Pr(>F)  
session        1  89.46   89.46   8.002 0.0663 .
group:session  1  60.88   60.88   5.446 0.1018  
Residuals      3  33.54   11.18                 
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

问:为什么混合效果模型不适用于此?

result2=stats.aov('result ~ group*session + covar + Error(as.factor(subject)/session)',data=rdf)
print(R.summary(result2).rx2('coefficients'))

这是错误消息:

//anaconda/lib/python2.7/site-packages/rpy2/rinterface/__init__.py:185: RRuntimeWarning: Error: $ operator is invalid for atomic vectors

  warnings.warn(x, RRuntimeWarning)
---------------------------------------------------------------------------
RRuntimeError                             Traceback (most recent call last)
<ipython-input-2-aab76c72fbf3> in <module>()
----> 1 result2=stats.aov('result ~ group*session + covar + Error(as.factor(subject)/session)',data=rdf)
      2 print(R.summary(result2).rx2('coefficients'))

//anaconda/lib/python2.7/site-packages/rpy2/robjects/functions.pyc in __call__(self, *args, **kwargs)
    176                 v = kwargs.pop(k)
    177                 kwargs[r_k] = v
--> 178         return super(SignatureTranslatedFunction, self).__call__(*args, **kwargs)
    179
    180 pattern_link = re.compile(r'\\link\{(.+?)\}')

//anaconda/lib/python2.7/site-packages/rpy2/robjects/functions.pyc in __call__(self, *args, **kwargs)
    104         for k, v in kwargs.items():
    105             new_kwargs[k] = conversion.py2ri(v)
--> 106         res = super(Function, self).__call__(*new_args, **new_kwargs)
    107         res = conversion.ri2ro(res)
    108         return res

RRuntimeError: Error: $ operator is invalid for atomic vectors

我使用以下帖子作为指导:

rpy2 - Minimal example of rpy2 regression using pandas data frame

R中的混合ANOVA - https://stats.stackexchange.com/questions/45264/why-does-a-mixed-design-using-rs-aov-need-the-between-subject-factors-specific

1 个答案:

答案 0 :(得分:1)

[投票只是因为你有一个很好的小而独立的例子。]

与rpy2相关的R等价物如下(并返回相同的错误)

> mixed <- aov("result ~ group*session + covar + Error(as.factor(subject)/session)",data=df)
Error: $ operator is invalid for atomic vectors

公式对象与字符串不同。

> class(y ~ x)
[1] "formula"
> class("y ~ x")
[1] "character"

rpy2有一个构造函数,用于从Python字符串构建R公式:

from rpy2.robjects import Formula
fml = Formula("y ~ x")

将此传递给aov()而不是字符串。