Read R function output as columns

Time: 2015-02-24 13:50:45

Tags: python r rpy2

I'm trying to find a solution to this question, which I asked yesterday:

rpy2 fails to import 'rgl' R package

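My goal is to check, from within Python, whether certain packages are installed in R.

Following the recommendation Dirk Eddelbuettel gave in the comments on his answer, I'm using R's installed.packages() function to list all the available packages.

Here's what I've got so far: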

from rpy2.rinterface import RRuntimeError
from rpy2.robjects.packages import importr
utils = importr('utils')

def importr_tryhard(packname, contriburl):
    try:
        rpack = utils.installed_packages()
    except RRuntimeError:
        rpack = []
    return rpack

contriburl = 'http://cran.stat.ucla.edu/'
packname = 'rgl'  # example package name; the function does not use either argument yet
rpack = importr_tryhard(packname, contriburl)
print(rpack)

which returns a rather large output of the form:

           Package      LibPath                         Version   
ks         "ks"         "/usr/local/lib/R/site-library" "1.8.13"  
misc3d     "misc3d"     "/usr/local/lib/R/site-library" "0.8-4"   
mvtnorm    "mvtnorm"    "/usr/local/lib/R/site-library" "0.9-9996"
rgl        "rgl"        "/usr/local/lib/R/site-library" "0.93.986"
base       "base"       "/usr/lib/R/library"            "3.0.1"   
boot       "boot"       "/usr/lib/R/library"            "1.3-9"   
class      "class"      "/usr/lib/R/library"            "7.3-9"   
cluster    "cluster"    "/usr/lib/R/library"            "1.14.4"  
codetools  "codetools"  "/usr/lib/R/library"            "0.2-8"   
compiler   "compiler"   "/usr/lib/R/library"            "3.0.1"   
datasets   "datasets"   "/usr/lib/R/library"            "3.0.1"   
foreign    "foreign"    "/usr/lib/R/library"            "0.8-49"  
graphics   "graphics"   "/usr/lib/R/library"            "3.0.1"   
grDevices  "grDevices"  "/usr/lib/R/library"            "3.0.1"   
grid       "grid"       "/usr/lib/R/library"            "3.0.1"   
KernSmooth "KernSmooth" "/usr/lib/R/library"            "2.23-10" 
lattice    "lattice"    "/usr/lib/R/library"            "0.20-23" 
MASS       "MASS"       "/usr/lib/R/library"            "7.3-29"  
Matrix     "Matrix"     "/usr/lib/R/library"            "1.0-14"  
methods    "methods"    "/usr/lib/R/library"            "3.0.1"   
mgcv       "mgcv"       "/usr/lib/R/library"            "1.7-26"  
nlme       "nlme"       "/usr/lib/R/library"            "3.1-111" 
nnet       "nnet"       "/usr/lib/R/library"            "7.3-7"   
parallel   "parallel"   "/usr/lib/R/library"            "3.0.1"   
rpart      "rpart"      "/usr/lib/R/library"            "4.1-3"   
spatial    "spatial"    "/usr/lib/R/library"            "7.3-6"   
splines    "splines"    "/usr/lib/R/library"            "3.0.1"   
stats      "stats"      "/usr/lib/R/library"            "3.0.1"   
stats4     "stats4"     "/usr/lib/R/library"            "3.0.1"   
survival   "survival"   "/usr/lib/R/library"            "2.37-4"  
tcltk      "tcltk"      "/usr/lib/R/library"            "3.0.1"   
tools      "tools"      "/usr/lib/R/library"            "3.0.1"   
utils      "utils"      "/usr/lib/R/library"            "3.0.1"   
           Priority     
ks         NA           
misc3d     NA           
mvtnorm    NA           
rgl        NA           
base       "base"       
boot       "recommended"
class      "recommended"
cluster    "recommended"
...

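I only need to extract the names of the installed packages, so either the first or the second column would be enough for me.

I've tried np.loadtxt(), np.genfromtxt(), and with open(rpack) as csvfile:, but none of them returned a column or array with the columns or rows properly separated (they all failed, actually, each with a different error).

How can I read this output as columns or, more to the point, extract the names of the installed packages into a list/array?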

2 answers:

Answer 0: (score: 1)

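I haven't used rpy2 before, but rpack looks like some kind of rpy2 object, and there may well be a way to grab just the first column.

You could parse it like a text file; when you call print XXX, it takes the object's string representation.

Try doing something like this: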

s = str(rpack)
packages = [line.split()[0] for line in s.split("\n")[1:] if line.strip()]

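Try both str and repr to get the string representation; some objects don't implement both, or implement them differently.

This isn't the cleanest approach, though, and you have to make sure you parse the data correctly. Try printing dir(rpack) and see whether any attribute looks like it holds what you want.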
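To make the parsing caveat concrete, here is a minimal sketch of a more defensive text parser. It assumes the printed layout shown in the question: header lines start with whitespace, and R repeats the row names once per column block when the matrix is too wide for one block. The function name parse_package_names and the shortened sample text are made up for illustration.

```python
def parse_package_names(text):
    """Return row names from R's printed matrix text, in first-seen order."""
    names = []
    seen = set()
    for line in text.splitlines():
        # Skip blank lines and header lines; headers start with whitespace
        # because the row-name column is empty there.
        if not line.strip() or line[0].isspace():
            continue
        name = line.split()[0]
        # Wide matrices are printed in several column blocks, so each row
        # name appears once per block; keep only the first occurrence.
        if name not in seen:
            seen.add(name)
            names.append(name)
    return names


# Shortened sample in the same layout as the output in the question.
sample = """\
           Package      LibPath                         Version
ks         "ks"         "/usr/local/lib/R/site-library" "1.8.13"
misc3d     "misc3d"     "/usr/local/lib/R/site-library" "0.8-4"
base       "base"       "/usr/lib/R/library"            "3.0.1"
boot       "boot"       "/usr/lib/R/library"            "1.3-9"
           Priority
ks         NA
misc3d     NA
base       "base"
boot       "recommended"
"""

names = parse_package_names(sample)
print(names)
```

This still depends on how R happens to format the matrix, so it is probably best treated as a fallback when the object itself cannot be indexed.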

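A little digging through the installed.packages documentation and a quick look at an R tutorial suggest that, in R, you can select the column by name. Note that this is R indexing syntax and is not valid Python as written:

rpack[, "Package"]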

Answer 1: (score: 1)

In your case, rpack is an rpy2.robjects.vectors.Matrix object, so you can simply use rpy2's .rx() method to extract the column:

mylist = list(rpack.rx(True, 1))

Give it a try.
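Building on the .rx() call, the original goal (checking whether a given package is installed) could be sketched roughly as follows. The helper names package_names, is_installed, and main are made up for illustration, and main() assumes a working rpy2 and R installation; it is defined here but not called.

```python
def package_names(pkg_matrix):
    """Extract the first column ("Package") of the installed.packages() matrix."""
    # .rx(True, 1) mirrors R's m[, 1]: all rows, first column (R is 1-based).
    return [str(name) for name in pkg_matrix.rx(True, 1)]


def is_installed(packname, pkg_matrix):
    """Return True if packname appears among the installed package names."""
    return packname in package_names(pkg_matrix)


def main():
    # Imported lazily so the helpers above can be loaded without R present.
    from rpy2.robjects.packages import importr

    utils = importr('utils')
    rpack = utils.installed_packages()
    for name in ('rgl', 'ks'):
        print(name, is_installed(name, rpack))
```

Calling main() in an environment with R available should print each package name alongside True or False. If you check many packages, it is probably worth calling package_names() once and keeping the result in a set.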