I am trying to run some R code from within a Python script. To do this I am using rpy2, but I am having difficulty (I could also just call an R script, but I can't get that to work either). Below is the R script that does what I want:
library(ggplot2)
setwd("/dir/")
allelefreqshort <- read.table("allelefreqsshort.txt", header = TRUE)
hist(log10(allelefreqshort$AlleleFreq), xlim = c(-15,0), breaks=20)
This is my rpy2 code. It plots the data, but not as log10, and the x axis is too small.
import rpy2.robjects as ro
from rpy2.robjects.packages import importr
r = ro.r
outputDir = '/dir'
r.setwd(outputDir)
f = r('read.table("allelefreqs.txt", header = FALSE)')
grdevices = importr('grDevices')
grdevices.png(file="alleleFreq.png", width=800, height=500)
r.hist(f[0], breaks=100, main = '5 Reads', xlab='Variant Freq', ylab='# Vars', log10='x')
grdevices.dev_off()
Answer 0 (score: 1)
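Here f is an R data frame, which is essentially an R list in which every element is an R vector (each one a "column" in your table) and all of those vectors have the same length.
Doing f[0] returns a list of length 1, because that is what R would do (R has [ and [[; to this day I am still not sure whether [ on the Python side should behave like R's [ or like R's [[, but that is for another thread). Extracting with [[ will return what you want (see the rpy2 documentation on data frames at http://rpy2.readthedocs.org/en/version_2.7.x/vector.html#extracting-elements, and note that R sequences are 1-offset while Python vectors are 0-offset).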
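A minimal sketch of that extraction, assuming rpy2's rx2() accessor (which follows R's [[) and rx() accessor (which follows R's [). The helper names r_position and plot_log10_hist are hypothetical, and the file name, plot labels, and header=False setting are taken from the question's rpy2 code. The log10 transform is applied to the values before plotting, and xlim is passed as an R numeric vector.

```python
def r_position(py_index):
    """Convert a 0-offset Python index to the 1-offset position R expects."""
    return py_index + 1


def plot_log10_hist(table_path, png_path, py_column=0):
    """Plot a histogram of log10 of one column of a table read by R."""
    # rpy2 is imported here so the helper above stays usable without R installed.
    import rpy2.robjects as ro
    from rpy2.robjects.packages import importr

    f = ro.r["read.table"](table_path, header=False)

    # rx2() follows R's `[[`: it returns the column vector itself,
    # whereas rx() follows R's `[` and returns a one-column data frame.
    column = f.rx2(r_position(py_column))
    log_values = ro.r["log10"](column)

    grdevices = importr("grDevices")
    grdevices.png(file=png_path, width=800, height=500)
    # xlim must be an R vector, so build it with FloatVector.
    ro.r["hist"](log_values, xlim=ro.FloatVector([-15, 0]), breaks=20,
                 main="5 Reads", xlab="log10(Variant Freq)", ylab="# Vars")
    grdevices.dev_off()
```

Calling plot_log10_hist("allelefreqs.txt", "alleleFreq.png") from the data directory would then write the plot; this assumes the frequencies are in the first column.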
Answer 1 (score: 1)
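You can keep your R commands intact and instead change how the objects are passed between Python and R. Consider the following rpy2 and Rscript command-line solutions: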
RPY2
import os
import rpy2
import rpy2.robjects as ro
from rpy2.robjects.packages import importr
# CURRENT DIRECTORY OF SCRIPT
cd = os.path.dirname(os.path.abspath(__file__))
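# READ IN DATA (ASSUMES THE FILE HAS A HEADER ROW WITH AN AlleleFreq COLUMN, AS IN THE R SCRIPT)
allelefreqshort_py = ro.r['read.table'](os.path.join(cd, "allelefreqs.txt"), header=True)
# ASSIGN THE R DATA FRAME TO A NAME IN R'S GLOBAL ENVIRONMENT
ro.globalenv['allelefreqshort'] = allelefreqshort_py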
# OUTPUT PLOT
grdevices = importr('grDevices')
grdevices.png(file="alleleFreq.png", width=800, height=500)
p = ro.r('hist(log10(allelefreqshort$AlleleFreq), xlim = c(-15,0), breaks=20)')
grdevices.dev_off()
RSCRIPT
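Alternatively, you can use R's automation executable, Rscript.exe, to run a subprocess and call the R script from the command line. You can even pass arguments to the R script, which R can read with commandArgs().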
import os
import subprocess
# CURRENT DIRECTORY OF SCRIPT (ASSUMING R SCRIPT IN SAME DIRECTORY)
cd = os.path.dirname(os.path.abspath(__file__))
# COMMAND LINE ARGUMENTS (IF RSCRIPT IS ON THE PATH, LEAVE OUT DIRECTORY)
cmd = ["path/to/RScript", os.path.join(cd, "HistPlotScriptName.R")]
# SUBPROCESS CALL (NO shell=True: A LIST COMMAND IS PASSED TO THE EXECUTABLE DIRECTLY)
a = subprocess.Popen(cmd, stdin=subprocess.PIPE,
                     stdout=subprocess.PIPE, stderr=subprocess.PIPE)
output, error = a.communicate()
# R CONSOLE OUTPUT PRINTED TO PYTHON CONSOLE
if a.returncode == 0:  # SUBPROCESS SUCCESSFUL
    print(output.decode("utf-8"))
else:  # SUBPROCESS FAILED
    print(error.decode("utf-8"))
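Passing arguments through to commandArgs() could look like the sketch below. The helper names build_rscript_command and run_rscript are hypothetical, and the sketch assumes Rscript is on the PATH and that the R script reads its arguments with commandArgs(trailingOnly = TRUE). It uses subprocess.run rather than Popen purely for brevity.

```python
import subprocess


def build_rscript_command(script_path, *args, rscript="Rscript"):
    """Return the argument list for running an R script with trailing arguments."""
    # Everything after the script path is visible in R through
    # commandArgs(trailingOnly = TRUE), always as character strings.
    return [rscript, "--vanilla", script_path, *map(str, args)]


def run_rscript(script_path, *args, rscript="Rscript"):
    """Run the R script and return (returncode, stdout, stderr) as text."""
    cmd = build_rscript_command(script_path, *args, rscript=rscript)
    result = subprocess.run(cmd, capture_output=True, text=True)
    return result.returncode, result.stdout, result.stderr
```

For example, run_rscript("HistPlotScriptName.R", "allelefreqs.txt", 20) would hand "allelefreqs.txt" and "20" to the script, which would need as.numeric() on the R side to turn the second one back into a number.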