Running an R script in Python (rpy2) with data defined in Python

Date: 2018-12-04 18:42:57

Tags: python r rpy2

I want to run an R script from Python using rpy2, and I already know how to do that.

The R code is:

dataR = data.frame( Ingresos = c(23,45,24,23,54),
                    Bonos = c(23,45,12,67,54),
                    Deuda = c(23,4,1,6,3),
                    row.names = c("Nathy", "Tomas", "Joe", "Emily", "Javi") )
dataR
promedio_ingresos = mean(dataR$Ingresos)
Max_Ing = sort(dataR$Ingresos[dataR$Ingresos>promedio_ingresos])
Max_Ing

To run this R script from Python, I use:

import rpy2
from rpy2.robjects.packages import importr
import rpy2.robjects as robjects
r = robjects.r
output = r.source("R_script_run_in_python.R")
output

output holds the last value evaluated in my R script.
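For reference, the last value of the script (Max_Ing) can be checked by hand without R. The following is a minimal plain-Python sketch of the same computation, using the Ingresos values from the R data.frame above; the lowercase variable names are illustrative and not part of the original script.

```python
from statistics import mean

# Ingresos column from the R data.frame in the question.
ingresos = [23, 45, 24, 23, 54]

# Equivalent of: mean(dataR$Ingresos)
promedio_ingresos = mean(ingresos)

# Equivalent of: sort(dataR$Ingresos[dataR$Ingresos > promedio_ingresos])
max_ing = sorted(x for x in ingresos if x > promedio_ingresos)

print(promedio_ingresos)  # 33.8
print(max_ing)            # [45, 54]
```

So the R script should end with the two incomes above the mean, 45 and 54.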

Now I want to run the same code, but with data that I define in Python, for example:

import numpy as np
import pandas as pd
df = pd.DataFrame( np.random.randn(5,3),
                   columns = ["Ingresos","Bonos","Deuda"],
                   index = ["Max", "Nathy", "Tom", "Joe", "Kathy"] )

So the only R code I want to run now is:

promedio_ingresos = mean(dataR$Ingresos)
Max_Ing = sort(dataR$Ingresos[dataR$Ingresos>promedio_ingresos])
Max_Ing

But dataR should be df. How can I do that?

1 Answer:

Answer 0: (score: 0)

I tried this, and it worked:

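import numpy as np
import pandas as pd
import rpy2.robjects as robjects
from rpy2.robjects import pandas2ri

# Data
# pandas DataFrame
df = pd.DataFrame( np.random.randn(5,3),
                   columns = ["Ingresos","Bonos","Deuda"],
                   index = ["Max", "Nathy", "Tom", "Joe", "Kathy"] )
# rpy2 data frame (py2ri is the rpy2 2.x name; rpy2 3.x renamed it to py2rpy)
dataR = pandas2ri.py2ri(df)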

# R code
robjects.globalenv["dataR"] = dataR
robjects.r('''
           promedio_ingresos = mean(dataR$Ingresos)
           Max_Ing = sort(dataR$Ingresos[dataR$Ingresos>promedio_ingresos])
''')
print(robjects.globalenv["dataR"])
print(robjects.globalenv["promedio_ingresos"])
print(robjects.globalenv["Max_Ing"])