I want to use rpy2 to run an R script from Python, and I already know how to do that.
The R code is:
dataR = data.frame( Ingresos = c(23,45,24,23,54),
Bonos = c(23,45,12,67,54),
Deuda = c(23,4,1,6,3),
row.names = c("Nathy", "Tomas", "Joe", "Emily", "Javi") )
dataR
promedio_ingresos = mean(dataR$Ingresos)
Max_Ing = sort(dataR$Ingresos[dataR$Ingresos>promedio_ingresos])
Max_Ing
To run this R script from Python, I use:
import rpy2
from rpy2.robjects.packages import importr
import rpy2.robjects as robjects
r = robjects.r
output = r.source("R_script_run_in_python.R")
output
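output holds the last value evaluated in my R code.
Now I want to run the same code, but on data that I define in Python, for example:
import numpy as np
import pandas as pd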
df = pd.DataFrame( np.random.randn(5,3),
columns = ["Ingresos","Bonos","Deuda"],
index = ["Max", "Nathy", "Tom", "Joe", "Kathy"] )
So the R code I now want to run is just:
promedio_ingresos = mean(dataR$Ingresos)
Max_Ing = sort(dataR$Ingresos[dataR$Ingresos>promedio_ingresos])
Max_Ing
But dataR should be df. How can I do that?
Answer 0 (score: 0)
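I tried this, and it worked:
import numpy as np
import pandas as pd
import rpy2.robjects as robjects
from rpy2.robjects import pandas2ri
# Data
# Pandas dataframe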
df = pd.DataFrame( np.random.randn(5,3),
columns = ["Ingresos","Bonos","Deuda"],
index = ["Max", "Nathy", "Tom", "Joe", "Kathy"] )
# rpy2 data frame (py2ri is the rpy2 2.x name; rpy2 3.x renamed it to py2rpy)
dataR = pandas2ri.py2ri(df)
# R code
robjects.globalenv["dataR"] = dataR
robjects.r('''
promedio_ingresos = mean(dataR$Ingresos)
Max_Ing = sort(dataR$Ingresos[dataR$Ingresos>promedio_ingresos])
''')
print(robjects.globalenv["dataR"])
print(robjects.globalenv["promedio_ingresos"])
print(robjects.globalenv["Max_Ing"])
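On rpy2 3.x, pandas2ri.py2ri no longer exists, so here is a minimal sketch of the same idea using the localconverter context manager instead. The function names max_ing_r and max_ing_pandas are my own, made up for illustration. The second one repeats the computation in plain pandas so the R result can be cross-checked. I am assuming R and rpy2 3.x are installed for max_ing_r; the pandas helper runs without them.

```python
import pandas as pd


def max_ing_pandas(df: pd.DataFrame) -> list:
    """Pure-pandas cross-check: sorted Ingresos values above their mean."""
    ingresos = df["Ingresos"]
    return sorted(ingresos[ingresos > ingresos.mean()].tolist())


def max_ing_r(df: pd.DataFrame) -> list:
    """Run the R snippet on df through rpy2 3.x (needs R and rpy2 installed)."""
    # Imported here so the pandas helper above works without rpy2.
    import rpy2.robjects as robjects
    from rpy2.robjects import pandas2ri
    from rpy2.robjects.conversion import localconverter

    # Inside the context manager, assigning a pandas DataFrame to an R
    # variable converts it to an R data.frame.
    with localconverter(robjects.default_converter + pandas2ri.converter):
        robjects.globalenv["dataR"] = df

    # Same R code as in the question, operating on the converted data.
    robjects.r('''
    promedio_ingresos = mean(dataR$Ingresos)
    Max_Ing = sort(dataR$Ingresos[dataR$Ingresos > promedio_ingresos])
    ''')
    return list(robjects.globalenv["Max_Ing"])
```

Calling max_ing_r(df) and max_ing_pandas(df) on the same df should give the same values, which is a quick way to confirm that the column survived the conversion.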