Issue in rpy2 - ri2py converts NA incorrectly

Time: 2017-04-27 22:06:36

Tags: python r pandas rpy2

I am trying to interface some R code from Python, but converting the data back into a pandas object does not handle NA values correctly.

Example R code:

dummy_call_method1 <- function(argument) {
    col_a <- c("A", "A", "B", "B")
    col_b <- c(1, NA, 11, 12)
    return(data.frame(col_a, col_b))
}

dummy_call_method2 <- function(argument) {
    col_a <- c("A", "A", "B", "B")
    col_b <- c("one", NA, "eleven", "twelve")
    return(data.frame(col_a, col_b))
}

Example Python code:

import os
import rpy2
from rpy2 import rinterface, robjects
from rpy2.robjects import pandas2ri

def r_source(base_dir, filename):
    r_script = os.path.join(base_dir, filename)
    r_src = rpy2.robjects.r['source']
    r_src(r_script)

def r_call_function(func_name, *args):
    func = rpy2.robjects.r[func_name]
    result = func(*args)
    return result

r_source('~/workspace/', 'test.R')

dummy_results1 = r_call_function("dummy_call_method1", "")
dummy_results2 = r_call_function("dummy_call_method2", "")

# Print each R data frame, then its pandas conversion
print(dummy_results1)
print(rpy2.robjects.pandas2ri.ri2py(dummy_results1))
print(dummy_results2)
print(rpy2.robjects.pandas2ri.ri2py(dummy_results2))

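I would expect the two ri2py calls to replace the NA values from the dummy calls with NaN (numeric column) and None (string column) respectively. The numeric case works as expected, but in the string case the NA is replaced with "eleven" for some reason. I don't know whether it is reading from an uninitialized pointer or something else.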

Here is the output; note the unexpected behavior:

  col_a    col_b
1     A        1
2     A       NA
3     B       11
4     B       12

  col_a     col_b
1     A       1.0
2     A       NaN
3     B      11.0
4     B      12.0

  col_a     col_b
1     A       one
2     A      <NA>
3     B    eleven
4     B    twelve

  col_a      col_b
1     A        one
2     A     eleven     #This is incorrect
3     B     eleven
4     B     twelve
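
To make the mismatch explicit, here is a minimal sketch of the check I have in mind. It does not call rpy2 at all: the column values are transcribed by hand from the printed output above, and `NA`, `expected_conversion`, `is_missing` and `find_bad_na_positions` are hypothetical names introduced only for this illustration.

```python
import math

NA = object()  # hypothetical stand-in for R's NA in this sketch


def expected_conversion(r_values):
    """Map NA to NaN in a numeric column and to None in a string column."""
    is_numeric = all(isinstance(v, (int, float)) for v in r_values if v is not NA)
    missing = float("nan") if is_numeric else None
    return [missing if v is NA else v for v in r_values]


def is_missing(value):
    """True for None or a float NaN."""
    return value is None or (isinstance(value, float) and math.isnan(value))


def find_bad_na_positions(r_values, converted):
    """Return 0-based positions where R had NA but the converted value is not missing."""
    return [
        i for i, (r, c) in enumerate(zip(r_values, converted))
        if r is NA and not is_missing(c)
    ]


# Values transcribed by hand from the printed output above (assumption: no rpy2 involved).
r_col_b_numeric = [1.0, NA, 11.0, 12.0]
py_col_b_numeric = [1.0, float("nan"), 11.0, 12.0]
r_col_b_string = ["one", NA, "eleven", "twelve"]
py_col_b_string = ["one", "eleven", "eleven", "twelve"]

print(expected_conversion(r_col_b_string))                        # ['one', None, 'eleven', 'twelve']
print(find_bad_na_positions(r_col_b_numeric, py_col_b_numeric))  # []
print(find_bad_na_positions(r_col_b_string, py_col_b_string))    # [1]
```

Position 1 (0-based) corresponds to row 2 of the printed data frame, which is the row flagged as incorrect above.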

0 answers:

No answers