I am trying to import an SPSS file into Python through rpy2, using the following code.
from rpy2.robjects import pandas2ri, r
filename = 'x.sav'
df = r('foreign::read.spss("%s", reencode = "cp1252", use.value.labels=TRUE, to.data.frame = TRUE)' % filename)
My file contains missing values that I want to preserve. For example:
df[361]
#Returns
R object with classes: ('factor',) mapped to:
<FactorVector - Python:0x0000000011657CC8 / R:0x0000000012DAE738>
[NA_integer_, NA_integer_, NA_integer_, NA_integer_, ..., NA_integer_, NA_integer_, NA_integer_, NA_integer_]
print(df[361])
#Returns
[1] <NA> <NA> <NA> <NA>
[2] <NA> <NA> <NA> <NA>
[3] <NA> <NA> <NA> <NA>
[4] Agree (2) Agree (2) <NA> <NA>
[5] <NA> <NA> <NA> <NA>
5 Levels: Strongly Agree (1) Agree (2) ... Strongly Disagree (5)
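However, when I run the pandas2ri.ri2py conversion, every NA_integer_ is changed to the integer code 1 and is then shown with the value label that belongs to that code.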
df = pandas2ri.ri2py(df)
df.iloc[:, 361]
#Returns
1 Strongly Agree (1)
2 Strongly Agree (1)
3 Strongly Agree (1)
4 Strongly Agree (1)
...
13 Agree (2)
14 Agree (2)
...
19 Strongly Agree (1)
20 Strongly Agree (1)
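To make clear what result I am after, here is a minimal pandas-only sketch of the mapping I would expect. It does not use rpy2. It assumes the factor's integer codes and level labels are already available as plain Python sequences; the helper name factor_codes_to_categorical and the level labels are made up for illustration. It relies on R storing NA_integer_ as the smallest 32-bit integer and on pandas using -1 as the code for a missing category.

```python
import numpy as np
import pandas as pd

# R represents NA_integer_ as the smallest 32-bit signed integer.
R_NA_INTEGER = np.iinfo(np.int32).min


def factor_codes_to_categorical(codes, levels):
    """Hypothetical helper: map 1-based R factor codes to a pandas
    Categorical, keeping NA codes as missing values."""
    codes = np.asarray(codes, dtype=np.int64)
    # pandas codes are 0-based, and -1 marks a missing value.
    pandas_codes = np.where(codes == R_NA_INTEGER, -1, codes - 1)
    return pd.Categorical.from_codes(pandas_codes, categories=list(levels))


# Made-up levels and codes, not taken from my actual file.
levels = ["low", "mid", "high"]
example_codes = [R_NA_INTEGER, 2, 1, R_NA_INTEGER]

result = factor_codes_to_categorical(example_codes, levels)
print(pd.Series(result))
```

In this sketch the two NA codes come out as NaN rather than being mapped to the first level, which is the behaviour I would like to get from the conversion.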
How can I stop pandas2ri.ri2py() from automatically converting these NA_integer_ values?