Converting a Pandas DataFrame column to an R factor

Date: 2016-04-15 06:08:01

Tags: python r pandas rpy2

I am trying to convert a column of a pandas DataFrame to a factor, because the R function I want to call expects a factor.

pandas2ri.activate()
# The second column of labels has to be converted to a factor
labels = read_csv(path_to_csv)
as_factor = ro.r['as.factor']
output = package.function(another_df, as_factor(labels['column_name']))

This is the error I get:

rpy2.rinterface.RRuntimeError: Error in sort.list(y) : 'x' must be atomic for 'sort.list'
Have you called 'sort' on a list?

What should I do?

A reproducible example follows:

import pandas as pd

df = pd.DataFrame({'Col': [10, 20],
                   'x': ['Control', 'Low_Cav02']})

from rpy2 import robjects as ro

from rpy2.robjects import pandas2ri
pandas2ri.activate()

as_factor = ro.r['as.factor']

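# Integer column: the conversion to a factor works
labels = as_factor(df['Col'])
print(labels)

# String column: this call raises the RRuntimeError shown below
labels = as_factor(df['x'])
print(labels)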

Output:

[1] 10 20
Levels: 10 20

/Users/swetabh/Envs/damet/lib/python2.7/site-packages/rpy2/robjects/functions.py:106: UserWarning: Error in sort.list(y) : 'x' must be atomic for 'sort.list'
Have you called 'sort' on a list?

  res = super(Function, self).__call__(*new_args, **new_kwargs)
Traceback (most recent call last):
  File "damet/analysis.py", line 26, in <module>
    labels = as_factor(df['x'])
  File "/Users/swetabh/Envs/damet/lib/python2.7/site-packages/rpy2/robjects/functions.py", line 178, in __call__
    return super(SignatureTranslatedFunction, self).__call__(*args, **kwargs)
  File "/Users/swetabh/Envs/damet/lib/python2.7/site-packages/rpy2/robjects/functions.py", line 106, in __call__
    res = super(Function, self).__call__(*new_args, **new_kwargs)
rpy2.rinterface.RRuntimeError: Error in sort.list(y) : 'x' must be atomic for 'sort.list'
Have you called 'sort' on a list?

1 answer:

Answer 0 (score: 1)

It works fine for me. Which version of rpy2 are you using?

Edit: original answer below; I misunderstood the question

If you try to create an R DataFrame, the default converter in rpy2 turns Python lists into R lists. If you want an R vector, use the vector constructors.

Your example could look like this:

from rpy2 import robjects as ro

df = ro.DataFrame({'Col': ro.vectors.IntVector([10, 20]),
                   'x': ro.vectors.StrVector(['Control', 'Low_Cav02'])})