TypeError when replacing string or int values in a pandas DataFrame

Time: 2017-08-10 19:21:10

Tags: python pandas replace typeerror

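I am using the following code to replace the string values in a DataFrame column. Each value is replaced by the entry it maps to in a dictionary. The code is as follows: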

def ConvertColumnValuesToDict(df, colName):
    # Build two lookup tables for the column: value -> number and number -> value
    uniqueValues = df[colName].unique()
    colName2Num = {}
    num2ColName = {}
    count = 0
    for value in uniqueValues:
        # Keys and values are both stored as str
        colName2Num[str(value)] = str(count)
        num2ColName[str(count)] = str(value)
        count = count + 1

    return colName2Num, num2ColName

artifactId2Num, num2ArtifactId = ConvertColumnValuesToDict(Sessions, 'artifact_id')
Sessions['artifact_id'] = Sessions['artifact_id'].astype(str)
Sessions['artifact_id'].replace(artifactId2Num, inplace=True)
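For comparison, the hand-built mapping above resembles what pandas.factorize produces: one integer code per value, numbered from 0 in order of first appearance, plus the unique values themselves. The following is only a minimal sketch. The small `sessions` frame is made up for illustration (modeled on the sample data in this question), the `artifact_code` column name is hypothetical, and adding 1 to get 1-based codes is an assumption about the desired numbering.

```python
import pandas as pd

# Hypothetical stand-in for the Sessions frame, modeled on the sample data
sessions = pd.DataFrame({
    "Session_id": ["A", "A", "B", "B"],
    "artifact_id": [234, 123, 123, 678],
})

# factorize returns 0-based integer codes and the unique values,
# both in order of first appearance
codes, uniques = pd.factorize(sessions["artifact_id"])

# Shift to 1-based codes (assumption about the desired numbering)
sessions["artifact_code"] = codes + 1

# Reverse lookup from code back to the original artifact_id
num_to_artifact_id = {code: value for code, value in enumerate(uniques, start=1)}
```

Writing the codes to a new column keeps the original artifact_id values available, so no replacement step is needed at all.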

The final replace call raises the following error:

---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-7-08fcc844456d> in <module>()
      1 # Sessions['artifact_id'] = Sessions['artifact_id'].astype(int)
----> 2 Sessions['artifact_id'].replace(artifactId2Num, inplace=True)

/home/prateek/anaconda2/lib/python2.7/site-packages/pandas/core/generic.pyc in replace(self, to_replace, value, inplace, limit, regex, method, axis)
   3723 
   3724             return self.replace(to_replace, value, inplace=inplace,
-> 3725                                 limit=limit, regex=regex)
   3726         else:
   3727 

/home/prateek/anaconda2/lib/python2.7/site-packages/pandas/core/generic.pyc in replace(self, to_replace, value, inplace, limit, regex, method, axis)
   3772                                                        dest_list=value,
   3773                                                        inplace=inplace,
-> 3774                                                        regex=regex)
   3775 
   3776                 else:  # [NA, ''] -> 0

/home/prateek/anaconda2/lib/python2.7/site-packages/pandas/core/internals.pyc in replace_list(self, src_list, dest_list, inplace, regex, mgr)
   3257             return block, val
   3258 
-> 3259         masks = [comp(s) for i, s in enumerate(src_list)]
   3260 
   3261         result_blocks = []

/home/prateek/anaconda2/lib/python2.7/site-packages/pandas/core/internals.pyc in comp(s)
   3245             if isnull(s):
   3246                 return isnull(values)
-> 3247             return _maybe_compare(values, getattr(s, 'asm8', s), operator.eq)
   3248 
   3249         def _cast_scalar(block, scalar):

/home/prateek/anaconda2/lib/python2.7/site-packages/pandas/core/internals.pyc in _maybe_compare(a, b, op)
   4617             type_names[1] = 'ndarray(dtype=%s)' % b.dtype
   4618 
-> 4619         raise TypeError("Cannot compare types %r and %r" % tuple(type_names))
   4620     return result
   4621 

TypeError: Cannot compare types 'ndarray(dtype=object)' and 'str'

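Note: I explicitly convert the dictionary's keys and values to str inside the function defined above. I also tried a variant without that explicit conversion, leaving the values as int. That is when I got the following error: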

TypeError: Cannot compare types 'ndarray(dtype=int64)' and 'int64'

Here is my sample dataset:

Session_id    artifact_id
    A              234
    A              123
    B              123
    B              678

The contents of the dictionary are as follows:

{'1':'234','2':'123','3':'678'}

I would like the final dataset to look like this:

Session_id    artifact_id
    A              1
    A              2
    B              2
    B              3
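One possible way to get this result without going through replace is Series.map, which looks each element up in the dictionary directly (values missing from the dictionary become NaN). This is a sketch under assumptions, not a confirmed fix for the error above: the frame is rebuilt by hand from the sample data, `convert_column_values_to_dict` is a renamed copy of ConvertColumnValuesToDict, and the numbering starts at 1 to match the desired output.

```python
import pandas as pd

# Sample data from the question, rebuilt by hand
Sessions = pd.DataFrame({
    "Session_id": ["A", "A", "B", "B"],
    "artifact_id": [234, 123, 123, 678],
})

def convert_column_values_to_dict(df, col_name):
    """Return (value -> number, number -> value) dicts with str keys and values."""
    value_to_num = {}
    num_to_value = {}
    # unique() keeps order of first appearance; start=1 is an assumption
    # made so the numbers match the desired output
    for count, value in enumerate(df[col_name].unique(), start=1):
        value_to_num[str(value)] = str(count)
        num_to_value[str(count)] = str(value)
    return value_to_num, num_to_value

artifact_id_to_num, num_to_artifact_id = convert_column_values_to_dict(
    Sessions, "artifact_id"
)

# Cast to str so the column values match the str keys, then look each one up
Sessions["artifact_id"] = Sessions["artifact_id"].astype(str).map(artifact_id_to_num)

print(Sessions)
```

With the sample data, the reverse dictionary comes out as {'1': '234', '2': '123', '3': '678'} and the artifact_id column becomes '1', '2', '2', '3' (still strings, since the dictionary values are str).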

How can I fix this?

0 answers:

No answers