Getting TypeError: expected string or bytes-like object

Time: 2017-05-24 05:36:00

Tags: python pandas

Code:

import pandas as pd
import numpy as np
import re

df=pd.read_csv('twitDB.csv',header=None, sep=',',error_bad_lines=False,encoding='utf-8')

hula=df[[0,1,2,3]]
hula=hula.fillna(0)
hula['tweet'] = hula[0].astype(str) +hula[1].astype(str)+hula[2].astype(str)+hula[3].astype(str) 
dhole=hula["tweet"]


dhole = re.sub('\s+', ' ',dhole )

I get this error:

TypeError: expected string or bytes-like object

1 answer:

Answer 0 (score: 1)

I think you need Series.replace or Series.str.replace, because you are working with a Series (an array), while re.sub works only on scalars:

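dhole = dhole.replace(r'\s+', ' ', regex=True)
# or (pass regex=True explicitly: from pandas 2.0 on, str.replace treats the pattern literally by default)
dhole = dhole.str.replace(r'\s+', ' ', regex=True)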
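
To make the scalar-versus-Series point concrete, here is a minimal sketch. The `tweets` Series is made-up sample data standing in for the `tweet` column, and the variable names are illustrative only. It shows that `re.sub` raises `TypeError` when handed a whole Series, that applying it element by element works, and that the result matches the vectorized `str.replace` call.

```python
import re

import pandas as pd

# Hypothetical sample data standing in for the "tweet" column.
tweets = pd.Series(["ss      ddd s   ss", "d         d", "f       t       y"])

# re.sub expects a single str (or bytes) as its third argument,
# so passing a whole Series raises TypeError.
try:
    re.sub(r"\s+", " ", tweets)
    raised = False
except TypeError:
    raised = True

# Applying re.sub to each element works, because each element is a scalar str.
per_element = tweets.apply(lambda text: re.sub(r"\s+", " ", text))

# The vectorized pandas call; regex=True is passed explicitly because the
# default of this parameter has differed between pandas versions.
vectorized = tweets.str.replace(r"\s+", " ", regex=True)

print(raised)
print(per_element.equals(vectorized))
```

The element-wise `apply` version is mainly useful for seeing what goes wrong; the vectorized form is usually the more idiomatic choice in pandas.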

Sample:

>>> hula = pd.DataFrame({'tweet':['ss      ddd s   ss','d         d','f       t       y']})
>>> dhole=hula["tweet"]
>>> print (dhole)
0    ss      ddd s   ss
1           d         d
2     f       t       y
Name: tweet, dtype: object

>>> dhole = dhole.replace(r'\s+', ' ', regex=True)
>>> print (dhole)
0    ss ddd s ss
1            d d
2          f t y
Name: tweet, dtype: object
>>> dhole = dhole.str.replace(r'\s+', ' ', regex=True)
>>> print (dhole)
0    ss ddd s ss
1            d d
2          f t y
Name: tweet, dtype: object